BUT System Description for The Third DIHARD Speech Diarization Challenge


F Landini, A Lozano-Diez, L Burget, M Diez, A Silnova


Brno University of Technology, Faculty of Information Technology, Speech@FIT, Czechia
Omilia – Conversational Intelligence, Athens, Greece

This document describes the systems developed by the BUT team for the Third DIHARD Speech Diarization Challenge. For both tracks, the submitted system is a DOVER-Lap fusion of an end-to-end neural network diarization system with two x-vector based clustering systems, one using spectral clustering and one using VBx. Because the x-vector clustering systems do not output overlapping speakers, overlapped speech is detected with a TasNet-based detector before the final fusion with the end-to-end approach.
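To make the role of the overlap detector concrete, the following is a minimal sketch of one way a detected-overlap mask could be combined with a single-speaker clustering output. The abstract does not state how second speakers are assigned, so the heuristic used here (pick the nearest-in-time different speaker) is an assumption for illustration only, and the names `add_overlap_speakers` and `SILENCE` are hypothetical.

```python
from typing import List, Set

SILENCE = -1  # hypothetical label marking non-speech frames


def add_overlap_speakers(labels: List[int], overlap: List[bool]) -> List[Set[int]]:
    """Return the set of active speakers for every frame.

    labels  -- one speaker id per frame from a clustering system
               (at most one speaker per frame), or SILENCE
    overlap -- per-frame decisions of an overlap detector

    Assumed heuristic: in an overlap frame, the second speaker is the
    nearest-in-time speaker that differs from the frame's own label.
    """
    if len(labels) != len(overlap):
        raise ValueError("labels and overlap must have the same length")
    n = len(labels)
    result: List[Set[int]] = []
    for t, spk in enumerate(labels):
        if spk == SILENCE:
            result.append(set())
            continue
        active = {spk}
        if overlap[t]:
            # Search outward from frame t, one frame further at each step.
            for d in range(1, n):
                candidates = [
                    labels[u]
                    for u in (t - d, t + d)
                    if 0 <= u < n and labels[u] not in (SILENCE, spk)
                ]
                if candidates:
                    active.add(candidates[0])  # on a tie, the earlier frame wins
                    break
        result.append(active)
    return result


if __name__ == "__main__":
    frame_labels = [0, 0, 0, 1, 1, SILENCE]
    overlap_mask = [False, False, True, True, False, False]
    print(add_overlap_speakers(frame_labels, overlap_mask))
```

If no other speaker exists in the recording, an overlap frame keeps only its original label, so the sketch never invents a speaker that the clustering did not find.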