BUT System Description for The Third DIHARD Speech Diarization Challenge

Authors:

F Landini, A Lozano-Diez, L Burget, M Diez, A Silnova

Publication Date

2021

This document describes the systems developed by the BUT team for the Third DIHARD Speech Diarization Challenge. For both tracks, the system is a DOVERlap fusion of an end-to-end neural network system with x-vector-based clustering systems, namely spectral clustering and VBx. Because the x-vector clustering systems do not output overlapping speakers, overlapped speech is detected by a TasNet-based detector before the final fusion with the end-to-end approach.
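
To illustrate why a separate overlap detector helps a clustering-based system, the following is a minimal sketch, not the BUT implementation. It assumes a clustering output with at most one speaker per frame and a boolean overlap mask from some detector, and in each overlap frame it adds the nearest-in-time different speaker as a second label. The function name `add_overlap_speakers`, the frame-level representation, and the nearest-speaker heuristic are all assumptions made for illustration; the abstract does not say how the second speaker is chosen.

```python
from typing import List, Optional, Set


def add_overlap_speakers(
    frame_labels: List[Optional[int]],
    overlap_mask: List[bool],
) -> List[Set[int]]:
    """Combine single-speaker frame labels with an overlap mask.

    frame_labels: one speaker id per frame, or None for non-speech
                  (hypothetical stand-in for a clustering output).
    overlap_mask: True where a detector flags overlapped speech
                  (hypothetical stand-in for an overlap detector output).
    Returns a set of speaker ids per frame.
    """
    if len(frame_labels) != len(overlap_mask):
        raise ValueError("frame_labels and overlap_mask must have equal length")

    n = len(frame_labels)
    result: List[Set[int]] = []
    for t in range(n):
        first = frame_labels[t]
        speakers: Set[int] = set() if first is None else {first}

        # Assumed heuristic: in an overlap frame, the second speaker is the
        # closest-in-time speaker that differs from the one already assigned.
        if overlap_mask[t] and first is not None:
            second = None
            for distance in range(1, n):
                for u in (t - distance, t + distance):
                    if 0 <= u < n:
                        candidate = frame_labels[u]
                        if candidate is not None and candidate != first:
                            second = candidate
                            break
                if second is not None:
                    break
            if second is not None:
                speakers.add(second)

        result.append(speakers)
    return result


if __name__ == "__main__":
    labels = [0, 0, 0, 1, 1, None]
    overlap = [False, False, True, False, False, True]
    print(add_overlap_speakers(labels, overlap))
```

In this toy input, only the third frame gains a second speaker; the last frame is flagged as overlap but has no speech label, so it stays empty. A real system would operate on timed segments and would then pass the resulting hypotheses to the fusion step.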