The HUAWEI Speaker Diarisation System for the VoxCeleb Speaker
Diarisation Challenge
- URL: http://arxiv.org/abs/2010.11657v2
- Date: Fri, 23 Oct 2020 07:45:47 GMT
- Authors: Renyu Wang, Ruilin Tong, Yu Ting Yeung, Xiao Chen
- Abstract summary: This paper describes the system setup of our submission to the speaker diarisation track (Track 4) of the VoxCeleb Speaker Recognition Challenge 2020.
Our diarisation system uses a well-trained neural network based speech enhancement model as a pre-processing front-end for input speech signals.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper describes the system setup of our submission to the speaker
diarisation track (Track 4) of the VoxCeleb Speaker Recognition Challenge 2020. Our
diarisation system uses a well-trained neural network based speech enhancement model
as a pre-processing front-end for input speech signals. We replace conventional
energy-based voice activity detection (VAD) with a neural network based VAD.
The neural network based VAD provides more accurate annotation of speech
segments containing only background music, noise, and other interference, which
is crucial to diarisation performance. We apply agglomerative hierarchical
clustering (AHC) of x-vectors and variational Bayesian hidden Markov model
(VB-HMM) based iterative clustering for speaker clustering. Experimental
results demonstrate that our proposed system achieves substantial improvements
over the baseline system, yielding a diarisation error rate (DER) of 10.45% and
a Jaccard error rate (JER) of 22.46% on the evaluation set.
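
The abstract names agglomerative hierarchical clustering (AHC) of x-vectors as the first speaker clustering stage. The sketch below illustrates the general idea only, under several assumptions that are not taken from the paper: segment embeddings are plain Python lists standing in for x-vectors, similarity is cosine similarity, linkage is average linkage, and merging stops at a hand-picked threshold. The function names `cosine_similarity` and `ahc` are hypothetical, and the VB-HMM refinement step is not shown.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def ahc(embeddings, threshold):
    """Average-linkage AHC (assumed setup, not the paper's configuration).

    Starts with one cluster per segment embedding and repeatedly merges
    the most similar pair of clusters until the best similarity falls
    below `threshold`. Returns one cluster label per embedding.
    """
    clusters = [[i] for i in range(len(embeddings))]

    def linkage(c1, c2):
        # Average pairwise similarity between members of two clusters.
        sims = [
            cosine_similarity(embeddings[i], embeddings[j])
            for i in c1
            for j in c2
        ]
        return sum(sims) / len(sims)

    while len(clusters) > 1:
        best_sim, best_pair = -2.0, None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                sim = linkage(clusters[p], clusters[q])
                if sim > best_sim:
                    best_sim, best_pair = sim, (p, q)
        if best_sim < threshold:
            break  # remaining clusters are treated as distinct speakers
        p, q = best_pair
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]

    labels = [0] * len(embeddings)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Toy 2-D "embeddings": two segments per speaker (illustrative values).
segments = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
labels = ahc(segments, threshold=0.5)
print(labels)
```

In this toy setup the stopping threshold determines the number of speakers found: a lower threshold merges more aggressively and yields fewer clusters.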