THUEE system description for NIST 2020 SRE CTS challenge
- URL: http://arxiv.org/abs/2210.06111v1
- Date: Wed, 12 Oct 2022 12:01:59 GMT
- Title: THUEE system description for NIST 2020 SRE CTS challenge
- Authors: Yu Zheng, Jinghan Peng, Miao Zhao, Yufeng Ma, Min Liu, Xinyue Ma,
Tianyu Liang, Tianlong Kong, Liang He, Minqiang Xu
- Abstract summary: This paper presents the system description of the THUEE team for the NIST 2020 Speaker Recognition Evaluation (SRE) conversational telephone speech (CTS) challenge.
Subsystems based on ResNet74, ResNet152, and RepVGG-B2 are developed as speaker embedding extractors in this evaluation.
- Score: 19.2916501364633
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents the system description of the THUEE team for the NIST
2020 Speaker Recognition Evaluation (SRE) conversational telephone speech (CTS)
challenge. Subsystems based on ResNet74, ResNet152, and RepVGG-B2 are
developed as speaker embedding extractors in this evaluation. We used a
combined AM-Softmax and AAM-Softmax loss function, namely CM-Softmax. We adopted
a two-stage training strategy to further improve system performance. We fused
all individual systems as our final submission. Our approach leads to excellent
performance and ranks 1st in the challenge.
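The abstract does not give the exact formulation of CM-Softmax. A minimal sketch, assuming "combined" means applying the additive angular margin of AAM-Softmax and the additive cosine margin of AM-Softmax jointly to the target-class logit (function name, margin values, and scale below are illustrative, not taken from the paper):

```python
import numpy as np

def cm_softmax_loss(embeddings, weights, labels, s=30.0, m1=0.2, m2=0.1):
    """Hypothetical CM-Softmax: additive angular margin m1 (AAM-Softmax)
    combined with additive cosine margin m2 (AM-Softmax) on the target class.
    embeddings: (B, D) speaker embeddings; weights: (C, D) class weights."""
    # Cosine similarity between L2-normalized embeddings and class weights.
    E = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    W = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = E @ W.T                                    # (B, C)
    theta = np.arccos(np.clip(cos, -1 + 1e-7, 1 - 1e-7))
    # Both margins penalize only the target-class logit.
    target = np.cos(theta + m1) - m2
    logits = s * cos
    rows = np.arange(len(labels))
    logits[rows, labels] = s * target[rows, labels]
    # Numerically stable cross-entropy over the margin-adjusted logits.
    logits -= logits.max(axis=1, keepdims=True)
    logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -logp[rows, labels].mean()
```

Setting m1=0 recovers plain AM-Softmax and m2=0 recovers plain AAM-Softmax, which is why a combined margin is a natural generalization of both.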
Related papers
- Robust, General, and Low Complexity Acoustic Scene Classification
Systems and An Effective Visualization for Presenting a Sound Scene Context [53.80051967863102]
We present a comprehensive analysis of Acoustic Scene Classification (ASC).
We propose an inception-based, low-footprint ASC model, referred to as the ASC baseline.
Next, we improve on the ASC baseline by proposing a novel deep neural network architecture.
arXiv Detail & Related papers (2022-10-16T19:07:21Z)
- The NIST CTS Speaker Recognition Challenge [1.5282767384702267]
The US National Institute of Standards and Technology (NIST) has been conducting a second iteration of the CTS Challenge since August 2020.
This paper presents an overview of the evaluation and several analyses of system performance for some primary conditions in the CTS Challenge.
arXiv Detail & Related papers (2022-04-21T16:06:27Z)
- Wider or Deeper Neural Network Architecture for Acoustic Scene Classification with Mismatched Recording Devices [59.86658316440461]
We present a robust and low-complexity system for Acoustic Scene Classification (ASC).
We first construct an ASC baseline system in which a novel inception-residual-based network architecture is proposed to deal with the mismatched recording device issue.
To further improve the performance but still satisfy the low complexity model, we apply two techniques: ensemble of multiple spectrograms and channel reduction.
arXiv Detail & Related papers (2022-03-23T10:27:41Z)
- STC speaker recognition systems for the NIST SRE 2021 [56.05258832139496]
This paper presents a description of STC Ltd. systems submitted to the NIST 2021 Speaker Recognition Evaluation.
These systems consist of a number of diverse subsystems based on deep neural networks used as feature extractors.
For the video modality, we developed our best solution with the RetinaFace face detector and a deep ResNet face embedding extractor trained on large face image datasets.
arXiv Detail & Related papers (2021-11-03T15:31:01Z)
- The USYD-JD Speech Translation System for IWSLT 2021 [85.64797317290349]
This paper describes the University of Sydney and JD's joint submission to the IWSLT 2021 low-resource speech translation task.
We trained our models with the officially provided ASR and MT datasets.
To achieve better translation performance, we explored the most recent effective strategies, including back translation, knowledge distillation, multi-feature reranking and transductive finetuning.
arXiv Detail & Related papers (2021-07-24T09:53:34Z)
- Joint System-Wise Optimization for Pipeline Goal-Oriented Dialog System [76.22810715401147]
We propose new joint system-wise optimization techniques for the pipeline dialog system.
First, we propose a new data augmentation approach which automates the labeling process for NLU training.
Second, we propose a novel policy parameterization with Poisson distribution that enables better exploration and offers a way to compute policy gradient.
arXiv Detail & Related papers (2021-06-09T06:44:57Z)
- USTC-NELSLIP System Description for DIHARD-III Challenge [78.40959509760488]
The innovation of our system lies in the combination of various front-end techniques to solve the diarization problem.
Our best system achieved DERs of 11.30% on track 1 and 16.78% on track 2 on the evaluation set.
arXiv Detail & Related papers (2021-03-19T07:00:51Z)
- Tongji University Undergraduate Team for the VoxCeleb Speaker Recognition Challenge 2020 [10.836635938778684]
We applied the RSBU-CW module to the ResNet34 framework to improve the denoising ability of the network.
We trained two variants of ResNet, and used score fusion and data-augmentation methods to improve the performance of the model.
arXiv Detail & Related papers (2020-10-20T09:25:40Z)
- The NTNU System at the Interspeech 2020 Non-Native Children's Speech ASR Challenge [13.232899176888575]
This paper describes the NTNU system for the Interspeech 2020 Non-Native Children's Speech ASR Challenge, supported by the SIG-CHILD group of ISCA.
All participants were restricted to developing their systems solely on the speech and text corpora provided by the organizer.
To work around this low-resource constraint, we built our ASR system on top of CNN-TDNNF-based acoustic models.
arXiv Detail & Related papers (2020-05-18T02:51:26Z)
- LEAP System for SRE19 CTS Challenge -- Improvements and Error Analysis [36.35711634925221]
We provide a detailed account of the LEAP SRE system submitted to the CTS challenge.
All the systems used the time-delay neural network (TDNN) based x-vector embeddings.
The system combination of generative and neural PLDA models resulted in significant improvements for the SRE evaluation dataset.
arXiv Detail & Related papers (2020-02-07T12:28:56Z)
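Several of the systems above, including the main THUEE submission and the LEAP generative/neural PLDA combination, rely on score-level fusion of subsystems. Neither abstract specifies the fusion rule; one common and minimal approach is a normalized weighted sum (the function name and equal default weights below are illustrative assumptions, and real systems typically tune weights on a development set):

```python
import numpy as np

def fuse_scores(score_lists, weights=None):
    """Illustrative score-level fusion: z-normalize each subsystem's trial
    scores, then take a weighted sum. Equal weights are a placeholder;
    in practice weights are calibrated on held-out data."""
    n = len(score_lists)
    if weights is None:
        weights = [1.0 / n] * n
    fused = np.zeros(len(score_lists[0]), dtype=float)
    for scores, w in zip(score_lists, weights):
        s = np.asarray(scores, dtype=float)
        s = (s - s.mean()) / s.std()   # z-norm removes per-system scale/offset
        fused += w * s
    return fused
```

The per-system z-normalization matters because subsystems (e.g. a generative PLDA backend and a neural scorer) produce scores on different scales; without it, one system's range would dominate the sum.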
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.