The 2022 NIST Language Recognition Evaluation
- URL: http://arxiv.org/abs/2302.14624v1
- Date: Tue, 28 Feb 2023 15:05:33 GMT
- Title: The 2022 NIST Language Recognition Evaluation
- Authors: Yooyoung Lee, Craig Greenberg, Eliot Godard, Asad A. Butt, Elliot Singer, Trang Nguyen, Lisa Mason, Douglas Reynolds
- Abstract summary: In 2022, the U.S. National Institute of Standards and Technology (NIST) conducted the latest Language Recognition Evaluation (LRE).
Similar to previous LREs, LRE22 focused on conversational telephone speech (CTS) and broadcast narrowband speech (BNBS) data.
This paper presents an overview of LRE22 and an analysis of system performance over different evaluation conditions.
- Score: 1.3730035576297057
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In 2022, the U.S. National Institute of Standards and Technology (NIST)
conducted the latest Language Recognition Evaluation (LRE) in an ongoing series
administered by NIST since 1996 to foster research in language recognition and
to measure state-of-the-art technology. Similar to previous LREs, LRE22 focused
on conversational telephone speech (CTS) and broadcast narrowband speech (BNBS)
data. LRE22 also introduced new evaluation features, such as an emphasis on
African languages, including low resource languages, and a test set consisting
of segments containing between 3s and 35s of speech randomly sampled and
extracted from longer recordings. A total of 21 research organizations, forming
16 teams, participated in this 3-month long evaluation and made a total of 65
valid system submissions to be evaluated. This paper presents an overview of
LRE22 and an analysis of system performance over different evaluation
conditions. The evaluation results suggest that Oromo and Tigrinya are easier
to detect, while Xhosa and Zulu are more challenging. Greater confusability is
seen for some language pairs. As speech duration increased, system performance
improved significantly up to a certain duration, after which diminishing
returns were observed.
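The test-set construction described in the abstract (segments containing between 3 s and 35 s of speech, randomly sampled from longer recordings) implies a speech-activity-driven extraction step. The sketch below is illustrative only and is not the official LRE22 tooling; the `speech_mask` input (True where a speech activity detector fired) and the `sample_segment` helper are hypothetical names introduced for this example.

```python
# Minimal sketch: cut a segment containing a random 3-35 s of detected
# speech out of a longer recording, given a per-sample speech mask.
import numpy as np


def sample_segment(waveform, speech_mask, sample_rate, rng):
    """Extract one segment containing roughly 3-35 s of detected speech."""
    target_s = rng.uniform(3.0, 35.0)             # target speech duration
    target = int(target_s * sample_rate)          # ... in samples

    # Cumulative count of speech samples up to each position.
    cum_speech = np.cumsum(speech_mask.astype(np.int64))
    total_speech = int(cum_speech[-1])
    if total_speech < target:
        return None                               # not enough speech

    # Random start position with enough speech remaining after it.
    valid_starts = np.where(total_speech - cum_speech >= target)[0]
    start = int(rng.choice(valid_starts))

    # Grow the window until the target amount of speech is accumulated.
    end = int(np.searchsorted(cum_speech, cum_speech[start] + target))
    return waveform[start:end]


# Usage with dummy data: 60 s of 8 kHz audio (narrowband telephone rate),
# with 3 s of "speech" in every 4 s block according to the dummy mask.
rng = np.random.default_rng(0)
sr = 8000
audio = rng.standard_normal(60 * sr).astype(np.float32)
mask = (np.arange(60 * sr) // sr) % 4 != 3
segment = sample_segment(audio, mask, sr, rng)
if segment is not None:
    print(f"extracted segment: {len(segment) / sr:.1f} s total audio")
```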
Related papers
- TTSDS -- Text-to-Speech Distribution Score [9.380879437204277]
Many recently published Text-to-Speech (TTS) systems produce audio close to real speech.
We propose evaluating the quality of synthetic speech as a combination of multiple factors such as prosody, speaker identity, and intelligibility.
We benchmark 35 TTS systems developed between 2008 and 2024 and show that our score, computed as an unweighted average of the factors, strongly correlates with human evaluations.
arXiv Detail & Related papers (2024-07-17T16:30:27Z)
- Morphosyntactic Analysis for CHILDES [1.6258710071587594]
We have been transcribing and linking data for the CHILDES database.
We have applied the UD (Universal Dependencies) framework to provide a consistent and comparable morphosyntactic analysis for 27 languages.
arXiv Detail & Related papers (2024-07-17T08:11:24Z)
- An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios [76.11409260727459]
This paper explores the language adaptation capability of ZMM-TTS, a recent SSL-based multilingual TTS system.
We demonstrate that the similarity in phonetics between the pre-training and target languages, as well as the language category, affects the target language's adaptation performance.
arXiv Detail & Related papers (2024-06-13T08:16:52Z)
- LLaMA Beyond English: An Empirical Study on Language Capability Transfer [49.298360366468934]
We focus on how to effectively transfer language generation and instruction-following capabilities to a non-English language.
We analyze the impact of key factors such as vocabulary extension, further pretraining, and instruction tuning on transfer.
We employ four widely used standardized testing benchmarks: C-Eval, MMLU, AGI-Eval, and GAOKAO-Bench.
arXiv Detail & Related papers (2024-01-02T06:29:02Z)
- Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond [89.54151859266202]
The 2023 Multilingual Speech Universal Performance Benchmark (ML-SUPERB) Challenge expands upon the acclaimed SUPERB framework.
The challenge garnered 12 model submissions and 54 language corpora, resulting in a comprehensive benchmark encompassing 154 languages.
The findings indicate that merely scaling models is not the definitive solution for multilingual speech tasks.
arXiv Detail & Related papers (2023-10-09T08:30:01Z)
- KIT's Multilingual Speech Translation System for IWSLT 2023 [58.5152569458259]
We describe our speech translation system for the multilingual track of IWSLT 2023.
The task requires translation into 10 languages with varying amounts of resources.
Our cascaded speech system substantially outperforms its end-to-end counterpart on scientific talk translation.
arXiv Detail & Related papers (2023-06-08T16:13:20Z)
- ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation [79.66359274050885]
We present ComSL, a speech-language model built atop a composite architecture of public pretrained speech-only and language-only models.
Our approach has demonstrated effectiveness in end-to-end speech-to-text translation tasks.
arXiv Detail & Related papers (2023-05-24T07:42:15Z)
- The 2021 NIST Speaker Recognition Evaluation [1.5282767384702267]
The 2021 Speaker Recognition Evaluation (SRE21) was the latest cycle of the ongoing evaluation series conducted by the U.S. National Institute of Standards and Technology (NIST) since 1996.
This paper presents an overview of SRE21 including the tasks, performance metric, data, evaluation protocol, results and system performance analyses.
arXiv Detail & Related papers (2022-04-21T16:18:52Z)
- Leveraging neural representations for facilitating access to untranscribed speech from endangered languages [10.61744395262441]
We use data selected from 7 Australian Aboriginal languages and a regional variety of Dutch.
We find that representations from the middle layers of the wav2vec 2.0 Transformer offer large gains in task performance.
While features extracted using the pre-trained English model yielded improved detection on all the evaluation languages, better detection performance was associated with the evaluation language's phonological similarity to English.
arXiv Detail & Related papers (2021-03-26T16:44:08Z)
- Arabic Speech Recognition by End-to-End, Modular Systems and Human [56.96327247226586]
We perform a comprehensive benchmarking for end-to-end transformer ASR, modular HMM-DNN ASR, and human speech recognition.
For ASR, the end-to-end work led to 12.5%, 27.5%, and 23.8% WER, a new performance milestone for the MGB2, MGB3, and MGB5 challenges, respectively.
Our results suggest that human performance on Arabic is still considerably better than that of machines, with an absolute WER gap of 3.6% on average.
arXiv Detail & Related papers (2021-01-21T05:55:29Z)