Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation
over More Languages and Beyond
- URL: http://arxiv.org/abs/2310.05513v1
- Date: Mon, 9 Oct 2023 08:30:01 GMT
- Title: Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation
over More Languages and Beyond
- Authors: Jiatong Shi, William Chen, Dan Berrebbi, Hsiu-Hsuan Wang, Wei-Ping
Huang, En-Pei Hu, Ho-Lam Chuang, Xuankai Chang, Yuxun Tang, Shang-Wen Li,
Abdelrahman Mohamed, Hung-yi Lee, Shinji Watanabe
- Abstract summary: The 2023 Multilingual Speech Universal Performance Benchmark (ML-SUPERB) Challenge expands upon the acclaimed SUPERB framework.
The challenge garnered 12 model submissions and 54 language corpora, resulting in a comprehensive benchmark encompassing 154 languages.
The findings indicate that merely scaling models is not the definitive solution for multilingual speech tasks.
- Score: 89.54151859266202
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The 2023 Multilingual Speech Universal Performance Benchmark (ML-SUPERB)
Challenge expands upon the acclaimed SUPERB framework, emphasizing
self-supervised models in multilingual speech recognition and language
identification. The challenge comprises a research track focused on applying
ML-SUPERB to specific multilingual subjects, a Challenge Track for model
submissions, and a New Language Track where language resource researchers can
contribute and evaluate their low-resource language data in the context of the
latest progress in multilingual speech recognition. The challenge garnered 12
model submissions and 54 language corpora, resulting in a comprehensive
benchmark encompassing 154 languages. The findings indicate that merely scaling
models is not the definitive solution for multilingual speech tasks, and a
variety of speech/voice types present significant challenges in multilingual
speech processing.
Related papers
- Teaching a Multilingual Large Language Model to Understand Multilingual Speech via Multi-Instructional Training [29.47243668154796]
BLOOMZMMS is a novel model that integrates a multilingual LLM with a multilingual speech encoder.
We demonstrate the transferability of linguistic knowledge from the text to the speech modality.
Our zero-shot evaluation results confirm the robustness of our approach across multiple tasks.
arXiv Detail & Related papers (2024-04-16T21:45:59Z) - Multilingual Text Representation [3.4447129363520337]
Modern NLP breakthrough includes large multilingual models capable of performing tasks across more than 100 languages.
State-of-the-art language models came a long way, starting from the simple one-hot representation of words.
We discuss how the full potential of language democratization could be obtained, reaching beyond the known limits.
arXiv Detail & Related papers (2023-09-02T14:21:22Z) - ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text
Translation [79.66359274050885]
We present ComSL, a speech-language model built atop a composite architecture of public pretrained speech-only and language-only models.
Our approach has demonstrated effectiveness in end-to-end speech-to-text translation tasks.
arXiv Detail & Related papers (2023-05-24T07:42:15Z) - ML-SUPERB: Multilingual Speech Universal PERformance Benchmark [73.65853301350042]
Speech processing Universal PERformance Benchmark (SUPERB) is a leaderboard to benchmark the performance of Self-Supervised Learning (SSL) models on various speech processing tasks.
This paper presents multilingual SUPERB, covering 143 languages (ranging from high-resource to endangered), and considering both automatic speech recognition and language identification.
Similar to the SUPERB benchmark, we find speech SSL models can significantly improve performance compared to FBANK features.
arXiv Detail & Related papers (2023-05-18T00:01:27Z) - Analyzing the Mono- and Cross-Lingual Pretraining Dynamics of
Multilingual Language Models [73.11488464916668]
This study investigates the dynamics of the multilingual pretraining process.
We probe checkpoints taken from throughout XLM-R pretraining, using a suite of linguistic tasks.
Our analysis shows that the model achieves high in-language performance early on, with lower-level linguistic skills acquired before more complex ones.
arXiv Detail & Related papers (2022-05-24T03:35:00Z) - xGQA: Cross-Lingual Visual Question Answering [100.35229218735938]
xGQA is a new multilingual evaluation benchmark for the visual question answering task.
We extend the established English GQA dataset to 7 typologically diverse languages.
We propose new adapter-based approaches to adapt multimodal transformer-based models to become multilingual.
arXiv Detail & Related papers (2021-09-13T15:58:21Z) - Multilingual Byte2Speech Text-To-Speech Models Are Few-shot Spoken
Language Learners [11.190877290770047]
We present a multilingual end-to-end Text-To-Speech framework that maps byte inputs to spectrograms, thus allowing arbitrary input scripts.
The framework demonstrates capabilities to adapt to various new languages under extreme low-resource scenarios.
We propose a novel method to extract language-specific sub-networks for a better understanding of the mechanism of multilingual models.
arXiv Detail & Related papers (2021-03-05T08:41:45Z) - Multilingual AMR-to-Text Generation [22.842874899794996]
We create multilingual AMR-to-text models that generate in twenty one different languages.
For eighteen languages, based on automatic metrics, our multilingual models surpass baselines that generate into a single language.
We analyse the ability of our multilingual models to accurately capture morphology and word order using human evaluation, and find that native speakers judge our generations to be fluent.
arXiv Detail & Related papers (2020-11-10T22:47:14Z) - XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating
Cross-lingual Generalization [128.37244072182506]
Cross-lingual TRansfer Evaluation of Multilinguals XTREME is a benchmark for evaluating the cross-lingual generalization capabilities of multilingual representations across 40 languages and 9 tasks.
We demonstrate that while models tested on English reach human performance on many tasks, there is still a sizable gap in the performance of cross-lingually transferred models.
arXiv Detail & Related papers (2020-03-24T19:09:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.