Oriental Language Recognition (OLR) 2020: Summary and Analysis
- URL: http://arxiv.org/abs/2107.05365v1
- Date: Mon, 5 Jul 2021 12:42:40 GMT
- Title: Oriental Language Recognition (OLR) 2020: Summary and Analysis
- Authors: Jing Li, Binling Wang, Yiming Zhi, Zheng Li, Lin Li, Qingyang Hong,
Dong Wang
- Abstract summary: The fifth Oriental Language Recognition (OLR) Challenge focuses on language recognition in a variety of complex environments.
This paper describes the three tasks, the database profile, and the final results of the challenge.
- Score: 21.212345251874513
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The fifth Oriental Language Recognition (OLR) Challenge focuses on language
recognition in a variety of complex environments to promote its development.
The OLR 2020 Challenge includes three tasks: (1) cross-channel language
identification, (2) dialect identification, and (3) noisy language
identification. We choose Cavg as the principal evaluation metric, and the
Equal Error Rate (EER) as the secondary metric. A total of 58 teams
participated in this challenge, and one third of them submitted valid results.
Compared with the best baseline, the Cavg values of the Top-1 systems on the
three tasks were reduced by a relative 82%, 62%, and 48%, respectively. This
paper describes the three tasks, the database profile, and the final results.
We also outline the novel approaches that improve the performance of language
recognition systems most significantly, such as the utilization of auxiliary
information.
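As context for the metrics above, here is a minimal sketch of how Cavg and the quoted relative reductions can be computed. It assumes the NIST LRE-style definition of Cavg (with P_target = 0.5) that the OLR challenges adopt; the per-pair error rates and the baseline/Top-1 values in the example are placeholders, not official figures from the challenge.

```python
# Minimal sketch (not the official OLR scoring tool), assuming the
# NIST LRE-style definition of Cavg used by the OLR challenges:
#   Cavg = (1/N) * sum_t [ P_target * P_miss(t)
#            + (1 - P_target)/(N - 1) * sum_{n != t} P_fa(t, n) ]
# with P_target = 0.5.
import numpy as np

def cavg(p_miss, p_fa, p_target=0.5):
    """p_miss[t]: miss rate for target language t (length N).
    p_fa[t][n]: false-alarm rate for target t against non-target n
    (N x N matrix; the diagonal is ignored)."""
    n = len(p_miss)
    per_target = []
    for t in range(n):
        fa_sum = sum(p_fa[t][k] for k in range(n) if k != t)
        per_target.append(p_target * p_miss[t]
                          + (1.0 - p_target) / (n - 1) * fa_sum)
    return float(np.mean(per_target))

# Toy example with three target languages (made-up error rates):
p_miss = [0.05, 0.08, 0.02]
p_fa = [[0.00, 0.03, 0.01],
        [0.02, 0.00, 0.04],
        [0.01, 0.02, 0.00]]
print(f"Cavg = {cavg(p_miss, p_fa):.4f}")

# The "reduced by a relative 82%" statement compares the Top-1 system's
# Cavg with the best baseline's Cavg; the two values below are
# placeholders chosen only to reproduce the 82% figure.
baseline_cavg, top1_cavg = 0.100, 0.018
relative_reduction = (baseline_cavg - top1_cavg) / baseline_cavg
print(f"relative reduction: {relative_reduction:.0%}")  # -> 82%
```

Cavg averages, over all target languages, a weighted combination of the miss rate and the mean false-alarm rate against the remaining languages, which is why a single number can summarize an N-language identification task.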
Related papers
- IXA/Cogcomp at SemEval-2023 Task 2: Context-enriched Multilingual Named
Entity Recognition using Knowledge Bases [53.054598423181844]
We present a novel NER cascade approach comprising three steps.
We empirically demonstrate the significance of external knowledge bases in accurately classifying fine-grained and emerging entities.
Our system exhibits robust performance in the MultiCoNER2 shared task, even in the low-resource language setting.
arXiv Detail & Related papers (2023-04-20T20:30:34Z)
- Lila: A Unified Benchmark for Mathematical Reasoning [59.97570380432861]
LILA is a unified mathematical reasoning benchmark consisting of 23 diverse tasks along four dimensions.
We construct our benchmark by extending 20 existing datasets with task instructions and solutions in the form of Python programs.
We introduce BHASKARA, a general-purpose mathematical reasoning model trained on LILA.
arXiv Detail & Related papers (2022-10-31T17:41:26Z)
- Cross-Lingual Speaker Identification Using Distant Supervision [84.51121411280134]
We propose a speaker identification framework that addresses issues such as lack of contextual reasoning and poor cross-lingual generalization.
We show that the resulting model outperforms previous state-of-the-art methods on two English speaker identification benchmarks by up to 9% in accuracy, and by 5% with only distant supervision.
arXiv Detail & Related papers (2022-10-11T20:49:44Z)
- Making Large Language Models Better Reasoners with Step-Aware Verifier [49.16750018427259]
DIVERSE (Diverse Verifier on Reasoning Step) is a novel approach that further enhances the reasoning capability of language models.
We evaluate DIVERSE on the latest language model code-davinci and show that it achieves new state-of-the-art results on six of eight reasoning benchmarks.
arXiv Detail & Related papers (2022-06-06T03:38:36Z)
- EVI: Multilingual Spoken Dialogue Tasks and Dataset for Knowledge-Based Enrolment, Verification, and Identification [49.77911492230467]
We formalise the three authentication tasks and their evaluation protocols.
We present EVI, a challenging spoken multilingual dataset with 5,506 dialogues in English, Polish, and French.
arXiv Detail & Related papers (2022-04-28T13:39:24Z)
- Multilingual Speech Recognition using Knowledge Transfer across Learning Processes [15.927513451432946]
Experimental results reveal that the best pre-training strategy yields a 3.55% relative reduction in overall WER.
A combination of LEAP and SSL yields a 3.51% relative reduction in overall WER when using language ID.
arXiv Detail & Related papers (2021-10-15T07:50:27Z)
- OLR 2021 Challenge: Datasets, Rules and Baselines [23.878103387338918]
The data profile, four tasks, two baselines, and the evaluation principles are introduced in this paper.
In addition to the Language Identification (LID) tasks, multilingual Automatic Speech Recognition (ASR) tasks are introduced to the OLR 2021 Challenge for the first time.
arXiv Detail & Related papers (2021-07-23T09:57:29Z)
- Cross-lingual Extended Named Entity Classification of Wikipedia Articles [0.0]
This paper describes our approach to solving the problem and discusses the official results.
We propose a three-stage approach including multilingual model pre-training, monolingual model fine-tuning and cross-lingual voting.
Our system achieves the best scores for 25 out of 30 languages, and its accuracy gaps to the best-performing systems on the other five languages are relatively small.
arXiv Detail & Related papers (2020-10-07T14:06:09Z)
- AP20-OLR Challenge: Three Tasks and Their Baselines [29.652143329022817]
The data profile, three tasks, the corresponding baselines, and the evaluation principles are introduced in this paper.
The AP20-OLR challenge includes more languages, dialects and real-life data provided by Speechocean and the NSFC M2ASR project.
arXiv Detail & Related papers (2020-06-04T16:29:21Z)
- Learning to Learn Morphological Inflection for Resource-Poor Languages [105.11499402984482]
We propose to cast the task of morphological inflection - mapping a lemma to an indicated inflected form - for resource-poor languages as a meta-learning problem.
Treating each language as a separate task, we use data from high-resource source languages to learn a set of model parameters.
Experiments with two model architectures on 29 target languages from 3 families show that our suggested approach outperforms all baselines.
arXiv Detail & Related papers (2020-04-28T05:13:17Z)