TLT-school: a Corpus of Non Native Children Speech
- URL: http://arxiv.org/abs/2001.08051v1
- Date: Wed, 22 Jan 2020 15:14:09 GMT
- Title: TLT-school: a Corpus of Non Native Children Speech
- Authors: Roberto Gretter, Marco Matassoni, Stefano Bannò, Daniele Falavigna
- Abstract summary: This paper describes "TLT-school", a corpus of speech utterances collected in schools of northern Italy for assessing the performance of students learning both English and German.
The corpus was recorded in the years 2017 and 2018 from students aged between nine and sixteen years, attending primary, middle and high school.
All utterances have been scored, in terms of some predefined proficiency indicators, by human experts.
- Score: 7.417312533172291
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper describes "TLT-school", a corpus of speech utterances collected in
schools of northern Italy for assessing the performance of students learning
both English and German. The corpus was recorded in the years 2017 and 2018
from students aged between nine and sixteen years, attending primary, middle
and high school. All utterances have been scored, in terms of some predefined
proficiency indicators, by human experts. In addition, most of the utterances
recorded in 2017 have been carefully transcribed manually. Guidelines and
procedures used for manual transcriptions of utterances will be described in
detail, as well as results achieved by means of an automatic speech recognition
system developed by us. Part of the corpus is going to be freely distributed to
the scientific community, particularly those interested in non-native speech
recognition and automatic assessment of second language proficiency.
Related papers
- Teochew-Wild: The First In-the-wild Teochew Dataset with Orthographic Annotations [2.4901756414164846]
This paper reports the construction of the Teochew-Wild, a speech corpus of the Teochew dialect.
The corpus includes 18.9 hours of in-the-wild Teochew speech data from multiple speakers.
To the best of our knowledge, this is the first publicly available Teochew dataset with accurate orthographic annotations.
arXiv Detail & Related papers (2025-05-08T08:47:11Z)
- Speech Recognition for Automatically Assessing Afrikaans and isiXhosa Preschool Oral Narratives [15.669164862460342]
We develop automatic speech recognition systems for stories told by Afrikaans and isiXhosa preschool children.
We consider a range of prior child-speech ASR strategies to determine which is best suited to this unique setting.
arXiv Detail & Related papers (2025-01-11T08:11:09Z)
- FLEURS-R: A Restored Multilingual Speech Corpus for Generation Tasks [27.894172151026044]
FLEURS-R is a speech restoration applied version of the Few-shot Learning Evaluation of Universal Representations of Speech corpus.
The aim of FLEURS-R is to advance speech technology in more languages and catalyze research including text-to-speech.
arXiv Detail & Related papers (2024-08-12T15:28:51Z)
- Towards Unsupervised Speech Recognition Without Pronunciation Models [57.222729245842054]
In this article, we tackle the challenge of developing ASR systems without paired speech and text corpora.
We experimentally demonstrate that an unsupervised speech recognizer can emerge from joint speech-to-speech and text-to-text masked token-infilling.
This innovative model surpasses the performance of previous unsupervised ASR models under the lexicon-free setting.
arXiv Detail & Related papers (2024-06-12T16:30:58Z)
- Speech Corpus for Korean Children with Autism Spectrum Disorder: Towards Automatic Assessment Systems [7.153773998764661]
This paper introduces a speech corpus specifically designed for Korean children with ASD.
Three speech and language pathologists rated recordings for social communication severity (SCS) and pronunciation proficiency (PP) using a 3-point Likert scale.
The paper also analyzes acoustic and linguistic features extracted from speech data collected and completed for annotation from 73 children with ASD and 9 TD children.
arXiv Detail & Related papers (2024-02-23T07:32:54Z)
- Who Said What? An Automated Approach to Analyzing Speech in Preschool Classrooms [0.4207829324073153]
We propose an automated framework that uses software to classify speakers and to transcribe their utterances.
We compare results from our framework to those from a human expert for 110 minutes of classroom recordings.
The results suggest substantial progress in analyzing classroom speech that may support children's language development.
arXiv Detail & Related papers (2024-01-14T18:27:37Z)
- SD-HuBERT: Sentence-Level Self-Distillation Induces Syllabic Organization in HuBERT [49.06057768982775]
We show that a syllabic organization emerges in learning sentence-level representation of speech.
We propose a new benchmark task, Spoken Speech ABX, for evaluating sentence-level representation of speech.
arXiv Detail & Related papers (2023-10-16T20:05:36Z)
- BabySLM: language-acquisition-friendly benchmark of self-supervised spoken language models [56.93604813379634]
Self-supervised techniques for learning speech representations have been shown to develop linguistic competence from exposure to speech without the need for human labels.
We propose a language-acquisition-friendly benchmark to probe spoken language models at the lexical and syntactic levels.
We highlight two exciting challenges that need to be addressed for further progress: bridging the gap between text and speech and between clean speech and in-the-wild speech.
arXiv Detail & Related papers (2023-06-02T12:54:38Z)
- DisfluencyFixer: A tool to enhance Language Learning through Speech To Speech Disfluency Correction [50.51901599433536]
DisfluencyFixer is a tool that performs speech-to-speech disfluency correction in English and Hindi.
Our proposed system removes disfluencies from input speech and returns fluent speech as output.
arXiv Detail & Related papers (2023-05-26T14:13:38Z)
- Building a Non-native Speech Corpus Featuring Chinese-English Bilingual Children: Compilation and Rationale [3.924235219960689]
This paper introduces a non-native speech corpus consisting of narratives from fifty 5- to 6-year-old Chinese-English bilingual children.
Transcripts totaling 6.5 hours of children taking a narrative comprehension test in English (L2) are presented, along with human-rated scores and annotations of grammatical and pronunciation errors.
The children also completed the parallel MAIN tests in Chinese (L1) for reference purposes.
arXiv Detail & Related papers (2023-04-30T10:41:43Z)
- BLASER: A Text-Free Speech-to-Speech Translation Evaluation Metric [66.73705349465207]
End-to-end speech-to-speech translation (S2ST) is generally evaluated with text-based metrics.
We propose a text-free evaluation metric for end-to-end S2ST, named BLASER, to avoid the dependency on ASR systems.
arXiv Detail & Related papers (2022-12-16T14:00:26Z)
- Limited Data Emotional Voice Conversion Leveraging Text-to-Speech: Two-stage Sequence-to-Sequence Training [91.95855310211176]
Emotional voice conversion aims to change the emotional state of an utterance while preserving the linguistic content and speaker identity.
We propose a novel 2-stage training strategy for sequence-to-sequence emotional voice conversion with a limited amount of emotional speech data.
The proposed framework can perform both spectrum and prosody conversion and achieves significant improvement over the state-of-the-art baselines in both objective and subjective evaluation.
arXiv Detail & Related papers (2021-03-31T04:56:14Z)
- UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data [54.733889961024445]
We propose a unified pre-training approach called UniSpeech to learn speech representations with both unlabeled and labeled data.
We evaluate the effectiveness of UniSpeech for cross-lingual representation learning on public CommonVoice corpus.
arXiv Detail & Related papers (2021-01-19T12:53:43Z)
- FT Speech: Danish Parliament Speech Corpus [21.190182627955817]
This paper introduces FT Speech, a new speech corpus created from the recorded meetings of the Danish Parliament.
The corpus contains over 1,800 hours of transcribed speech by a total of 434 speakers.
It is significantly larger in duration, vocabulary, and amount of spontaneous speech than the existing public speech corpora for Danish.
arXiv Detail & Related papers (2020-05-25T19:51:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.