Proficiency assessment of L2 spoken English using wav2vec 2.0
- URL: http://arxiv.org/abs/2210.13168v1
- Date: Mon, 24 Oct 2022 12:36:49 GMT
- Title: Proficiency assessment of L2 spoken English using wav2vec 2.0
- Authors: Stefano Bannò and Marco Matassoni
- Abstract summary: We use wav2vec 2.0 for assessing overall and individual aspects of proficiency on two small datasets.
We find that this approach significantly outperforms the BERT-based baseline system trained on ASR and manual transcriptions used for comparison.
- Score: 3.4012007729454816
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The increasing demand for learning English as a second language has led to a
growing interest in methods for automatically assessing spoken language
proficiency. Most approaches use hand-crafted features, but their efficacy
relies on their particular underlying assumptions and they risk discarding
potentially salient information about proficiency. Other approaches rely on
transcriptions produced by ASR systems which may not provide a faithful
rendition of a learner's utterance in specific scenarios (e.g., non-native
children's spontaneous speech). Furthermore, transcriptions do not yield any
information about relevant aspects such as intonation, rhythm or prosody. In
this paper, we investigate the use of wav2vec 2.0 for assessing overall and
individual aspects of proficiency on two small datasets, one of which is
publicly available. We find that this approach significantly outperforms the
BERT-based baseline system trained on ASR and manual transcriptions used for
comparison.
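The abstract describes scoring proficiency directly from wav2vec 2.0 speech representations rather than from transcriptions. The exact prediction head is not specified in this summary; the sketch below is a minimal illustration only, assuming mean pooling over frame-level embeddings followed by a linear scoring layer, with random arrays standing in for the encoder output (the 768-dimensional frame embeddings a wav2vec 2.0 base model would produce).

```python
import numpy as np

# Illustrative sketch, not the authors' code: a wav2vec 2.0 encoder maps raw
# audio to a sequence of frame-level embeddings; a small head then pools them
# over time and predicts a scalar proficiency score.

rng = np.random.default_rng(0)

def pool_and_score(frame_embeddings: np.ndarray, w: np.ndarray, b: float) -> float:
    """Mean-pool frame embeddings over time, then apply a linear scoring head."""
    pooled = frame_embeddings.mean(axis=0)   # (D,) utterance-level representation
    return float(pooled @ w + b)             # scalar proficiency score

T, D = 200, 768                              # frames x embedding dimension
frames = rng.standard_normal((T, D))         # stand-in for wav2vec 2.0 output
w = rng.standard_normal(D) / np.sqrt(D)      # hypothetical trained head weights
score = pool_and_score(frames, w, 0.0)
```

In practice the head would be trained (e.g., with a regression loss against holistic or aspect-specific proficiency ratings), and the encoder itself can be fine-tuned rather than frozen; this sketch only shows the pooling-plus-linear-head shape of such a system.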
Related papers
- Prosody in Cascade and Direct Speech-to-Text Translation: a case study on Korean Wh-Phrases [79.07111754406841]
This work proposes using contrastive evaluation to measure the ability of direct S2TT systems to disambiguate utterances where prosody plays a crucial role.
Our results clearly demonstrate the value of direct translation systems over cascade translation models.
arXiv Detail & Related papers (2024-02-01T14:46:35Z)
- Leveraging Cross-Lingual Transfer Learning in Spoken Named Entity Recognition Systems [1.2494184403263342]
We apply transfer learning techniques across Dutch, English, and German using both pipeline and End-to-End approaches.
We employ Wav2Vec2 XLS-R models on custom pseudo-annotated datasets to evaluate the adaptability of cross-lingual systems.
arXiv Detail & Related papers (2023-07-03T19:30:24Z)
- Automatic Speech Recognition of Non-Native Child Speech for Language Learning Applications [18.849741353784328]
We assess the performance of two state-of-the-art ASR systems, Wav2Vec2.0 and Whisper AI.
We evaluate their performance on read and extemporaneous speech of native and non-native Dutch children.
arXiv Detail & Related papers (2023-06-29T06:14:26Z)
- Hindi as a Second Language: Improving Visually Grounded Speech with Semantically Similar Samples [89.16814518860357]
The objective of this work is to explore the learning of visually grounded speech models (VGS) from a multilingual perspective.
Our key contribution in this work is to leverage the power of a high-resource language in a bilingual visually grounded speech model to improve the performance of a low-resource language.
arXiv Detail & Related papers (2023-03-30T16:34:10Z)
- L2 proficiency assessment using self-supervised speech representations [35.70742768910494]
This work extends the initial analysis conducted on a self-supervised speech representation based scheme, requiring no speech recognition, to a large scale proficiency test.
The performance of the self-supervised, wav2vec 2.0, system is compared to a high performance hand-crafted assessment system and a BERT-based text system.
Though the wav2vec 2.0 based system is found to be sensitive to the nature of the response, it can be configured to yield comparable performance to systems requiring a speech transcription.
arXiv Detail & Related papers (2022-11-16T11:47:20Z)
- Knowledge-Rich BERT Embeddings for Readability Assessment [0.0]
We propose an alternative way of utilizing the information-rich embeddings of BERT models through a joint-learning method.
Results show that the proposed method outperforms classical approaches in readability assessment using English and Filipino datasets.
arXiv Detail & Related papers (2021-06-15T07:37:48Z)
- Leveraging Pre-trained Language Model for Speech Sentiment Analysis [58.78839114092951]
We explore the use of pre-trained language models to learn sentiment information of written texts for speech sentiment analysis.
We propose a pseudo label-based semi-supervised training strategy using a language model on an end-to-end speech sentiment approach.
arXiv Detail & Related papers (2021-06-11T20:15:21Z)
- Improving Cross-Lingual Reading Comprehension with Self-Training [62.73937175625953]
Current state-of-the-art models even surpass human performance on several benchmarks.
Previous works have revealed the abilities of pre-trained multilingual models for zero-shot cross-lingual reading comprehension.
This paper further utilizes unlabeled data to improve performance.
arXiv Detail & Related papers (2021-05-08T08:04:30Z)
- Building Low-Resource NER Models Using Non-Speaker Annotation [58.78968578460793]
Cross-lingual methods have had notable success in addressing these concerns.
We propose a complementary approach to building low-resource Named Entity Recognition (NER) models using "non-speaker" (NS) annotations.
We show that use of NS annotators produces results that are consistently on par or better than cross-lingual methods built on modern contextual representations.
arXiv Detail & Related papers (2020-06-17T03:24:38Z)
- Syntactic Structure Distillation Pretraining For Bidirectional Encoders [49.483357228441434]
We introduce a knowledge distillation strategy for injecting syntactic biases into BERT pretraining.
We distill the approximate marginal distribution over words in context from the syntactic LM.
Our findings demonstrate the benefits of syntactic biases, even in representation learners that exploit large amounts of data.
arXiv Detail & Related papers (2020-05-27T16:44:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.