The formation of perceptual space in early phonetic acquisition: a cross-linguistic modeling approach
- URL: http://arxiv.org/abs/2407.18501v1
- Date: Fri, 26 Jul 2024 04:18:36 GMT
- Title: The formation of perceptual space in early phonetic acquisition: a cross-linguistic modeling approach
- Authors: Frank Lihui Tan, Youngah Do
- Abstract summary: This study investigates how learners organize perceptual space in early phonetic acquisition.
It examines the shape of the learned hidden representation as well as its ability to categorize phonetic categories.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This study investigates how learners organize perceptual space in early phonetic acquisition by advancing previous studies in two key aspects. Firstly, it examines the shape of the learned hidden representation as well as its ability to categorize phonetic categories. Secondly, it explores the impact of training models on context-free acoustic information, without involving contextual cues, on phonetic acquisition, closely mimicking the early language learning stage. Using a cross-linguistic modeling approach, autoencoder models are trained on English and Mandarin and evaluated in both native and non-native conditions, following experimental conditions used in infant language perception studies. The results demonstrate that unsupervised bottom-up training on context-free acoustic information leads to comparable learned representations of perceptual space between native and non-native conditions for both English and Mandarin, resembling the early stage of universal listening in infants. These findings provide insights into the organization of perceptual space during early phonetic acquisition and contribute to our understanding of the formation and representation of phonetic categories.
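To make the modeling setup concrete, here is a minimal sketch of the kind of pipeline the abstract describes: an autoencoder trained to reconstruct individual context-free acoustic frames, with the bottleneck activations taken as the learned perceptual space. The MFCC input features, layer sizes, and training loop are illustrative assumptions, not the authors' exact configuration.

```python
# Illustrative sketch (assumed details): an autoencoder over context-free
# acoustic frames, e.g. MFCC vectors, trained without labels. The bottleneck
# activations stand in for the learned perceptual space.
import torch
import torch.nn as nn

class FrameAutoencoder(nn.Module):
    def __init__(self, n_features: int = 13, n_hidden: int = 8):
        super().__init__()
        # Encoder compresses one acoustic frame into a low-dimensional code.
        self.encoder = nn.Sequential(
            nn.Linear(n_features, 32), nn.Tanh(), nn.Linear(32, n_hidden))
        # Decoder reconstructs the frame from the code.
        self.decoder = nn.Sequential(
            nn.Linear(n_hidden, 32), nn.Tanh(), nn.Linear(32, n_features))

    def forward(self, x):
        z = self.encoder(x)        # hidden representation under study
        return self.decoder(z), z

def train(model, frames, epochs=20, lr=1e-3):
    """Unsupervised bottom-up training: reconstruct frames, no phone labels."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        recon, _ = model(frames)
        loss = nn.functional.mse_loss(recon, frames)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model

# Placeholder frames; a real run would load MFCCs from an English or
# Mandarin corpus and train one model per language.
frames_en = torch.randn(1000, 13)
model_en = train(FrameAutoencoder(), frames_en)
_, perceptual_space = model_en(frames_en)
```

In the paper's cross-linguistic design, one such model would be trained per language (English and Mandarin), and each would then be probed on both native and non-native stimuli.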
Related papers
- Audio-Visual Neural Syntax Acquisition [91.14892278795892]
We study phrase structure induction from visually-grounded speech.
We present the Audio-Visual Neural Syntax Learner (AV-NSL) that learns phrase structure by listening to audio and looking at images, without ever being exposed to text.
arXiv Detail & Related papers (2023-10-11T16:54:57Z)
- BabySLM: language-acquisition-friendly benchmark of self-supervised spoken language models [56.93604813379634]
Self-supervised techniques for learning speech representations have been shown to develop linguistic competence from exposure to speech without the need for human labels.
We propose a language-acquisition-friendly benchmark to probe spoken language models at the lexical and syntactic levels.
We highlight two exciting challenges that need to be addressed for further progress: bridging the gap between text and speech and between clean speech and in-the-wild speech.
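To illustrate what a lexical-level probe of a spoken language model can look like, the sketch below checks whether a scoring function prefers real words over matched pseudowords; the scorer and the word pairs are toy placeholders, not the benchmark's actual data or interface.

```python
# Illustrative sketch of a lexical probe: a spoken language model should
# score real words above matched pseudowords. The scorer and word pairs
# below are toy placeholders, not the benchmark's data or API.
from typing import Callable, List, Tuple

def lexical_accuracy(score: Callable[[str], float],
                     pairs: List[Tuple[str, str]]) -> float:
    """Fraction of (word, pseudoword) pairs where the real word scores higher."""
    hits = sum(score(word) > score(pseudo) for word, pseudo in pairs)
    return hits / len(pairs)

# Toy scorer: a tiny closed vocabulary; a real probe would use the model's
# (pseudo-)log-likelihood of the spoken or written form instead.
known_words = {"dog", "water", "baby"}
toy_score = lambda w: 1.0 if w in known_words else 0.0

pairs = [("dog", "dof"), ("water", "wazer"), ("baby", "bapy")]
print(lexical_accuracy(toy_score, pairs))  # 1.0 for this toy scorer
```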
arXiv Detail & Related papers (2023-06-02T12:54:38Z)
- Self-supervised models of audio effectively explain human cortical responses to speech [71.57870452667369]
We capitalize on the progress of self-supervised speech representation learning to create new state-of-the-art models of the human auditory system.
These results show that self-supervised models effectively capture the hierarchy of information relevant to different stages of speech processing in the human cortex.
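Claims of this kind are typically tested with an encoding model: regress recorded brain responses on the speech model's layer activations and measure held-out prediction accuracy. The sketch below runs that recipe on synthetic stand-in data; the feature dimensions, response matrix, and ridge penalty are assumptions for illustration.

```python
# Illustrative encoding-model sketch on synthetic data: predict "cortical
# responses" from a speech model's layer activations with ridge regression.
# A real analysis would use recorded fMRI/ECoG responses and cross-validation.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
activations = rng.standard_normal((500, 768))    # stand-in layer features
true_map = rng.standard_normal((768, 50)) * 0.1
responses = activations @ true_map + rng.standard_normal((500, 50))

X_tr, X_te, y_tr, y_te = train_test_split(activations, responses,
                                          random_state=0)
model = Ridge(alpha=10.0).fit(X_tr, y_tr)
# Higher held-out R^2 from deeper layers would suggest the model tracks
# later stages of cortical speech processing.
print("held-out R^2:", model.score(X_te, y_te))
```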
arXiv Detail & Related papers (2022-05-27T22:04:02Z)
- Perception Point: Identifying Critical Learning Periods in Speech for Bilingual Networks [58.24134321728942]
We compare and identify cognitive aspects of deep neural-network-based visual lip-reading models.
We observe a strong correlation between theories of critical learning periods in cognitive psychology and the behavior of our models.
arXiv Detail & Related papers (2021-10-13T05:30:50Z)
- Can phones, syllables, and words emerge as side-products of cross-situational audiovisual learning? -- A computational investigation [2.28438857884398]
We study the so-called latent language hypothesis (LLH).
LLH connects linguistic representation learning to general predictive processing within and across sensory modalities.
We explore LLH further in extensive learning simulations with different neural network models for audiovisual cross-situational learning.
arXiv Detail & Related papers (2021-09-29T05:49:46Z)
- Wav-BERT: Cooperative Acoustic and Linguistic Representation Learning for Low-Resource Speech Recognition [159.9312272042253]
Wav-BERT is a cooperative acoustic and linguistic representation learning method.
We unify a pre-trained acoustic model (wav2vec 2.0) and a language model (BERT) into an end-to-end trainable framework.
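As a loose illustration of such a coupling (not Wav-BERT's actual fusion design), one can project wav2vec 2.0 frame representations into BERT's hidden space and pass them in as input embeddings; the model checkpoints and the linear bridge below are assumptions for the sketch.

```python
# Illustrative sketch only, NOT Wav-BERT's actual fusion design: project
# wav2vec 2.0 frame features into BERT's hidden space and feed them in as
# input embeddings. Checkpoints and the linear bridge are assumptions.
import torch
import torch.nn as nn
from transformers import Wav2Vec2Model, BertModel

acoustic = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base-960h")
linguistic = BertModel.from_pretrained("bert-base-uncased")
bridge = nn.Linear(acoustic.config.hidden_size, linguistic.config.hidden_size)

waveform = torch.randn(1, 16000)                # one second of 16 kHz audio
frames = acoustic(waveform).last_hidden_state   # (1, T, 768) acoustic features
fused = linguistic(inputs_embeds=bridge(frames)).last_hidden_state
# An end-to-end system would train all three modules jointly on an ASR loss.
print(fused.shape)
```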
arXiv Detail & Related papers (2021-09-19T16:39:22Z)
- Leveraging Pre-trained Language Model for Speech Sentiment Analysis [58.78839114092951]
We explore the use of pre-trained language models to learn sentiment information of written texts for speech sentiment analysis.
We propose a pseudo-label-based semi-supervised training strategy that uses a language model within an end-to-end speech sentiment analysis approach.
arXiv Detail & Related papers (2021-06-11T20:15:21Z)
- Evaluating computational models of infant phonetic learning across languages [31.587496924289972]
In the first year of life, infants' speech perception becomes attuned to the sounds of their native language.
Many accounts of this early phonetic learning exist, but computational models predicting the patterns observed in infants from the speech input they hear have been lacking.
Here we study five such algorithms, selected for their potential cognitive relevance. We simulate phonetic learning with each algorithm and perform tests on three phone contrasts from different languages, comparing the results to infants' discrimination patterns.
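A common way to score such discrimination tests is an ABX-style comparison over learned representations: a contrast counts as discriminated when tokens sit closer to their own category than to the competing one. The sketch below implements that logic on placeholder embeddings; the Euclidean metric and synthetic data are assumptions, not the paper's exact procedure.

```python
# Illustrative ABX-style discrimination sketch on placeholder embeddings:
# a contrast is discriminated when a token X from category A lies closer
# to another A exemplar than to a B exemplar.
import numpy as np

def abx_accuracy(cat_a: np.ndarray, cat_b: np.ndarray) -> float:
    """Fraction of (x, a, b) triples, x and a from A, with d(x,a) < d(x,b)."""
    correct, total = 0, 0
    for i, x in enumerate(cat_a):
        for j, a in enumerate(cat_a):
            if i == j:
                continue
            for b in cat_b:
                correct += float(np.linalg.norm(x - a) < np.linalg.norm(x - b))
                total += 1
    return correct / total

rng = np.random.default_rng(0)
tokens_a = rng.normal(0.0, 1.0, (10, 8))  # placeholder embeddings, category A
tokens_b = rng.normal(2.0, 1.0, (10, 8))  # placeholder embeddings, category B
print("ABX accuracy:", abx_accuracy(tokens_a, tokens_b))  # near 1.0 here
```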
arXiv Detail & Related papers (2020-08-06T22:07:45Z)
- A Neural Network Model of Lexical Competition during Infant Spoken Word Recognition [0.0]
Visual world studies show that upon hearing a word in a target-absent visual context, toddlers and adults briefly direct their gaze towards phonologically related items.
We present a neural network model that processes dynamic unfolding phonological representations and maps them to static internal semantic and visual representations.
arXiv Detail & Related papers (2020-06-01T15:04:11Z)