A Neural Network Model of Lexical Competition during Infant Spoken Word
Recognition
- URL: http://arxiv.org/abs/2006.00999v1
- Date: Mon, 1 Jun 2020 15:04:11 GMT
- Title: A Neural Network Model of Lexical Competition during Infant Spoken Word
Recognition
- Authors: Mihaela Duta and Kim Plunkett
- Abstract summary: Visual world studies show that upon hearing a word in a target-absent visual context, toddlers and adults briefly direct their gaze towards phonologically related items.
We present a neural network model that processes dynamic unfolding phonological representations and maps them to static internal semantic and visual representations.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual world studies show that upon hearing a word in a target-absent visual
context containing related and unrelated items, toddlers and adults briefly
direct their gaze towards phonologically related items, before shifting towards
semantically and visually related ones. We present a neural network model that
processes dynamic unfolding phonological representations and maps them to
static internal semantic and visual representations. The model, trained on
representations derived from real corpora, simulates this early phonological
over semantic/visual preference. Our results support the hypothesis that
incremental unfolding of a spoken word is in itself sufficient to account for
the transient preference for phonological competitors over both unrelated and
semantically and visually related ones. Phonological representations mapped
dynamically in a bottom-up fashion to semantic-visual representations capture
the early phonological preference effects reported in a visual world task. The
semantic-visual preference observed later in such a trial does not require
top-down feedback from a semantic or visual system.
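The abstract describes an architecture that consumes a dynamically unfolding phonological input and maps it onto static semantic-visual representations. Below is a minimal, hypothetical sketch of that idea, not the authors' implementation: a small GRU reads one-hot phoneme segments one step at a time and is trained so that its final output matches a static target vector. The toy lexicon, random target vectors, layer sizes, and probing procedure are all assumptions made for illustration; the paper's model is trained on representations derived from real corpora.

```python
# Illustrative sketch only; architecture details, lexicon, and representations
# here are assumptions, not the paper's actual implementation.
import torch
import torch.nn as nn

# Toy phoneme inventory and lexicon ("_" pads short forms). Random vectors
# stand in for corpus-derived static semantic-visual representations.
PHONEMES = list("doglkat_")
LEXICON = {
    "dog": ("dog__", torch.randn(16)),   # target word
    "doll": ("dol__", torch.randn(16)),  # phonological competitor (shared onset)
    "cat": ("kat__", torch.randn(16)),   # stand-in for a semantic/visual competitor
}

def encode(word_form: str) -> torch.Tensor:
    """One-hot encode a phoneme string, one time step per segment."""
    idx = [PHONEMES.index(p) for p in word_form]
    return torch.eye(len(PHONEMES))[idx]          # (T, n_phonemes)

class UnfoldingMapper(nn.Module):
    """Maps an unfolding phoneme sequence to a static semantic-visual vector."""
    def __init__(self, n_phonemes: int, hidden: int = 32, sem_vis_dim: int = 16):
        super().__init__()
        self.rnn = nn.GRU(n_phonemes, hidden, batch_first=True)
        self.out = nn.Linear(hidden, sem_vis_dim)

    def forward(self, seq: torch.Tensor) -> torch.Tensor:
        h, _ = self.rnn(seq.unsqueeze(0))          # (1, T, hidden)
        return self.out(h.squeeze(0))              # (T, sem_vis_dim): output at every step

model = UnfoldingMapper(len(PHONEMES))
optim = torch.optim.Adam(model.parameters(), lr=1e-2)

# Train so that the output after the *full* unfolding matches each word's vector.
for _ in range(300):
    loss = torch.zeros(())
    for form, target in LEXICON.values():
        pred = model(encode(form))[-1]
        loss = loss + nn.functional.mse_loss(pred, target)
    optim.zero_grad()
    loss.backward()
    optim.step()

# Probe: as "dog" unfolds, compare the evolving output to each lexical item.
with torch.no_grad():
    outputs = model(encode(LEXICON["dog"][0]))     # (T, sem_vis_dim)
    for t, step_out in enumerate(outputs, start=1):
        sims = {w: torch.cosine_similarity(step_out, vec, dim=0).item()
                for w, (_, vec) in LEXICON.items()}
        print(f"after {t} phoneme(s): {sims}")
```

Probing the output after each phoneme is loosely analogous to measuring looking preferences as the word unfolds: with only the onset heard, the output may resemble an onset-sharing competitor before settling on the target. This is an interpretation of the bottom-up account in the abstract, not a reproduction of the reported simulations.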
Related papers
- The formation of perceptual space in early phonetic acquisition: a cross-linguistic modeling approach [0.0]
This study investigates how learners organize perceptual space in early phonetic acquisition.
It examines the shape of the learned hidden representation as well as its ability to categorize phonetic categories.
arXiv Detail & Related papers (2024-07-26T04:18:36Z) - Perception of Phonological Assimilation by Neural Speech Recognition Models [3.4173734484549625]
This article explores how the neural speech recognition model Wav2Vec2 perceives assimilated sounds.
Using psycholinguistic stimuli, we analyze how various linguistic context cues influence compensation patterns in the model's output.
arXiv Detail & Related papers (2024-06-21T15:58:22Z) - Iconic Gesture Semantics [87.00251241246136]
Informational evaluation is spelled out as extended exemplification (extemplification) in terms of perceptual classification of a gesture's visual iconic model.
We argue that the perceptual classification of instances of visual communication requires a notion of meaning different from Frege/Montague frameworks.
An iconic gesture semantics is introduced which covers the full range from gesture representations over model-theoretic evaluation to inferential interpretation in dynamic semantic frameworks.
arXiv Detail & Related papers (2024-04-29T13:58:03Z) - Identifying and interpreting non-aligned human conceptual
representations using language modeling [0.0]
We show that congenital blindness induces conceptual reorganization in both amodal and sensory-related verbal domains.
We find that blind individuals more strongly associate social and cognitive meanings with verbs related to motion.
For some verbs, the representations of blind and sighted individuals are highly similar.
arXiv Detail & Related papers (2024-03-10T13:02:27Z) - Self-supervised models of audio effectively explain human cortical
responses to speech [71.57870452667369]
We capitalize on the progress of self-supervised speech representation learning to create new state-of-the-art models of the human auditory system.
These results show that self-supervised models effectively capture the hierarchy of information relevant to different stages of speech processing in human cortex.
arXiv Detail & Related papers (2022-05-27T22:04:02Z) - Learnable Visual Words for Interpretable Image Recognition [70.85686267987744]
We propose Learnable Visual Words (LVW) to interpret model prediction behavior with two novel modules.
The semantic visual word learning relaxes the category-specific constraint, enabling general visual words to be shared across different categories.
Our experiments on six visual benchmarks demonstrate the superior effectiveness of our proposed LVW in both accuracy and model interpretation.
arXiv Detail & Related papers (2022-05-22T03:24:45Z) - Deep Neural Convolutive Matrix Factorization for Articulatory
Representation Decomposition [48.56414496900755]
This work uses a neural implementation of convolutive sparse matrix factorization to decompose the articulatory data into interpretable gestures and gestural scores.
Phoneme recognition experiments were additionally performed to show that gestural scores indeed code phonological information successfully.
arXiv Detail & Related papers (2022-04-01T14:25:19Z) - Perception Point: Identifying Critical Learning Periods in Speech for
Bilingual Networks [58.24134321728942]
We compare and identify cognitive aspects of deep neural-network-based visual lip-reading models.
We observe a strong correlation between theories in cognitive psychology and our modeling.
arXiv Detail & Related papers (2021-10-13T05:30:50Z) - Can phones, syllables, and words emerge as side-products of
cross-situational audiovisual learning? -- A computational investigation [2.28438857884398]
We study the so-called latent language hypothesis (LLH).
LLH connects linguistic representation learning to general predictive processing within and across sensory modalities.
We explore LLH further in extensive learning simulations with different neural network models for audiovisual cross-situational learning.
arXiv Detail & Related papers (2021-09-29T05:49:46Z) - Decomposing lexical and compositional syntax and semantics with deep
language models [82.81964713263483]
The activations of language transformers like GPT2 have been shown to linearly map onto brain activity during speech comprehension.
Here, we propose a taxonomy to factorize the high-dimensional activations of language models into four classes: lexical, compositional, syntactic, and semantic representations.
The results highlight two findings. First, compositional representations recruit a more widespread cortical network than lexical ones, and encompass the bilateral temporal, parietal and prefrontal cortices.
arXiv Detail & Related papers (2021-03-02T10:24:05Z)