Using Diachronic Distributed Word Representations as Models of Lexical
Development in Children
- URL: http://arxiv.org/abs/2105.05091v1
- Date: Tue, 11 May 2021 14:44:05 GMT
- Title: Using Diachronic Distributed Word Representations as Models of Lexical
Development in Children
- Authors: Arijit Gupta, Rajaswa Patil and Veeky Baths
- Abstract summary: We use diachronic distributed word representations to perform temporal modeling and analysis of lexical development in children.
We demonstrate the dynamics of growing lexical knowledge in children over time, compared with the saturated level of lexical knowledge in child-directed adult speech.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent work has shown that distributed word representations can encode
abstract semantic and syntactic information from child-directed speech. In this
paper, we use diachronic distributed word representations to perform temporal
modeling and analysis of lexical development in children. Unlike all previous
work, we use a temporally sliced speech corpus to learn distributed word
representations of child and child-directed speech. Through our modeling
experiments, we demonstrate the dynamics of growing lexical knowledge in
children over time, compared with the saturated level of lexical knowledge
in child-directed adult speech. We also fit linear mixed-effects models with
the rate of semantic change in the diachronic representations and word
frequencies. This allows us to inspect the role of word frequencies in
lexical development in children. Further, we perform a qualitative analysis of
the diachronic representations from our model, which reveals the categorization
and word associations in the mental lexicon of children.
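The modeling pipeline the abstract outlines can be sketched in a few lines. The following is a hedged illustration, not the authors' released code: it assumes word2vec embeddings trained per time slice, orthogonal Procrustes alignment between consecutive slices (a standard choice for diachronic embeddings), cosine distance as the rate of semantic change, and a mixed-effects regression of that rate on log word frequency. Slice granularity, libraries, and hyperparameters are all assumptions.
```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from gensim.models import Word2Vec
from scipy.linalg import orthogonal_procrustes

def train_slices(slices, dim=100):
    # slices: one list of tokenized utterances per time bin (e.g. per age window)
    return [Word2Vec(sentences=s, vector_size=dim, window=5,
                     min_count=5, sg=1, seed=0) for s in slices]

def align(base, other):
    # Rotate `other`'s vectors into `base`'s space over the shared vocabulary.
    shared = [w for w in base.wv.key_to_index if w in other.wv.key_to_index]
    A = np.stack([other.wv[w] for w in shared])
    B = np.stack([base.wv[w] for w in shared])
    R, _ = orthogonal_procrustes(A, B)
    return {w: other.wv[w] @ R for w in shared}

def change_rates(models):
    # Cosine distance between consecutive aligned slices, per word.
    rows = []
    for t in range(len(models) - 1):
        aligned = align(models[t], models[t + 1])
        for w, v_next in aligned.items():
            v = models[t].wv[w]
            cos = v @ v_next / (np.linalg.norm(v) * np.linalg.norm(v_next))
            rows.append({"word": w, "slice": t, "rate": 1.0 - cos,
                         "log_freq": np.log(models[t].wv.get_vecattr(w, "count"))})
    return pd.DataFrame(rows)

# Hypothetical usage, with `child_slices` holding tokenized child utterances
# binned by age:
#   models = train_slices(child_slices)
#   df = change_rates(models)
#   # Random intercept per word; fixed effect of frequency on change rate.
#   print(smf.mixedlm("rate ~ log_freq", df, groups=df["word"]).fit().summary())
```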
Related papers
- A model of early word acquisition based on realistic-scale audiovisual naming events [10.047470656294333]
We studied the extent to which early words can be acquired through statistical learning from regularities in audiovisual sensory input.
We simulated word learning in infants up to 12 months of age in a realistic setting, using a model that learns from statistical regularities in raw speech and pixel-level visual input.
Results show that the model effectively learns to recognize words and associate them with corresponding visual objects, with a vocabulary growth rate comparable to that observed in infants.
arXiv Detail & Related papers (2024-06-07T21:05:59Z)
- Caregiver Talk Shapes Toddler Vision: A Computational Study of Dyadic Play [8.164232628099619]
We propose a computational model of visual representation learning during dyadic play.
We show that utterances with statistics matching those of real caregivers give rise to representations supporting improved category recognition.
arXiv Detail & Related papers (2023-12-07T08:18:40Z)
- Visual Grounding Helps Learn Word Meanings in Low-Data Regimes [47.7950860342515]
Modern neural language models (LMs) are powerful tools for modeling human sentence production and comprehension.
But to achieve these results, LMs must be trained in distinctly un-human-like ways.
Do models trained more naturalistically -- with grounded supervision -- exhibit more humanlike language learning?
We investigate this question in the context of word learning, a key sub-task in language acquisition.
arXiv Detail & Related papers (2023-10-20T03:33:36Z)
- BabySLM: language-acquisition-friendly benchmark of self-supervised spoken language models [56.93604813379634]
Self-supervised techniques for learning speech representations have been shown to develop linguistic competence from exposure to speech without the need for human labels.
We propose a language-acquisition-friendly benchmark to probe spoken language models at the lexical and syntactic levels.
We highlight two exciting challenges that need to be addressed for further progress: bridging the gap between text and speech and between clean speech and in-the-wild speech.
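The lexical-level probe here is a spot-the-word task: the model should assign a higher score to a real word than to a matched pseudoword. A minimal sketch of that comparison follows, with a text LM (GPT-2) standing in for a spoken language model; the word/pseudoword pairs are invented, whereas the actual benchmark uses phonotactically matched pairs.
```python
# Sketch of a spot-the-word lexical probe: score real words vs. matched
# pseudowords with a causal LM. GPT-2 stands in for a spoken LM here.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def logprob(text):
    # Total log-probability of `text`, conditioning on a BOS token.
    ids = tok(tok.bos_token + text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = lm(ids, labels=ids)
    return -out.loss.item() * (ids.shape[1] - 1)  # mean NLL -> summed log-prob

pairs = [("rabbit", "rabbot"), ("water", "watter")]  # illustrative pairs only
accuracy = sum(logprob(w) > logprob(p) for w, p in pairs) / len(pairs)
print(f"spot-the-word accuracy: {accuracy:.2f}")
```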
arXiv Detail & Related papers (2023-06-02T12:54:38Z)
- Integrating Form and Meaning: A Multi-Task Learning Model for Acoustic Word Embeddings [19.195728241989702]
We propose a multi-task learning model that incorporates top-down lexical knowledge into the training procedure of acoustic word embeddings.
We experiment with three languages and demonstrate that incorporating lexical knowledge improves the embedding space discriminability.
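One plausible reading of this setup (an assumption, not the paper's exact architecture) is an acoustic encoder trained jointly on a contrastive word-discrimination objective over its embeddings and a word-classification objective that supplies the top-down lexical signal:
```python
# Illustrative multi-task acoustic word embedding model: a recurrent
# encoder yields a fixed-dimensional embedding (trained with a contrastive
# word-discrimination loss) while a classifier head over word identities
# injects top-down lexical knowledge. Architecture details are assumptions.
import torch.nn as nn

class MultiTaskAWE(nn.Module):
    def __init__(self, n_mels=40, dim=128, vocab_size=1000):
        super().__init__()
        self.enc = nn.GRU(n_mels, dim, batch_first=True)
        self.cls = nn.Linear(dim, vocab_size)

    def forward(self, x):              # x: (batch, frames, n_mels)
        _, h = self.enc(x)
        emb = h[-1]                    # fixed-dim acoustic word embedding
        return emb, self.cls(emb)      # embedding + lexical logits

# Joint loss = contrastive_loss(emb pairs) + alpha * cross_entropy(logits, word_ids),
# with the weight alpha a tunable assumption.
```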
arXiv Detail & Related papers (2022-09-14T13:33:04Z)
- VGSE: Visually-Grounded Semantic Embeddings for Zero-Shot Learning [113.50220968583353]
We propose to discover semantic embeddings containing discriminative visual properties for zero-shot learning.
Our model visually divides a set of images from seen classes into clusters of local image regions according to their visual similarity.
We demonstrate that our visually-grounded semantic embeddings further improve performance over word embeddings across various ZSL models by a large margin.
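A rough sketch of the clustering step as summarized above; the feature extractor, cluster count, and histogram pooling are all assumptions:
```python
# Sketch: cluster local image-region features from seen classes, then
# represent each image by its normalized cluster histogram, which serves
# as a visually-grounded semantic embedding. Details are assumptions.
import numpy as np
from sklearn.cluster import KMeans

def visual_cluster_embeddings(region_feats, image_ids, n_clusters=512):
    # region_feats: (N, D) features of N local regions; image_ids: (N,) owners
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(region_feats)
    n_images = max(image_ids) + 1
    emb = np.zeros((n_images, n_clusters))
    for img, c in zip(image_ids, km.labels_):
        emb[img, c] += 1
    return emb / np.maximum(emb.sum(axis=1, keepdims=True), 1)
```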
arXiv Detail & Related papers (2022-03-20T03:49:02Z)
- Perception Point: Identifying Critical Learning Periods in Speech for Bilingual Networks [58.24134321728942]
We identify and compare cognitive aspects of deep neural visual lip-reading models.
We observe a strong correspondence between critical-learning-period theories from cognitive psychology and our modeling results.
arXiv Detail & Related papers (2021-10-13T05:30:50Z)
- Word Acquisition in Neural Language Models [0.38073142980733]
We investigate how neural language models acquire individual words during training, extracting learning curves and ages of acquisition for over 600 words.
We find that the effects of concreteness, word length, and lexical class are pointedly different in children and language models.
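Age of acquisition can be read off a word's learning curve. As a hedged sketch: the paper fits sigmoids to surprisal curves, whereas this simplification uses a midpoint-crossing convention.
```python
# Sketch: take a word's age of acquisition as the first training step at
# which its surprisal falls halfway from its initial to its final value.
# (The underlying paper fits sigmoid curves; this midpoint rule is a
# simplifying assumption.)
import numpy as np

def age_of_acquisition(steps, surprisal):
    s = np.asarray(surprisal, dtype=float)
    cutoff = (s[0] + s[-1]) / 2.0
    crossed = np.nonzero(s <= cutoff)[0]
    return steps[crossed[0]] if crossed.size else None
```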
arXiv Detail & Related papers (2021-10-05T23:26:16Z)
- Decomposing lexical and compositional syntax and semantics with deep language models [82.81964713263483]
The activations of language transformers like GPT2 have been shown to linearly map onto brain activity during speech comprehension.
Here, we propose a taxonomy to factorize the high-dimensional activations of language models into four classes: lexical, compositional, syntactic, and semantic representations.
The results highlight two findings. First, compositional representations recruit a more widespread cortical network than lexical ones, and encompass the bilateral temporal, parietal and prefrontal cortices.
arXiv Detail & Related papers (2021-03-02T10:24:05Z)
- Using Known Words to Learn More Words: A Distributional Analysis of Child Vocabulary Development [0.0]
We investigated item-based variability in vocabulary development using lexical properties of distributional statistics.
We predicted word trajectories cross-sectionally, shedding light on trends in vocabulary development that may not have been evident at a single time point.
We also show that whether one looks at a single age group or across ages as a whole, the best distributional predictor of whether a child knows a word is the number of other known words with which that word tends to co-occur.
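That predictor is straightforward to compute from a co-occurrence matrix; a minimal sketch follows, where the matrix construction and the "known" set are assumptions:
```python
# Sketch of the predictor described above: for each candidate word, count
# the already-known words it co-occurs with in (assumed) child-directed
# speech. `cooc` is a V x V co-occurrence count matrix over `vocab`.
import numpy as np

def known_cooc_counts(cooc, vocab, known_words):
    idx = {w: i for i, w in enumerate(vocab)}
    known_ix = [idx[w] for w in known_words if w in idx]
    return {w: int((cooc[idx[w], known_ix] > 0).sum()) for w in vocab}
```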
arXiv Detail & Related papers (2020-09-15T01:18:21Z)
- Where New Words Are Born: Distributional Semantic Analysis of Neologisms and Their Semantic Neighborhoods [51.34667808471513]
We investigate the importance of two factors, semantic sparsity and frequency growth rates of semantic neighbors, formalized in the distributional semantics paradigm.
We show that both factors are predictive of word emergence, although we find more support for the latter hypothesis.
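Both factors can be operationalized in a distributional space; a hedged sketch, where the neighborhood size, smoothing, and the specific sparsity measure are assumptions:
```python
# Sketch: for a point in embedding space, estimate (a) semantic sparsity as
# one minus the mean cosine similarity of its k nearest neighbors, and
# (b) the mean frequency growth rate of those neighbors between two time
# points. All operationalizations here are assumptions.
import numpy as np

def neighborhood_stats(vec, emb, freq_t0, freq_t1, k=10):
    words = list(emb)
    M = np.stack([emb[w] for w in words])
    sims = M @ vec / (np.linalg.norm(M, axis=1) * np.linalg.norm(vec))
    top = np.argsort(-sims)[:k]
    sparsity = 1.0 - sims[top].mean()  # low neighbor similarity = sparse region
    growth = np.mean([(freq_t1[words[i]] + 1) / (freq_t0[words[i]] + 1)
                      for i in top])   # add-one smoothing
    return sparsity, growth
```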
arXiv Detail & Related papers (2020-01-21T19:09:49Z)