Training language models for deeper understanding improves brain
alignment
- URL: http://arxiv.org/abs/2212.10898v1
- Date: Wed, 21 Dec 2022 10:15:19 GMT
- Title: Training language models for deeper understanding improves brain
alignment
- Authors: Khai Loong Aw, Mariya Toneva
- Abstract summary: Building systems that achieve a deeper understanding of language is one of the central goals of natural language processing (NLP).
We show that training language models for deeper narrative understanding results in richer representations that have improved alignment to human brain activity.
- Score: 5.678337324555035
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Building systems that achieve a deeper understanding of language is one of
the central goals of natural language processing (NLP). Towards this goal,
recent works have begun to train language models on narrative datasets which
require extracting the most critical information by integrating across long
contexts. However, it is still an open question whether these models are
learning a deeper understanding of the text, or if the models are simply
learning a heuristic to complete the task. This work investigates this further
by turning to the one language processing system that truly understands complex
language: the human brain. We show that training language models for deeper
narrative understanding results in richer representations that have improved
alignment to human brain activity. We further find that the improvements in
brain alignment are larger for character names than for other discourse
features, which indicates that these models are learning important narrative
elements. Taken together, these results suggest that this type of training can
indeed lead to deeper language understanding. These findings have consequences
both for cognitive neuroscience by revealing some of the significant factors
behind brain-NLP alignment, and for NLP by highlighting that understanding of
long-range context can be improved beyond language modeling.
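To make the alignment measure concrete: work in this line typically fits an encoding model from LM representations to fMRI recordings and scores held-out prediction accuracy. Below is a minimal sketch of that standard pipeline, with placeholder arrays (`lm_features` and `fmri` are hypothetical names with random data); it illustrates the paradigm, not the paper's exact implementation.

```python
# Minimal encoding-model sketch of "brain alignment" (hypothetical data).
# lm_features: (n_samples, n_dims) LM representations per fMRI time point;
# fmri: (n_samples, n_voxels) BOLD responses. Names are illustrative only.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
lm_features = rng.standard_normal((200, 64))   # placeholder LM representations
fmri = rng.standard_normal((200, 500))         # placeholder voxel responses

def brain_alignment(X, Y, n_splits=5):
    """Mean cross-validated Pearson r between predicted and actual voxel activity."""
    fold_scores = []
    for train, test in KFold(n_splits=n_splits).split(X):
        ridge = RidgeCV(alphas=np.logspace(-2, 4, 7)).fit(X[train], Y[train])
        pred = ridge.predict(X[test])
        # Pearson r per voxel: mean product of z-scored predictions and targets.
        pz = (pred - pred.mean(0)) / pred.std(0)
        yz = (Y[test] - Y[test].mean(0)) / Y[test].std(0)
        fold_scores.append((pz * yz).mean(0))
    return float(np.mean(fold_scores))

print(f"alignment (mean voxelwise r): {brain_alignment(lm_features, fmri):.3f}")
```

In the paper's comparison, the quantity of interest would be whether this score rises after training the model for deeper narrative understanding, relative to a language-modeling-only baseline on the same stimuli.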
Related papers
- Language Evolution with Deep Learning [49.879239655532324]
Computational modeling plays an essential role in the study of language emergence.
It aims to simulate the conditions and learning processes that could trigger the emergence of a structured language.
This chapter explores a class of computational models that has recently revolutionized machine learning: deep learning models.
arXiv Detail & Related papers (2024-03-18T16:52:54Z)
- Causal Graph in Language Model Rediscovers Cortical Hierarchy in Human Narrative Processing [0.0]
Previous studies have demonstrated that the features of language models can be mapped to fMRI brain activity.
This raises the question: is there a commonality between information processing in language models and the human brain?
To estimate information flow patterns in a language model, we examined the causal relationships between different layers.
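The summary leaves the estimation method unspecified; one simple stand-in for inter-layer influence (an assumption here, not necessarily this paper's technique) is how well one layer's activations linearly predict another's. A toy sketch with simulated activations:

```python
# Toy proxy for "information flow" between layers: R^2 of predicting layer j's
# activations from layer i's. Illustrative stand-in, not the paper's method.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n_tokens, dim, n_layers = 500, 32, 4
# Hypothetical activations: each layer is a noisy linear map of the previous.
layers = [rng.standard_normal((n_tokens, dim))]
for _ in range(n_layers - 1):
    layers.append(layers[-1] @ rng.standard_normal((dim, dim)) / dim**0.5
                  + 0.5 * rng.standard_normal((n_tokens, dim)))

flow = np.zeros((n_layers, n_layers))
for i in range(n_layers):
    for j in range(n_layers):
        if i != j:
            fit = LinearRegression().fit(layers[i], layers[j])
            flow[i, j] = fit.score(layers[i], layers[j])
print(np.round(flow, 2))  # in this toy, nearby layers show the strongest links
```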
arXiv Detail & Related papers (2023-11-17T10:09:12Z)
- Roles of Scaling and Instruction Tuning in Language Perception: Model vs. Human Attention [58.817405319722596]
This work compares the self-attention of several large language models (LLMs) of different sizes to assess the effect of scaling and instruction tuning on language perception.
Results show that scaling enhances human resemblance and improves effective attention by reducing reliance on trivial patterns, while instruction tuning does not.
We also find that current LLMs are consistently closer to non-native than to native speakers in attention, suggesting sub-optimal language perception across all models.
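A hedged sketch of the kind of comparison described: correlate a model's per-token attention mass with a human attention measure (for example, normalized fixation durations) over the same tokens. Both arrays below are random placeholders; in practice `model_attn` would be read off a transformer's attention maps and `human_attn` from eye-tracking data.

```python
# Sketch: similarity between model and human attention over one sentence.
# All data here are hypothetical placeholders.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(2)
n_tokens = 40
# Attention mass the model assigns to each token (in practice, column sums of
# self-attention averaged over heads and layers, renormalized to sum to 1).
model_attn = rng.dirichlet(np.ones(n_tokens))
# Normalized human fixation durations over the same tokens.
human_attn = rng.dirichlet(np.ones(n_tokens))

rho, p = spearmanr(model_attn, human_attn)
print(f"model-human attention similarity: rho={rho:.3f} (p={p:.3f})")
```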
arXiv Detail & Related papers (2023-10-29T17:16:40Z)
- Visual Grounding Helps Learn Word Meanings in Low-Data Regimes [47.7950860342515]
Modern neural language models (LMs) are powerful tools for modeling human sentence production and comprehension.
But to achieve these results, LMs must be trained in distinctly un-human-like ways.
Do models trained more naturalistically -- with grounded supervision -- exhibit more humanlike language learning?
We investigate this question in the context of word learning, a key sub-task in language acquisition.
arXiv Detail & Related papers (2023-10-20T03:33:36Z)
- Retentive or Forgetful? Diving into the Knowledge Memorizing Mechanism of Language Models [49.39276272693035]
Large-scale pre-trained language models have shown remarkable memorizing ability.
Vanilla neural networks without pre-training have long been observed to suffer from catastrophic forgetting.
We find that 1) vanilla language models are forgetful; 2) pre-training leads to retentive language models; and 3) knowledge relevance and diversification significantly influence memory formation.
arXiv Detail & Related papers (2023-05-16T03:50:38Z)
- Joint processing of linguistic properties in brains and language models [14.997785690790032]
We investigate the correspondence between the detailed processing of linguistic information by the human brain versus language models.
We find that elimination of specific linguistic properties results in a significant decrease in brain alignment.
These findings provide clear evidence for the role of specific linguistic information in the alignment between brain and language models.
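One common way to operationalize "eliminating a specific linguistic property" (an assumption; the summary does not state the paper's removal method) is to regress the property out of the model's representations and re-measure alignment. A self-contained toy where the drop is visible by construction:

```python
# Sketch: remove one linguistic property from LM representations by linear
# projection, then compare brain alignment before and after. Hypothetical data;
# the removal method is an assumption, not necessarily the paper's.
import numpy as np
from sklearn.linear_model import RidgeCV

rng = np.random.default_rng(3)
n, d = 300, 48
prop = rng.standard_normal((n, 1))              # e.g., a word-length feature
X = rng.standard_normal((n, d)) + prop          # LM features carrying the property
Y = 0.5 * prop + rng.standard_normal((n, 200))  # voxels partly driven by it

def alignment(X, Y):
    """In-sample voxelwise Pearson r of a ridge map (cross-validate in practice)."""
    pred = RidgeCV().fit(X, Y).predict(X)
    pz = (pred - pred.mean(0)) / pred.std(0)
    yz = (Y - Y.mean(0)) / Y.std(0)
    return (pz * yz).mean()

# Regress the property out of every feature dimension (least-squares residual).
beta, *_ = np.linalg.lstsq(prop, X, rcond=None)
X_ablated = X - prop @ beta

print(f"full features:    {alignment(X, Y):.3f}")
print(f"property removed: {alignment(X_ablated, Y):.3f}")  # expected to drop
```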
arXiv Detail & Related papers (2022-12-15T19:13:42Z)
- Towards Zero-shot Language Modeling [90.80124496312274]
We construct a neural model that is inductively biased towards learning human languages.
We infer this inductive bias, expressed as a distribution over languages, from a sample of typologically diverse training languages.
We harness additional language-specific side information as distant supervision for held-out languages.
arXiv Detail & Related papers (2021-08-06T23:49:18Z)
- Low-Dimensional Structure in the Space of Language Representations is Reflected in Brain Responses [62.197912623223964]
We show a low-dimensional structure where language models and translation models smoothly interpolate between word embeddings, syntactic and semantic tasks, and future word embeddings.
We find that this representation embedding can predict how well each individual feature space maps to human brain responses to natural language stimuli recorded using fMRI.
This suggests that the embedding captures some part of the brain's natural language representation structure.
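The "representation embedding" can be pictured with a toy: give each feature space a signature vector (here, a simulated per-voxel prediction profile) and test for low-dimensional structure with PCA. Everything below is a placeholder illustration, not the paper's data or method details.

```python
# Sketch: check for low-dimensional structure across feature spaces, using each
# space's (simulated) per-voxel brain-prediction profile as its signature.
import numpy as np

rng = np.random.default_rng(4)
n_spaces, n_voxels = 10, 1000
# Simulate profiles that actually live near a 2-D subspace, plus noise.
basis = rng.standard_normal((2, n_voxels))
perf = (rng.standard_normal((n_spaces, 2)) @ basis
        + 0.1 * rng.standard_normal((n_spaces, n_voxels)))

# PCA via SVD of the centered profiles.
s = np.linalg.svd(perf - perf.mean(0), compute_uv=False)
var_explained = s**2 / (s**2).sum()
print(np.round(var_explained[:4], 3))  # first two components dominate
```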
arXiv Detail & Related papers (2021-06-09T22:59:12Z)
- Understanding and Enhancing the Use of Context for Machine Translation [2.367786892039871]
This thesis focuses on understanding how neural models can exploit context and on designing augmentation models that benefit from it.
To translate from a source language to a target language, a neural model has to understand the meaning of constituents in the provided context.
Looking more deeply into the role of context and the impact of data on learning models is essential to advancing the NLP field.
arXiv Detail & Related papers (2021-02-20T20:19:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.