Is the Computation of Abstract Sameness Relations Human-Like in Neural Language Models?
- URL: http://arxiv.org/abs/2205.06149v1
- Date: Thu, 12 May 2022 15:19:54 GMT
- Title: Is the Computation of Abstract Sameness Relations Human-Like in Neural Language Models?
- Authors: Lukas Thoma, Benjamin Roth
- Abstract summary: This work explores whether state-of-the-art NLP models exhibit elementary mechanisms known from human cognition.
The computation of "abstract sameness relations" is assumed to play an important role in human language acquisition and processing.
- Score: 4.0810783261728565
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, deep neural language models have made strong progress in
various NLP tasks. This work explores one facet of the question of whether
state-of-the-art NLP models exhibit elementary mechanisms known from human
cognition. The exploration is focused on a relatively primitive mechanism for
which there is substantial evidence from various psycholinguistic experiments with
infants. The computation of "abstract sameness relations" is assumed to play an
important role in human language acquisition and processing, especially in
learning more complex grammar rules. In order to investigate this mechanism in
BERT and other pre-trained language models (PLMs), the experiment designs from
studies with infants were taken as the starting point. On this basis, we
designed experimental settings in which each element from the original studies
was mapped to a component of language models. Even though the task in our
experiments was relatively simple, the results suggest that the cognitive
faculty of computing abstract sameness relations is stronger in infants than in
all investigated PLMs.
Related papers
- Brain-Like Language Processing via a Shallow Untrained Multihead Attention Network [16.317199232071232]
Large Language Models (LLMs) have been shown to be effective models of the human language system.
In this work, we investigate the key architectural components driving the surprising alignment of untrained models.
arXiv Detail & Related papers (2024-06-21T12:54:03Z)
- Sharing Matters: Analysing Neurons Across Languages and Tasks in LLMs [70.3132264719438]
We aim to fill the research gap by examining how neuron activation is shared across tasks and languages.
We classify neurons into four distinct categories based on their responses to a specific input across different languages.
Our analysis reveals the following insights: (i) the patterns of neuron sharing are significantly affected by the characteristics of tasks and examples; (ii) neuron sharing does not fully correspond with language similarity; (iii) shared neurons play a vital role in generating responses, especially those shared across all languages.
arXiv Detail & Related papers (2024-06-13T16:04:11Z) - Language Evolution with Deep Learning [49.879239655532324]
Computational modeling plays an essential role in the study of language emergence.
It aims to simulate the conditions and learning processes that could trigger the emergence of a structured language.
This chapter explores another class of computational models that have recently revolutionized the field of machine learning: deep learning models.
arXiv Detail & Related papers (2024-03-18T16:52:54Z) - Exploring Spatial Schema Intuitions in Large Language and Vision Models [8.944921398608063]
We investigate whether large language models (LLMs) effectively capture implicit human intuitions about building blocks of language.
Surprisingly, correlations between model outputs and human responses emerge, revealing adaptability without a tangible connection to embodied experiences.
This research contributes to a nuanced understanding of the interplay between language, spatial experiences, and computations made by large language models.
arXiv Detail & Related papers (2024-02-01T19:25:50Z)
- Neural Language Models are not Born Equal to Fit Brain Data, but Training Helps [75.84770193489639]
We examine the impact of test loss, training corpus and model architecture on the prediction of functional Magnetic Resonance Imaging timecourses of participants listening to an audiobook.
We find that untrained versions of each model already explain a significant amount of signal in the brain by capturing similarity in brain responses across identical words.
We suggest good practices for future studies aiming at explaining the human language system using neural language models.
arXiv Detail & Related papers (2022-07-07T15:37:17Z)
- Dependency-based Mixture Language Models [53.152011258252315]
We introduce the Dependency-based Mixture Language Models.
In detail, we first train neural language models with a novel dependency modeling objective.
We then formulate the next-token probability by mixing the previous dependency modeling probability distributions with self-attention.
arXiv Detail & Related papers (2022-03-19T06:28:30Z)
- A Novel Ontology-guided Attribute Partitioning Ensemble Learning Model for Early Prediction of Cognitive Deficits using Quantitative Structural MRI in Very Preterm Infants [3.731292216299279]
Brain maturation and geometric features can be used with machine learning models for predicting later neurodevelopmental deficits.
We developed an ensemble learning framework, which is referred to as OAP Ensemble Learning (OAP-EL)
We applied the OAP-EL to predict cognitive deficits at 2 years of age using quantitative brain maturation and geometric features obtained at term-equivalent age in very preterm infants.
arXiv Detail & Related papers (2022-02-08T20:26:42Z)
- Schrödinger's Tree -- On Syntax and Neural Language Models [10.296219074343785]
Language models have emerged as NLP's workhorse, displaying increasingly fluent generation capabilities.
We observe a lack of clarity across numerous dimensions, which influences the hypotheses that researchers form.
We outline the implications of the different types of research questions exhibited in studies on syntax.
arXiv Detail & Related papers (2021-10-17T18:25:23Z)
- Mechanisms for Handling Nested Dependencies in Neural-Network Language Models and Humans [75.15855405318855]
We studied whether a modern artificial neural network trained with "deep learning" methods mimics a central aspect of human sentence processing.
Although the network was solely trained to predict the next word in a large corpus, analysis showed the emergence of specialized units that successfully handled local and long-distance syntactic agreement.
We tested the model's predictions in a behavioral experiment where humans detected violations in number agreement in sentences with systematic variations in the singular/plural status of multiple nouns.
arXiv Detail & Related papers (2020-06-19T12:00:05Z)
- Emergence of Separable Manifolds in Deep Language Representations [26.002842878797765]
Deep neural networks (DNNs) have shown much empirical success in solving perceptual tasks across various cognitive modalities.
Recent studies report considerable similarities between representations extracted from task-optimized DNNs and neural populations in the brain.
DNNs have subsequently become a popular model class to infer computational principles underlying complex cognitive functions.
arXiv Detail & Related papers (2020-06-01T17:23:44Z)
- Information-Theoretic Probing for Linguistic Structure [74.04862204427944]
We propose an information-theoretic operationalization of probing as estimating mutual information.
We evaluate on a set of ten typologically diverse languages often underrepresented in NLP research.
arXiv Detail & Related papers (2020-04-07T01:06:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.