Emergence of Separable Manifolds in Deep Language Representations
- URL: http://arxiv.org/abs/2006.01095v4
- Date: Wed, 8 Jul 2020 22:10:19 GMT
- Title: Emergence of Separable Manifolds in Deep Language Representations
- Authors: Jonathan Mamou, Hang Le, Miguel Del Rio, Cory Stephenson, Hanlin Tang,
Yoon Kim, SueYeon Chung
- Abstract summary: Deep neural networks (DNNs) have shown much empirical success in solving perceptual tasks across various cognitive modalities.
Recent studies report considerable similarities between representations extracted from task-optimized DNNs and neural populations in the brain.
DNNs have subsequently become a popular model class to infer computational principles underlying complex cognitive functions.
- Score: 26.002842878797765
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNNs) have shown much empirical success in solving
perceptual tasks across various cognitive modalities. While they are only
loosely inspired by the biological brain, recent studies report considerable
similarities between representations extracted from task-optimized DNNs and
neural populations in the brain. DNNs have subsequently become a popular model
class to infer computational principles underlying complex cognitive functions,
and in turn, they have also emerged as a natural testbed for applying methods
originally developed to probe information in neural populations. In this work,
we utilize mean-field theoretic manifold analysis, a recent technique from
computational neuroscience that connects geometry of feature representations
with linear separability of classes, to analyze language representations from
large-scale contextual embedding models. We explore representations from
different model families (BERT, RoBERTa, GPT, etc.) and find evidence for
emergence of linguistic manifolds across layer depth (e.g., manifolds for
part-of-speech tags), especially in ambiguous data (i.e, words with multiple
part-of-speech tags, or part-of-speech classes including many words). In
addition, we find that the emergence of linear separability in these manifolds
is driven by a combined reduction of manifolds' radius, dimensionality and
inter-manifold correlations.
Related papers
- Brain-like Functional Organization within Large Language Models [58.93629121400745]
The human brain has long inspired the pursuit of artificial intelligence (AI)
Recent neuroimaging studies provide compelling evidence of alignment between the computational representation of artificial neural networks (ANNs) and the neural responses of the human brain to stimuli.
In this study, we bridge this gap by directly coupling sub-groups of artificial neurons with functional brain networks (FBNs)
This framework links the AN sub-groups to FBNs, enabling the delineation of brain-like functional organization within large language models (LLMs)
arXiv Detail & Related papers (2024-10-25T13:15:17Z) - Analysis of Argument Structure Constructions in a Deep Recurrent Language Model [0.0]
We explore the representation and processing of Argument Structure Constructions (ASCs) in a recurrent neural language model.
Our results show that sentence representations form distinct clusters corresponding to the four ASCs across all hidden layers.
This indicates that even a relatively simple, brain-constrained recurrent neural network can effectively differentiate between various construction types.
arXiv Detail & Related papers (2024-08-06T09:27:41Z) - Sharing Matters: Analysing Neurons Across Languages and Tasks in LLMs [70.3132264719438]
We aim to fill the research gap by examining how neuron activation is shared across tasks and languages.
We classify neurons into four distinct categories based on their responses to a specific input across different languages.
Our analysis reveals the following insights: (i) the patterns of neuron sharing are significantly affected by the characteristics of tasks and examples; (ii) neuron sharing does not fully correspond with language similarity; (iii) shared neurons play a vital role in generating responses, especially those shared across all languages.
arXiv Detail & Related papers (2024-06-13T16:04:11Z) - Experimental Observations of the Topology of Convolutional Neural
Network Activations [2.4235626091331737]
Topological data analysis provides compact, noise-robust representations of complex structures.
Deep neural networks (DNNs) learn millions of parameters associated with a series of transformations defined by the model architecture.
In this paper, we apply cutting edge techniques from TDA with the goal of gaining insight into the interpretability of convolutional neural networks used for image classification.
arXiv Detail & Related papers (2022-12-01T02:05:44Z) - Connecting Neural Response measurements & Computational Models of
language: a non-comprehensive guide [5.523143941738335]
Recent advances in language modelling and in neuroimaging promise potential improvements in the investigation of language's neurobiology.
This survey traces a line from early research linking Event Related Potentials and complexity measures derived from simple language models to contemporary studies employing Artificial Neural Network models trained on large corpora.
arXiv Detail & Related papers (2022-03-10T11:24:54Z) - Model-based analysis of brain activity reveals the hierarchy of language
in 305 subjects [82.81964713263483]
A popular approach to decompose the neural bases of language consists in correlating, across individuals, the brain responses to different stimuli.
Here, we show that a model-based approach can reach equivalent results within subjects exposed to natural stimuli.
arXiv Detail & Related papers (2021-10-12T15:30:21Z) - Low-Dimensional Structure in the Space of Language Representations is
Reflected in Brain Responses [62.197912623223964]
We show a low-dimensional structure where language models and translation models smoothly interpolate between word embeddings, syntactic and semantic tasks, and future word embeddings.
We find that this representation embedding can predict how well each individual feature space maps to human brain responses to natural language stimuli recorded using fMRI.
This suggests that the embedding captures some part of the brain's natural language representation structure.
arXiv Detail & Related papers (2021-06-09T22:59:12Z) - Does injecting linguistic structure into language models lead to better
alignment with brain recordings? [13.880819301385854]
We evaluate whether language models align better with brain recordings if their attention is biased by annotations from syntactic or semantic formalisms.
Our proposed approach enables the evaluation of more targeted hypotheses about the composition of meaning in the brain.
arXiv Detail & Related papers (2021-01-29T14:42:02Z) - Analyzing Individual Neurons in Pre-trained Language Models [41.07850306314594]
We find small subsets of neurons to predict linguistic tasks, with lower level tasks localized in fewer neurons, compared to higher level task of predicting syntax.
For example, we found neurons in XLNet to be more localized and disjoint when predicting properties compared to BERT and others, where they are more distributed and coupled.
arXiv Detail & Related papers (2020-10-06T13:17:38Z) - Compositional Explanations of Neurons [52.71742655312625]
We describe a procedure for explaining neurons in deep representations by identifying compositional logical concepts.
We use this procedure to answer several questions on interpretability in models for vision and natural language processing.
arXiv Detail & Related papers (2020-06-24T20:37:05Z) - Mechanisms for Handling Nested Dependencies in Neural-Network Language
Models and Humans [75.15855405318855]
We studied whether a modern artificial neural network trained with "deep learning" methods mimics a central aspect of human sentence processing.
Although the network was solely trained to predict the next word in a large corpus, analysis showed the emergence of specialized units that successfully handled local and long-distance syntactic agreement.
We tested the model's predictions in a behavioral experiment where humans detected violations in number agreement in sentences with systematic variations in the singular/plural status of multiple nouns.
arXiv Detail & Related papers (2020-06-19T12:00:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.