Demystifying Neural Language Models' Insensitivity to Word-Order
- URL: http://arxiv.org/abs/2107.13955v1
- Date: Thu, 29 Jul 2021 13:34:20 GMT
- Title: Demystifying Neural Language Models' Insensitivity to Word-Order
- Authors: Louis Clouatre, Prasanna Parthasarathi, Amal Zouaq, Sarath Chandar
- Abstract summary: We investigate the insensitivity of natural language models to word order by quantifying perturbations.
We find that neural language models rely on the local ordering of tokens more than on their global ordering.
- Score: 7.72780997900827
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent research analyzing the sensitivity of natural language understanding
models to word-order perturbations has shown that state-of-the-art models on
several language tasks may understand text in a way that can seldom be
explained by conventional syntax and semantics. In this paper, we investigate
the insensitivity of natural language models to word order by quantifying
perturbations and analyzing their effect on neural models' performance on
language understanding tasks in the GLUE benchmark. Towards
that end, we propose two metrics - the Direct Neighbour Displacement (DND) and
the Index Displacement Count (IDC) - that score the local and global ordering
of tokens in perturbed texts. We observe that the perturbation functions found
in prior literature affect only the global ordering, while the local ordering
remains relatively unperturbed. We further propose perturbations at the
granularity of sub-words and characters to study the correlation between DND,
IDC, and the performance of neural language models on natural language tasks.
We find that neural language models - pretrained and non-pretrained
Transformers, LSTMs, and convolutional architectures - rely on the local
ordering of tokens more than on their global ordering. The proposed metrics and
suite of perturbations provide a systematic way to study the (in)sensitivity of
neural language understanding models to varying degrees of perturbation.
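The abstract names DND and IDC but does not give their formulas, so the sketch below uses assumed definitions (IDC as the mean absolute index displacement, a global-order score; DND as the fraction of broken token adjacencies, a local-order score) and operates on index permutations rather than real tokenized text; the paper's exact definitions may differ.

```python
import random

def global_shuffle(n, seed=0):
    """Uniformly shuffle positions 0..n-1 (the perturbation common in prior work)."""
    perm = list(range(n))
    random.Random(seed).shuffle(perm)
    return perm

def ngram_shuffle(n, k=3, seed=0):
    """Shuffle k-token chunks: global order breaks, local order inside chunks survives."""
    chunks = [list(range(i, min(i + k, n))) for i in range(0, n, k)]
    random.Random(seed).shuffle(chunks)
    return [i for chunk in chunks for i in chunk]

def neighbour_swaps(n):
    """Swap adjacent positions only: local order breaks while tokens barely move."""
    perm = list(range(n))
    for i in range(0, n - 1, 2):
        perm[i], perm[i + 1] = perm[i + 1], perm[i]
    return perm

def idc(perm):
    """Assumed IDC: mean absolute displacement of each token from its original
    index -- a *global*-ordering score (0 = global order fully kept)."""
    return sum(abs(new - old) for new, old in enumerate(perm)) / len(perm)

def dnd(perm):
    """Assumed DND: fraction of adjacent pairs in the perturbed order whose tokens
    were not adjacent originally -- a *local*-ordering score (0 = local order kept)."""
    broken = sum(abs(perm[i] - perm[i + 1]) != 1 for i in range(len(perm) - 1))
    return broken / (len(perm) - 1)

n = 12
for name, perm in [("full shuffle", global_shuffle(n)),
                   ("3-gram shuffle", ngram_shuffle(n)),
                   ("neighbour swaps", neighbour_swaps(n))]:
    print(f"{name:15s} IDC={idc(perm):5.2f}  DND={dnd(perm):.2f}")
```

Running it shows the contrast the abstract describes: an n-gram shuffle scrambles global order (high IDC) while leaving most local adjacencies intact (low DND), whereas neighbour swaps do the opposite.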
Related papers
- Analysis of Argument Structure Constructions in a Deep Recurrent Language Model [0.0]
We explore the representation and processing of Argument Structure Constructions (ASCs) in a recurrent neural language model.
Our results show that sentence representations form distinct clusters corresponding to the four ASCs across all hidden layers.
This indicates that even a relatively simple, brain-constrained recurrent neural network can effectively differentiate between various construction types.
arXiv Detail & Related papers (2024-08-06T09:27:41Z) - Investigating the Timescales of Language Processing with EEG and Language Models [0.0]
This study explores the temporal dynamics of language processing by examining the alignment between word representations from a pre-trained language model and EEG data.
Using a Temporal Response Function (TRF) model, we investigate how neural activity corresponds to model representations across different layers.
Our analysis reveals patterns in TRFs from distinct layers, highlighting varying contributions to lexical and compositional processing.
arXiv Detail & Related papers (2024-06-28T12:49:27Z) - In-Context Language Learning: Architectures and Algorithms [73.93205821154605]
We study ICL through the lens of a new family of model problems we term in-context language learning (ICLL).
We evaluate a diverse set of neural sequence models on regular ICLL tasks.
arXiv Detail & Related papers (2024-01-23T18:59:21Z) - Towards preserving word order importance through Forced Invalidation [80.33036864442182]
We show that pre-trained language models are insensitive to word order.
We propose Forced Invalidation to help preserve the importance of word order.
Our experiments demonstrate that Forced Invalidation significantly improves the sensitivity of the models to word order.
arXiv Detail & Related papers (2023-04-11T13:42:10Z) - Neural-Symbolic Recursive Machine for Systematic Generalization [113.22455566135757]
We introduce the Neural-Symbolic Recursive Machine (NSR), whose core is a Grounded Symbol System (GSS).
NSR integrates neural perception, syntactic parsing, and semantic reasoning.
We evaluate NSR's efficacy across four challenging benchmarks designed to probe systematic generalization capabilities.
arXiv Detail & Related papers (2022-10-04T13:27:38Z) - Shaking Syntactic Trees on the Sesame Street: Multilingual Probing with
Controllable Perturbations [2.041108289731398]
Recent research has opened a new experimental field centered on text perturbations, revealing that shuffled word order has little to no impact on the downstream performance of Transformer-based language models.
arXiv Detail & Related papers (2021-09-28T20:15:29Z) - Lexicon Learning for Few-Shot Neural Sequence Modeling [32.49689188570872]
We present a lexical translation mechanism that generalizes existing copy mechanisms to incorporate learned, decontextualized, token-level translation rules.
It improves systematic generalization on a diverse set of sequence modeling tasks drawn from cognitive science, formal semantics, and machine translation.
arXiv Detail & Related papers (2021-06-07T22:35:04Z) - Unnatural Language Inference [48.45003475966808]
We find that state-of-the-art NLI models, such as RoBERTa and BART, are invariant to, and sometimes even perform better on, examples with randomly reordered words (a minimal probe of this effect is sketched after this list).
Our findings call into question the idea that our natural language understanding models, and the tasks used for measuring their progress, genuinely require a human-like understanding of syntax.
arXiv Detail & Related papers (2020-12-30T20:40:48Z) - Mechanisms for Handling Nested Dependencies in Neural-Network Language
Models and Humans [75.15855405318855]
We studied whether a modern artificial neural network trained with "deep learning" methods mimics a central aspect of human sentence processing.
Although the network was solely trained to predict the next word in a large corpus, analysis showed the emergence of specialized units that successfully handled local and long-distance syntactic agreement.
We tested the model's predictions in a behavioral experiment where humans detected violations in number agreement in sentences with systematic variations in the singular/plural status of multiple nouns.
arXiv Detail & Related papers (2020-06-19T12:00:05Z) - Logical Natural Language Generation from Open-Domain Tables [107.04385677577862]
We propose a new task where a model is tasked with generating natural language statements that can be logically entailed by the facts.
To facilitate the study of the proposed logical NLG problem, we use the existing TabFact dataset (Chen et al., 2019), which features a wide range of logical/symbolic inferences.
The new task poses challenges to the existing monotonic generation frameworks due to the mismatch between sequence order and logical order.
arXiv Detail & Related papers (2020-04-22T06:03:10Z)
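To make the word-order probes above concrete, here is a minimal sketch in the spirit of the Unnatural Language Inference item: it feeds an off-the-shelf NLI model a premise-hypothesis pair before and after shuffling the hypothesis. It assumes the Hugging Face `transformers` library and the public `roberta-large-mnli` checkpoint, and is a simplified stand-in for, not a reproduction of, the cited paper's protocol.

```python
import random
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumed setup: a public MNLI checkpoint from the Hugging Face Hub.
model_name = "roberta-large-mnli"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

premise = "A soccer game with multiple males playing."
hypothesis = "Some men are playing a sport."

def shuffle_words(text, seed=0):
    """Randomly reorder the words of a sentence (a global word-order perturbation)."""
    words = text.split()
    random.Random(seed).shuffle(words)
    return " ".join(words)

# Compare the model's verdict on the intact vs. shuffled hypothesis.
for hyp in (hypothesis, shuffle_words(hypothesis)):
    batch = tok(premise, hyp, return_tensors="pt")
    with torch.no_grad():
        probs = model(**batch).logits.softmax(-1).squeeze()
    label = model.config.id2label[int(probs.argmax())]
    print(f"{hyp!r} -> {label} ({probs.max():.2f})")
```

If the findings above hold, the predicted label (and often its confidence) changes little between the two runs, despite the shuffled hypothesis being ungrammatical.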
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.