Studying word order through iterative shuffling
- URL: http://arxiv.org/abs/2109.04867v1
- Date: Fri, 10 Sep 2021 13:27:06 GMT
- Title: Studying word order through iterative shuffling
- Authors: Nikolay Malkin, Sameera Lanka, Pranav Goel, Nebojsa Jojic
- Abstract summary: We test the hypothesis that word order encodes meaning essential to performing NLP benchmark tasks and refute it in many cases: the words in a sentence or phrase can rarely be permuted to form a phrase carrying substantially different information.
We use IBIS, a novel, efficient procedure that finds the ordering of a bag of words having the highest likelihood under a fixed language model.
We discuss how shuffling inference procedures such as IBIS can benefit language modeling and constrained generation.
- Score: 14.530986799844873
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As neural language models approach human performance on NLP benchmark tasks,
their advances are widely seen as evidence of an increasingly complex
understanding of syntax. This view rests upon a hypothesis that has not yet
been empirically tested: that word order encodes meaning essential to
performing these tasks. We refute this hypothesis in many cases: in the GLUE
suite and in various genres of English text, the words in a sentence or phrase
can rarely be permuted to form a phrase carrying substantially different
information. Our surprising result relies on inference by iterative shuffling
(IBIS), a novel, efficient procedure that finds the ordering of a bag of words
having the highest likelihood under a fixed language model. IBIS can use any
black-box model without additional training and is superior to existing word
ordering algorithms. Coalescing our findings, we discuss how shuffling
inference procedures such as IBIS can benefit language modeling and constrained
generation.
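The procedure sketched in the abstract (searching over orderings of a bag of words and scoring each candidate with a fixed, black-box language model) can be illustrated with a few lines of code. The snippet below is a minimal hill-climbing stand-in, not the authors' IBIS algorithm itself; the choice of GPT-2 as the scorer via the Hugging Face transformers library, and all function names, are illustrative assumptions.

```python
# Minimal sketch of shuffling inference: hill-climb over word orderings,
# scoring each candidate with a fixed, black-box language model.
# NOT the paper's IBIS procedure; GPT-2 as the scorer is an assumption.
import random

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def log_likelihood(words):
    """Total log-probability of the ordered word sequence under the fixed LM."""
    ids = tokenizer(" ".join(words), return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean NLL per predicted token
    return -loss.item() * (ids.shape[1] - 1)

def iterative_shuffle(bag, rounds=200, seed=0):
    """Start from a random order; keep a proposed swap only if it raises the LM score."""
    rng = random.Random(seed)
    order = list(bag)
    rng.shuffle(order)
    best = log_likelihood(order)
    for _ in range(rounds):
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
        score = log_likelihood(order)
        if score > best:
            best = score  # keep the improved ordering
        else:
            order[i], order[j] = order[j], order[i]  # revert the swap
    return order, best

print(iterative_shuffle("dog the lazy over jumps fox brown quick the".split()))
```

A random-swap hill climb like this often recovers a fluent ordering for short phrases, but it is only a stand-in; what carries over from the abstract is the interface, in which the language model is treated as a fixed black box that requires no additional training.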
Related papers
- Learning Interpretable Queries for Explainable Image Classification with Information Pursuit [18.089603786027503]
Information Pursuit (IP) is an explainable prediction algorithm that greedily selects a sequence of interpretable queries about the data.
This paper introduces a novel approach: learning a dictionary of interpretable queries directly from the dataset.
arXiv Detail & Related papers (2023-12-16T21:43:07Z)
- Syntax and Semantics Meet in the "Middle": Probing the Syntax-Semantics Interface of LMs Through Agentivity [68.8204255655161]
We present the semantic notion of agentivity as a case study for probing such interactions.
This suggests that LMs may serve as useful tools for linguistic annotation, theory testing, and discovery.
arXiv Detail & Related papers (2023-05-29T16:24:01Z)
- Towards preserving word order importance through Forced Invalidation [80.33036864442182]
We show that pre-trained language models are insensitive to word order.
We propose Forced Invalidation to help preserve the importance of word order.
Our experiments demonstrate that Forced Invalidation significantly improves the sensitivity of the models to word order.
arXiv Detail & Related papers (2023-04-11T13:42:10Z)
- Word Order Does Matter (And Shuffled Language Models Know It) [9.990431777927421]
Recent studies have shown that language models pretrained and/or fine-tuned on randomly permuted sentences exhibit competitive performance on GLUE.
We investigate what position embeddings learned from shuffled text encode, showing that these models retain information pertaining to the original, naturalistic word order.
arXiv Detail & Related papers (2022-03-21T14:10:15Z)
- Dict-BERT: Enhancing Language Model Pre-training with Dictionary [42.0998323292348]
Pre-trained language models (PLMs) aim to learn universal language representations by conducting self-supervised training tasks on large-scale corpora.
In this work, we focus on enhancing language model pre-training by leveraging definitions of rare words in dictionaries.
We propose two novel self-supervised pre-training tasks based on word- and sentence-level alignment between the input text sequence and rare word definitions.
arXiv Detail & Related papers (2021-10-13T04:29:14Z)
- Unnatural Language Inference [48.45003475966808]
We find that state-of-the-art NLI models, such as RoBERTa and BART, are invariant to, and sometimes even perform better on, examples with randomly reordered words; a minimal version of this shuffled-input check is sketched after this list.
Our findings call into question the idea that our natural language understanding models, and the tasks used for measuring their progress, genuinely require a human-like understanding of syntax.
arXiv Detail & Related papers (2020-12-30T20:40:48Z)
- Words aren't enough, their order matters: On the Robustness of Grounding Visual Referring Expressions [87.33156149634392]
We critically examine RefCOCOg, a standard benchmark for visual referring expression recognition.
We show that 83.7% of test instances do not require reasoning on linguistic structure.
We propose two methods, one based on contrastive learning and the other based on multi-task learning, to increase the robustness of ViLBERT.
arXiv Detail & Related papers (2020-05-04T17:09:15Z)
- Information-Theoretic Probing for Linguistic Structure [74.04862204427944]
We propose an information-theoretic operationalization of probing as estimating mutual information.
We evaluate on a set of ten typologically diverse languages often underrepresented in NLP research.
arXiv Detail & Related papers (2020-04-07T01:06:36Z)
- On the Importance of Word Order Information in Cross-lingual Sequence Labeling [80.65425412067464]
Cross-lingual models that overfit to the word order of the source language may fail to handle target languages with different word orders.
We investigate whether making models insensitive to the word order of the source language can improve the adaptation performance in target languages.
arXiv Detail & Related papers (2020-01-30T03:35:44Z)
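The order-invariance finding from the "Unnatural Language Inference" entry above can be checked in miniature by shuffling the words of a hypothesis and seeing whether an NLI model's label changes. The sketch below is an illustration only: the publicly available roberta-large-mnli checkpoint and the premise/hypothesis pair are assumptions, not from the paper.

```python
# Shuffle a hypothesis's words and compare NLI predictions before and after.
# Model choice (roberta-large-mnli) and the example pair are illustrative assumptions.
import random

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name).eval()

def predict(premise, hypothesis):
    """Return the model's NLI label for a premise/hypothesis pair."""
    inputs = tokenizer(premise, hypothesis, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return model.config.id2label[logits.argmax(-1).item()]

premise = "A man is playing a guitar on stage."
hypothesis = "A musician is performing."

random.seed(0)
words = hypothesis.rstrip(".").split()
shuffled = " ".join(random.sample(words, len(words))) + "."

print("original:", hypothesis, "->", predict(premise, hypothesis))
print("shuffled:", shuffled, "->", predict(premise, shuffled))
# If the two labels agree, this single example behaves the way the paper reports at scale.
```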
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the accuracy of this automatically generated information and is not responsible for any consequences of its use.