Can Transformers Reason in Fragments of Natural Language?
- URL: http://arxiv.org/abs/2211.05417v1
- Date: Thu, 10 Nov 2022 08:46:53 GMT
- Title: Can Transformers Reason in Fragments of Natural Language?
- Authors: Viktor Schlegel, Kamen V. Pavlov, Ian Pratt-Hartmann
- Abstract summary: State-of-the-art deep-learning-based approaches to Natural Language Processing (NLP) are credited with various capabilities that involve reasoning with natural language texts.
We study the detection of formally valid inferences in controlled fragments of natural language for which the satisfiability problem becomes increasingly complex.
- Score: 2.1485350418225244
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: State-of-the-art deep-learning-based approaches to Natural Language
Processing (NLP) are credited with various capabilities that involve reasoning
with natural language texts. In this paper we carry out a large-scale empirical
study investigating the detection of formally valid inferences in controlled
fragments of natural language for which the satisfiability problem becomes
increasingly complex. We find that, while transformer-based language models
perform surprisingly well in these scenarios, a deeper analysis reveals that
they appear to overfit to superficial patterns in the data rather than
acquiring the logical principles governing the reasoning in these fragments.
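To make the task concrete, here is a minimal, self-contained sketch (not the authors' code or data) of what a validity-labelled inference instance in a syllogistic fragment might look like: the sentence encoding, the `holds` and `valid` helpers, and the brute-force counter-model search over a small domain are all illustrative assumptions.

```python
from itertools import product

# Illustrative encoding of syllogistic sentences as (form, subject, predicate),
# e.g. ("all", "artist", "beekeeper") reads "All artists are beekeepers".
def holds(sentence, extensions):
    kind, p, q = sentence
    P, Q = extensions[p], extensions[q]
    if kind == "all":
        return P <= Q            # every P is a Q
    if kind == "some":
        return bool(P & Q)       # some P is a Q
    if kind == "no":
        return not (P & Q)       # no P is a Q
    if kind == "some-not":
        return bool(P - Q)       # some P is not a Q
    raise ValueError(kind)

def powerset(universe):
    items = sorted(universe)
    for mask in range(1 << len(items)):
        yield frozenset(items[i] for i in range(len(items)) if mask >> i & 1)

def valid(premises, conclusion, predicates, domain_size=3):
    """True iff no model over a domain of `domain_size` satisfies all premises
    while falsifying the conclusion (brute-force counter-model search; small
    domains are enough for these toy syllogisms)."""
    subsets = list(powerset(range(domain_size)))
    for choice in product(subsets, repeat=len(predicates)):
        extensions = dict(zip(predicates, choice))
        if all(holds(s, extensions) for s in premises) and not holds(conclusion, extensions):
            return False  # found a counter-model, so the inference is invalid
    return True

preds = ["artist", "beekeeper", "carpenter"]
premises = [("all", "artist", "beekeeper"), ("all", "beekeeper", "carpenter")]
print(valid(premises, ("all", "artist", "carpenter"), preds))   # True: formally valid
print(valid(premises, ("some", "carpenter", "artist"), preds))  # False: counter-model with no artists
```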
Related papers
- Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI have made it possible to mitigate limitations by leveraging improved explanations for Transformers.
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
arXiv Detail & Related papers (2024-05-10T17:11:31Z)
- Empower Nested Boolean Logic via Self-Supervised Curriculum Learning [67.46052028752327]
We find that pre-trained language models, including large language models, behave only like a random selector when faced with multi-nested Boolean logic.
To empower language models with this fundamental capability, this paper proposes a new self-supervised learning method, Curriculum Logical Reasoning (CLR).
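Purely as an illustration of what multi-nested Boolean logic and a depth-ordered curriculum could look like (the generator below is a hypothetical sketch, not the paper's CLR pipeline or data), one can produce statements of increasing nesting depth together with their truth values:

```python
import random

# Hypothetical generator of nested Boolean statements; deeper recursion gives
# harder, more deeply nested logic, which can be ordered into a curriculum.
def nested_statement(depth, rng):
    if depth == 0:
        value = rng.choice([True, False])
        return ("true" if value else "false"), value
    op = rng.choice(["and", "or", "not"])
    if op == "not":
        text, value = nested_statement(depth - 1, rng)
        return f"it is not the case that ({text})", not value
    left_text, left_value = nested_statement(depth - 1, rng)
    right_text, right_value = nested_statement(depth - 1, rng)
    if op == "and":
        return f"({left_text}) and ({right_text})", left_value and right_value
    return f"({left_text}) or ({right_text})", left_value or right_value

rng = random.Random(0)
for depth in range(1, 4):  # curriculum: shallow nesting first, then deeper
    text, label = nested_statement(depth, rng)
    print(depth, label, text)
```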
arXiv Detail & Related papers (2023-10-09T06:54:02Z)
- Evaluating Transformer's Ability to Learn Mildly Context-Sensitive Languages [6.227678387562755]
Recent studies suggest that self-attention is theoretically limited in learning even some regular and context-free languages.
We test the Transformer's ability to learn mildly context-sensitive languages of varying complexities.
Our analyses show that the learned self-attention patterns and representations modeled dependency relations and demonstrated counting behavior.
arXiv Detail & Related papers (2023-09-02T08:17:29Z)
- Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not represent natural language semantics well.
arXiv Detail & Related papers (2022-10-14T02:35:19Z)
- Color Overmodification Emerges from Data-Driven Learning and Pragmatic Reasoning [53.088796874029974]
We show that speakers' referential expressions depart from communicative ideals in ways that help illuminate the nature of pragmatic language use.
By adopting neural networks as learning agents, we show that overmodification is more likely with environmental features that are infrequent or salient.
arXiv Detail & Related papers (2022-05-18T18:42:43Z)
- AbductionRules: Training Transformers to Explain Unexpected Inputs [2.2630663834223763]
We present AbductionRules, a group of datasets designed to train and test generalisable abduction over natural-language knowledge bases.
We discuss the viability of this approach to abductive reasoning and ways in which it may be improved in future work.
arXiv Detail & Related papers (2022-03-23T04:18:30Z)
- A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z)
- Uncovering More Shallow Heuristics: Probing the Natural Language Inference Capacities of Transformer-Based Pre-Trained Language Models Using Syllogistic Patterns [9.031827448667086]
We explore the shallow heuristics used by transformer-based pre-trained language models (PLMs) that are fine-tuned for natural language inference (NLI).
We find evidence that the models rely heavily on certain shallow heuristics, picking up on symmetries and asymmetries between premise and hypothesis.
arXiv Detail & Related papers (2022-01-19T14:15:41Z)
- Pushing the Limits of Rule Reasoning in Transformers through Natural Language Satisfiability [30.01308882849197]
We propose a new methodology for creating challenging algorithmic reasoning datasets.
The key idea is to draw insights from empirical sampling of hard propositional SAT problems and from complexity-theoretic studies of language.
We find that current transformers, given sufficient training data, are surprisingly robust at solving the resulting NLSat problems.
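As a hedged illustration of that methodology (the names, templates, and clause/variable ratio below are assumptions for the sketch, not the authors' NLSat generator), one could sample random 3-SAT instances near the empirically hard ratio and verbalize each clause as an English sentence:

```python
import random

# Hypothetical sketch: random 3-SAT near the hard clause/variable ratio,
# with each clause rendered as a natural-language disjunction.
PEOPLE = ["alice", "bob", "carol", "dave", "erin"]

def random_3sat(num_vars, ratio=4.26, rng=None):
    rng = rng or random.Random()
    num_clauses = round(ratio * num_vars)
    clauses = []
    for _ in range(num_clauses):
        chosen = rng.sample(range(num_vars), 3)          # three distinct variables
        clauses.append([(v, rng.choice([True, False])) for v in chosen])
    return clauses

def verbalize(clause):
    parts = []
    for var, positive in clause:
        name = PEOPLE[var % len(PEOPLE)]
        parts.append(f"{name} is {'happy' if positive else 'not happy'}")
    return "Either " + ", or ".join(parts) + "."

instance = random_3sat(num_vars=5, rng=random.Random(0))
for clause in instance[:3]:
    print(verbalize(clause))
```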
arXiv Detail & Related papers (2021-12-16T17:47:20Z)
- On the Transferability of Neural Models of Morphological Analogies [7.89271130004391]
In this paper, we focus on morphological tasks and we propose a deep learning approach to detect morphological analogies.
We present an empirical study of how our framework transfers across languages, which highlights interesting similarities and differences between them.
In view of these results, we also discuss the possibility of building a multilingual morphological model.
arXiv Detail & Related papers (2021-08-09T11:08:33Z)
- Unnatural Language Inference [48.45003475966808]
We find that state-of-the-art NLI models, such as RoBERTa and BART, are invariant to, and sometimes even perform better on, examples with randomly reordered words.
Our findings call into question the idea that our natural language understanding models, and the tasks used for measuring their progress, genuinely require a human-like understanding of syntax.
arXiv Detail & Related papers (2020-12-30T20:40:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.