Reasoning in Transformers - Mitigating Spurious Correlations and Reasoning Shortcuts
- URL: http://arxiv.org/abs/2403.11314v1
- Date: Sun, 17 Mar 2024 19:32:12 GMT
- Title: Reasoning in Transformers - Mitigating Spurious Correlations and Reasoning Shortcuts
- Authors: Daniel Enström, Viktor Kjellberg, Moa Johansson
- Abstract summary: Transformer language models are neural networks used for a wide variety of tasks concerning natural language.
We investigate to what extent transformers can be trained to approximate reasoning in propositional logic.
We find that SIP-BART succeeds in avoiding reasoning shortcuts, while WP-BART does not.
- Score: 1.024113475677323
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transformer language models are neural networks used for a wide variety of tasks concerning natural language, including some that also require logical reasoning. However, a transformer model may easily learn spurious patterns in the data, short-circuiting actual reasoning. In this paper we investigate to what extent transformers can be trained to (a) approximate reasoning in propositional logic while (b) avoiding known reasoning shortcuts caused by spurious correlations in the training data. To do so, we use a dataset with a known spurious correlation between an example's truth value and, e.g., the number of rules in the problem. We augment the data with proofs and train two models: a generative transformer, WP-BART, trained on problems and their whole proofs, and a neuro-symbolic model, SIP-BART, trained on individual proof steps and combining the generative transformer model BART with a symbolic proof checker. We find that SIP-BART succeeds in avoiding reasoning shortcuts, while WP-BART does not. For SIP-BART, we then identify a few remaining reasoning errors, not previously described in the literature, that arise from using a pre-trained language model. These are qualitatively analysed to create a taxonomy of four types of additional pitfalls.
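The step-wise, generate-then-check loop the abstract describes is easy to picture in code. Below is a minimal sketch, assuming a toy rule format and a deterministic `propose_step` stub in place of the fine-tuned BART generator; the helper names and data structures are illustrative, not the paper's implementation.

```python
# Toy version of a SIP-BART-style loop: a generator proposes one proof
# step at a time, and a symbolic checker accepts or rejects it.
# The rule format and all helpers here are illustrative assumptions.

facts = {"A", "B"}                          # atoms already known true
rules = [({"A", "B"}, "C"), ({"C"}, "D")]   # (premises, conclusion) pairs

def check_step(facts, rules, step):
    """Symbolic proof checker: a step is valid iff it instantiates a
    rule and all of its premises are already proven."""
    premises, conclusion = step
    matches_rule = any(premises == p and conclusion == c for p, c in rules)
    return matches_rule and premises <= facts

def propose_step(facts, rules):
    """Stand-in for the neural generator (BART in the paper): here we
    deterministically return the first applicable, novel rule."""
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            return (premises, conclusion)
    return None

def prove(facts, rules, goal, max_steps=10):
    facts = set(facts)
    for _ in range(max_steps):
        if goal in facts:
            return True
        step = propose_step(facts, rules)
        # Reject any proposed step the symbolic checker cannot verify.
        if step is None or not check_step(facts, rules, step):
            return False
        facts.add(step[1])
    return False

print(prove(facts, rules, "D"))  # True: A, B => C, then C => D
```

The point of the split is the one the abstract makes: because every generated step must pass `check_step`, a shortcut the generator picks up (e.g. keying on the number of rules) cannot smuggle in an unverified conclusion, whereas a whole-proof generator like WP-BART has no such gate.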
Related papers
- Are Transformers Able to Reason by Connecting Separated Knowledge in Training Data? [55.90575874130038]
Humans exhibit remarkable compositional reasoning by integrating knowledge from various sources.
We introduce a synthetic learning task (FTCT) to validate the potential of Transformers in replicating this skill.
We find that few-shot Chain-of-Thought prompting enables Transformers to perform compositional reasoning on FTCT.
arXiv Detail & Related papers (2025-01-27T08:34:38Z)
- B-cosification: Transforming Deep Neural Networks to be Inherently Interpretable [53.848005910548565]
'B-cosification' is a novel approach to transforming existing pre-trained models into inherently interpretable ones.
We find that B-cosification can yield models that are on par with B-cos models trained from scratch in terms of interpretability.
arXiv Detail & Related papers (2024-11-01T16:28:11Z)
- Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization [22.033370572209744]
We study whether transformers can learn to implicitly reason over parametric knowledge.
We focus on two representative reasoning types, composition and comparison.
We find that transformers can learn implicit reasoning, but only through grokking.
arXiv Detail & Related papers (2024-05-23T21:42:19Z)
- When can transformers reason with abstract symbols? [25.63285482210457]
We prove that for any relational reasoning task in a large family of tasks, transformers learn the abstract relations and generalize to the test set.
This is in contrast to classical fully-connected networks, which we prove fail to learn to reason.
arXiv Detail & Related papers (2023-10-15T06:45:38Z)
- All Roads Lead to Rome? Exploring the Invariance of Transformers' Representations [69.3461199976959]
We propose a model based on invertible neural networks, BERT-INN, to learn the Bijection Hypothesis.
We show the advantage of BERT-INN both theoretically and through extensive experiments.
arXiv Detail & Related papers (2023-05-23T22:30:43Z)
- Analyzing Transformers in Embedding Space [59.434807802802105]
We present a theoretical analysis where all parameters of a trained Transformer are interpreted by projecting them into the embedding space.
We show that parameters of both pretrained and fine-tuned models can be interpreted in embedding space.
Our findings open the door to interpretation methods that, at least in part, abstract away from model specifics and operate in the embedding space only.
arXiv Detail & Related papers (2022-09-06T14:36:57Z)
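The projection idea in the entry above can be illustrated in a few lines. This is a minimal sketch under assumed toy data (random embedding matrix, random parameter vector, hypothetical `top_tokens` helper), not the paper's actual method:

```python
# Interpret a parameter vector by scoring it against the vocabulary
# embeddings: tokens whose embeddings align with the vector "explain" it.
# All values here are random toy data for illustration.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["cat", "dog", "sat", "ran", "mat"]
d = 8                                    # toy hidden size
E = rng.normal(size=(len(vocab), d))     # embedding matrix, shape (V, d)
w = rng.normal(size=d)                   # some trained parameter vector

def top_tokens(w, E, vocab, k=3):
    """Project w into the embedding space and return the k tokens
    whose embeddings have the largest dot product with it."""
    scores = E @ w
    return [vocab[i] for i in np.argsort(scores)[::-1][:k]]

print(top_tokens(w, E, vocab))
```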
- Unnatural Language Inference [48.45003475966808]
We find that state-of-the-art NLI models, such as RoBERTa and BART, are invariant to, and sometimes even perform better on, examples with randomly reordered words.
Our findings call into question the idea that our natural language understanding models, and the tasks used for measuring their progress, genuinely require a human-like understanding of syntax.
arXiv Detail & Related papers (2020-12-30T20:40:48Z)
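The invariance finding above suggests a simple probe: permute the words of an input and check whether the model's prediction moves. In the sketch below, `predict` is a placeholder assumption standing in for a real NLI model; the bag-of-words stub is order-invariant by construction, which is exactly the failure mode the probe is meant to expose:

```python
# Probe for word-order sensitivity: compare predictions on an original
# and a shuffled hypothesis. `predict` is a placeholder for a real NLI
# model; this bag-of-words stub ignores order entirely.
import random

def predict(premise: str, hypothesis: str) -> str:
    # Placeholder rule: "entailment" iff every hypothesis word occurs
    # in the premise -- a set comparison, blind to word order.
    return "entailment" if set(hypothesis.split()) <= set(premise.split()) else "neutral"

def shuffled(sentence: str, seed: int = 0) -> str:
    words = sentence.split()
    random.Random(seed).shuffle(words)
    return " ".join(words)

premise = "the cat sat on the mat"
hypothesis = "the cat sat"
before = predict(premise, hypothesis)
after = predict(premise, shuffled(hypothesis))
print(before, after, "order-invariant" if before == after else "order-sensitive")
```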
- Multi-Step Inference for Reasoning Over Paragraphs [95.91527524872832]
Complex reasoning over text requires understanding and chaining together free-form predicates and logical connectives.
We present a compositional model reminiscent of neural module networks that can perform chained logical reasoning.
arXiv Detail & Related papers (2020-04-06T21:12:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.