Pushing the Limits of Rule Reasoning in Transformers through Natural Language Satisfiability
- URL: http://arxiv.org/abs/2112.09054v1
- Date: Thu, 16 Dec 2021 17:47:20 GMT
- Title: Pushing the Limits of Rule Reasoning in Transformers through Natural Language Satisfiability
- Authors: Kyle Richardson and Ashish Sabharwal
- Abstract summary: We propose a new methodology for creating challenging algorithmic reasoning datasets.
The key idea is to draw insights from empirical sampling of hard propositional SAT problems and from complexity-theoretic studies of language.
We find that current transformers, given sufficient training data, are surprisingly robust at solving the resulting NLSat problems.
- Score: 30.01308882849197
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Investigating the reasoning abilities of transformer models, and discovering
new challenging tasks for them, has been a topic of much interest. Recent
studies have found these models to be surprisingly strong at performing
deductive reasoning over formal logical theories expressed in natural language.
A shortcoming of these studies, however, is that they do not take into account
that logical theories, when sampled uniformly at random, do not necessarily
lead to hard instances. We propose a new methodology for creating challenging
algorithmic reasoning datasets that focus on natural language satisfiability
(NLSat) problems. The key idea is to draw insights from empirical sampling of
hard propositional SAT problems and from complexity-theoretic studies of
language. This methodology allows us to distinguish easy from hard instances,
and to systematically increase the complexity of existing reasoning benchmarks
such as RuleTaker. We find that current transformers, given sufficient training
data, are surprisingly robust at solving the resulting NLSat problems of
substantially increased difficulty. They also exhibit some degree of
scale-invariance - the ability to generalize to problems of larger size and
scope. Our results, however, reveal important limitations too: a careful
sampling of training data is crucial for building models that generalize to
larger problems, and transformer models' limited scale-invariance suggests they
are far from learning robust deductive reasoning algorithms.
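To make the sampling idea concrete, here is a minimal sketch of the general recipe the abstract describes: sample random 3-SAT formulas near the empirically hardest regime (a clause-to-variable ratio of roughly 4.26, the well-known random 3-SAT phase transition) and verbalize them as English statements. This is an illustration under assumed conventions, not the authors' actual generation pipeline; the function names, proposition templates, and verbalization style are placeholders.

```python
import random

def sample_hard_3sat(n_vars, ratio=4.26, seed=0):
    """Sample a random 3-SAT formula near the satisfiability phase transition
    (clause/variable ratio ~4.26), where random instances are empirically
    hardest on average. Clauses are lists of signed variable indices
    (positive = plain literal, negative = negated literal)."""
    rng = random.Random(seed)
    n_clauses = round(ratio * n_vars)
    clauses = []
    for _ in range(n_clauses):
        chosen = rng.sample(range(1, n_vars + 1), 3)  # 3 distinct variables
        clauses.append([v if rng.random() < 0.5 else -v for v in chosen])
    return clauses

def verbalize(clauses, props):
    """Render each clause as an English 'at least one of ...' statement.
    `props[i-1]` is the natural-language proposition for variable i."""
    sentences = []
    for clause in clauses:
        parts = []
        for lit in clause:
            prop = props[abs(lit) - 1]
            parts.append(prop if lit > 0 else f"it is not the case that {prop}")
        sentences.append("At least one of the following holds: " + "; or ".join(parts) + ".")
    return " ".join(sentences)

# Hypothetical propositions in the style of RuleTaker-like rule bases.
props = ["the cat is blue", "the dog is kind", "the bear is round",
         "the fox is big", "the cow is nice"]
clauses = sample_hard_3sat(n_vars=len(props), seed=7)
print(verbalize(clauses, props))  # NLSat question: is this theory satisfiable?
```

Under this framing, a standard SAT solver's verdict on the underlying formula supplies the gold satisfiable/unsatisfiable label for the verbalized NLSat problem.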
Related papers
- Neuro-symbolic Training for Reasoning over Spatial Language [17.901249830817882]
We propose training language models with neuro-symbolic techniques that can exploit the logical rules of reasoning as constraints.
We focus on a challenging problem of spatial reasoning over text.
arXiv Detail & Related papers (2024-06-19T20:47:36Z)
- Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models [13.532180752491954]
We demonstrate a dramatic breakdown of function and reasoning capabilities of state-of-the-art models trained at the largest available scales.
The breakdown is dramatic, as models show strong fluctuations across even slight problem variations that should not affect problem solving.
We take these initial observations to stimulate urgent re-assessment of the claimed capabilities of the current generation of Large Language Models.
arXiv Detail & Related papers (2024-06-04T07:43:33Z)
- Language Model Cascades: Token-level uncertainty and beyond [65.38515344964647]
Recent advances in language models (LMs) have led to significant improvements in quality on complex NLP tasks.
Cascading offers a simple strategy to achieve more favorable cost-quality tradeoffs.
We show that incorporating token-level uncertainty through learned post-hoc deferral rules can significantly outperform simple aggregation strategies (a minimal cascade sketch appears after this list).
arXiv Detail & Related papers (2024-04-15T21:02:48Z)
- Evaluating Transformer's Ability to Learn Mildly Context-Sensitive Languages [6.227678387562755]
Recent studies suggest that self-attention is theoretically limited in learning even some regular and context-free languages.
We test the Transformer's ability to learn mildly context-sensitive languages of varying complexities.
Our analyses show that the learned self-attention patterns and representations modeled dependency relations and demonstrated counting behavior.
arXiv Detail & Related papers (2023-09-02T08:17:29Z)
- Faith and Fate: Limits of Transformers on Compositionality [109.79516190693415]
We investigate the limits of transformer large language models across three representative compositional tasks.
These tasks require breaking problems down into sub-steps and synthesizing these steps into a precise answer.
Our empirical findings suggest that transformer LLMs solve compositional tasks by reducing multi-step compositional reasoning into linearized subgraph matching.
arXiv Detail & Related papers (2023-05-29T23:24:14Z)
- The No Free Lunch Theorem, Kolmogorov Complexity, and the Role of Inductive Biases in Machine Learning [80.1018596899899]
We argue that neural network models share a preference for low-complexity solutions, formalized using Kolmogorov complexity.
Our experiments show that pre-trained and even randomly initialized language models prefer to generate low-complexity sequences.
These observations justify the trend in deep learning of unifying seemingly disparate problems with an increasingly small set of machine learning models.
arXiv Detail & Related papers (2023-04-11T17:22:22Z)
- Can Transformers Reason in Fragments of Natural Language? [2.1485350418225244]
State-of-the-art deep-learning-based approaches to Natural Language Processing (NLP) are credited with various capabilities that involve reasoning with natural language texts.
We study the detection of formally valid inferences in controlled fragments of natural language for which the satisfiability problem becomes increasingly complex.
arXiv Detail & Related papers (2022-11-10T08:46:53Z)
- ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational Finance Question Answering [70.6359636116848]
We propose a new large-scale dataset, ConvFinQA, to study the chain of numerical reasoning in conversational question answering.
Our dataset poses a great challenge in modeling long-range, complex numerical reasoning paths in real-world conversations.
arXiv Detail & Related papers (2022-10-07T23:48:50Z)
- Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z)
- Generalization of Neural Combinatorial Solvers Through the Lens of Adversarial Robustness [68.97830259849086]
Most datasets only capture a simpler subproblem and likely suffer from spurious features.
We study adversarial robustness - a local generalization property - to reveal hard, model-specific instances and spurious features.
Unlike in other applications, where perturbation models are designed around subjective notions of imperceptibility, our perturbation models are efficient and sound.
Surprisingly, with such perturbations, a sufficiently expressive neural solver does not suffer from the limitations of the accuracy-robustness trade-off common in supervised learning.
arXiv Detail & Related papers (2021-10-21T07:28:11Z)
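As referenced in the Language Model Cascades entry above, the sketch below illustrates the kind of two-model cascade that paper studies, using the simplest possible aggregation of token-level uncertainty (the mean token log-probability) and a fixed deferral threshold. The model interfaces and threshold value are assumptions made for illustration; the paper's point is that learned post-hoc deferral rules over token-level uncertainty can outperform such simple aggregation strategies.

```python
from typing import Callable, List, Tuple

# Hypothetical model interfaces: a small model returning its answer plus
# per-token log-probabilities, and a large model returning an answer.
SmallModel = Callable[[str], Tuple[str, List[float]]]
LargeModel = Callable[[str], str]

def mean_logprob(token_logprobs: List[float]) -> float:
    """Simple aggregation of token-level uncertainty: the average
    log-probability of the small model's generated tokens."""
    return sum(token_logprobs) / max(len(token_logprobs), 1)

def cascade(prompt: str, small: SmallModel, large: LargeModel,
            threshold: float = -0.5) -> Tuple[str, str]:
    """Keep the small model's answer when its aggregated confidence clears
    the (placeholder) threshold; otherwise defer to the large model."""
    answer, token_logprobs = small(prompt)
    if mean_logprob(token_logprobs) >= threshold:
        return answer, "small"
    return large(prompt), "large (deferred)"
```

A learned deferral rule would replace the fixed threshold with a small classifier trained on features of the token-level uncertainties (for example, quantiles rather than just the mean) to predict when deferring improves quality per unit cost.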
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.