Learning to Reason With Relational Abstractions
- URL: http://arxiv.org/abs/2210.02615v1
- Date: Thu, 6 Oct 2022 00:27:50 GMT
- Title: Learning to Reason With Relational Abstractions
- Authors: Andrew J. Nam, Mengye Ren, Chelsea Finn, James L. McClelland
- Abstract summary: We study how to build stronger reasoning capability in language models using the idea of relational abstractions.
We find that models that are supplied with such sequences as prompts can solve tasks with significantly higher accuracy.
- Score: 65.89553417442049
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large language models have recently shown promising progress in mathematical
reasoning when fine-tuned on human-generated sequences that walk through the
steps of a solution. However, these solution sequences are not formally
structured and the resulting model-generated sequences may not reflect the kind
of systematic reasoning we might expect an expert human to produce. In this
paper, we study how to build stronger reasoning capability in language models
using the idea of relational abstractions. We introduce new types of sequences
that more explicitly provide an abstract characterization of the transitions
through intermediate solution steps to the goal state. We find that models that
are supplied with such sequences as prompts can solve tasks with
significantly higher accuracy, and models that are trained to produce such
sequences solve problems better than those that are trained with previously
used human-generated sequences and other baselines. Our work thus takes several
steps toward elucidating and improving how language models perform on tasks
requiring multi-step mathematical reasoning.
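To make the contrast concrete, here is a minimal sketch in Python of the difference between a plain solution trace and a relationally abstracted one. The task (reaching a target number through arithmetic moves), the step wording, and the helper names are illustrative assumptions, not materials from the paper:

```python
# Hypothetical illustration of relationally abstracted step sequences.
# The task, step phrasing, and helper names are assumptions for
# illustration; they are not taken from the paper's materials.

def plain_trace(start, moves):
    """Raw solution trace: only the intermediate values appear."""
    lines, x = [], start
    for op, k in moves:
        x = x + k if op == "+" else x - k
        lines.append(f"then {x}")
    return "\n".join(lines)

def relational_trace(start, moves):
    """Abstracted trace: each step names the relation (the move
    applied) carrying one intermediate state to the next."""
    lines, x = [], start
    for op, k in moves:
        nxt = x + k if op == "+" else x - k
        lines.append(f"apply {op}{k}: {x} -> {nxt}")
        x = nxt
    return "\n".join(lines)

moves = [("+", 3), ("-", 2), ("+", 7)]  # 5 -> 8 -> 6 -> 13
print("Plain steps:\n" + plain_trace(5, moves))
print("\nRelational steps:\n" + relational_trace(5, moves))
```

In a few-shot setting, traces like the second would be concatenated into the prompt (or used as fine-tuning targets); the idea is that naming each transition makes the structure of the solution explicit to the model.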
Related papers
- Neuro-symbolic Training for Reasoning over Spatial Language [17.901249830817882]
We propose training language models with neuro-symbolic techniques that can exploit the logical rules of reasoning as constraints.
We focus on a challenging problem of spatial reasoning over text.
arXiv Detail & Related papers (2024-06-19T20:47:36Z)
- Generative Models as a Complex Systems Science: How can we make sense of large language model behavior? [75.79305790453654]
Coaxing desired behaviors out of pretrained models, while avoiding undesirable ones, has redefined NLP.
We argue for a systematic effort to decompose language model behavior into categories that explain cross-task performance.
arXiv Detail & Related papers (2023-07-31T22:58:41Z)
- A Hybrid System for Systematic Generalization in Simple Arithmetic Problems [70.91780996370326]
We propose a hybrid system capable of solving arithmetic problems that require compositional and systematic reasoning over sequences of symbols.
We show that the proposed system can accurately solve nested arithmetical expressions even when trained only on a subset including the simplest cases.
arXiv Detail & Related papers (2023-06-29T18:35:41Z)
- Opening the Black Box: Analyzing Attention Weights and Hidden States in Pre-trained Language Models for Non-language Tasks [0.8889304968879164]
We apply a pre-trained language model to constrained arithmetic problems with hierarchical structure and analyze its attention weights and hidden states.
The investigation reveals promising results, with the model addressing hierarchical problems in a moderately structured manner, similar to human problem-solving strategies.
The attention analysis allows us to hypothesize that the model can generalize to longer sequences in the ListOps dataset, a conclusion later confirmed by testing on sequences longer than those in the training set.
arXiv Detail & Related papers (2023-06-21T11:48:07Z)
- Visual Chain of Thought: Bridging Logical Gaps with Multimodal Infillings [61.04460792203266]
We introduce VCoT, a novel method that leverages chain-of-thought prompting with vision-language grounding to bridge the logical gaps within sequential data.
Our method uses visual guidance to generate synthetic multimodal infillings that add consistent and novel information to reduce the logical gaps for downstream tasks.
arXiv Detail & Related papers (2023-05-03T17:58:29Z)
- Chaining Simultaneous Thoughts for Numerical Reasoning [92.2007997126144]
Numerical reasoning over text should be an essential skill of AI systems.
Previous work has focused on modeling the structure of equations and has proposed various structured decoders.
We propose CANTOR, a numerical reasoner that models reasoning steps using a directed acyclic graph (see the sketch after this list).
arXiv Detail & Related papers (2022-11-29T18:52:06Z)
- Faithful Reasoning Using Large Language Models [12.132449274592668]
We show how LMs can be made to perform faithful multi-step reasoning via a process whose causal structure mirrors the underlying logical structure of the problem.
Our approach works by chaining together reasoning steps, where each step results from calls to two fine-tuned LMs.
We demonstrate the effectiveness of our model on multi-step logical deduction and scientific question-answering, showing that it outperforms baselines on final answer accuracy.
arXiv Detail & Related papers (2022-08-30T13:44:41Z)
- Learning to Reason Deductively: Math Word Problem Solving as Complex Relation Extraction [10.721488421356053]
Solving math word problems requires deductive reasoning over the quantities in the text.
Recent research efforts mostly relied on sequence-to-sequence or sequence-to-tree models to generate expressions.
We propose a novel approach that presents explainable deductive reasoning steps to iteratively construct target expressions.
arXiv Detail & Related papers (2022-03-19T12:37:16Z)
- Chain of Thought Prompting Elicits Reasoning in Large Language Models [56.811278668446825]
This paper explores the ability of language models to generate a coherent chain of thought.
Experiments show that inducing a chain of thought via prompting can enable sufficiently large language models to better perform reasoning tasks.
arXiv Detail & Related papers (2022-01-28T02:33:07Z)
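Several entries above turn on how reasoning steps are represented; the CANTOR entry in particular describes steps as a directed acyclic graph. Below is a minimal sketch of one way such a DAG might look and be evaluated in dependency order. The node layout, operator set, and use of Python's graphlib are assumptions for illustration, not CANTOR's actual design:

```python
# Hypothetical sketch of DAG-structured reasoning steps (not CANTOR's
# actual implementation): each step depends on earlier steps or on
# quantities from the problem text, so independent steps carry no
# artificial ordering.
from graphlib import TopologicalSorter  # Python 3.9+

# Problem: "3 apples at $2 each and 4 pears at $1 each; total cost?"
quantities = {"q0": 3, "q1": 2, "q2": 4, "q3": 1}
steps = {
    "s0": ("mul", ["q0", "q1"]),  # apples cost: 3 * 2
    "s1": ("mul", ["q2", "q3"]),  # pears cost: 4 * 1 (independent of s0)
    "s2": ("add", ["s0", "s1"]),  # total: depends on both
}

# Topologically order the steps by their step-level dependencies,
# then evaluate each node once its inputs are available.
deps = {sid: [a for a in args if a in steps] for sid, (_, args) in steps.items()}
values = dict(quantities)
for sid in TopologicalSorter(deps).static_order():
    op, args = steps[sid]
    x, y = (values[a] for a in args)
    values[sid] = x * y if op == "mul" else x + y

print(values["s2"])  # 10
```

Because s0 and s1 share no dependency, the DAG makes their independence explicit, which is exactly the kind of structure a purely sequential chain of thought leaves implicit.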
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.