Learning to Reason With Relational Abstractions
- URL: http://arxiv.org/abs/2210.02615v1
- Date: Thu, 6 Oct 2022 00:27:50 GMT
- Title: Learning to Reason With Relational Abstractions
- Authors: Andrew J. Nam, Mengye Ren, Chelsea Finn, James L. McClelland
- Abstract summary: We study how to build stronger reasoning capability in language models using the idea of relational abstractions.
We find that models that are supplied with such sequences as prompts can solve tasks with significantly higher accuracy.
- Score: 65.89553417442049
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large language models have recently shown promising progress in mathematical
reasoning when fine-tuned on human-generated sequences that walk through the
steps of a solution. However, these solution sequences are not formally
structured and the resulting model-generated sequences may not reflect the kind
of systematic reasoning we might expect an expert human to produce. In this
paper, we study how to build stronger reasoning capability in language models
using the idea of relational abstractions. We introduce new types of sequences
that more explicitly provide an abstract characterization of the transitions
through intermediate solution steps to the goal state. We find that models that
are supplied with such sequences as prompts can solve tasks with
significantly higher accuracy, and models that are trained to produce such
sequences solve problems better than those that are trained with previously
used human-generated sequences and other baselines. Our work thus takes several
steps toward elucidating and improving how language models perform on tasks
requiring multi-step mathematical reasoning.
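To make the contrast concrete, here is a minimal sketch in Python of the difference between a plain solution trace and a relationally abstracted one. The task (reaching a target number through arithmetic moves), the step wording, and the helper names are illustrative assumptions, not materials from the paper:

```python
# Hypothetical illustration of relationally abstracted step sequences.
# The task, step phrasing, and helper names are assumptions for
# illustration; they are not taken from the paper's materials.

def plain_trace(start, moves):
    """Raw solution trace: only the intermediate values appear."""
    lines, x = [], start
    for op, k in moves:
        x = x + k if op == "+" else x - k
        lines.append(f"then {x}")
    return "\n".join(lines)

def relational_trace(start, moves):
    """Abstracted trace: each step names the relation (the move
    applied) carrying one intermediate state to the next."""
    lines, x = [], start
    for op, k in moves:
        nxt = x + k if op == "+" else x - k
        lines.append(f"apply {op}{k}: {x} -> {nxt}")
        x = nxt
    return "\n".join(lines)

moves = [("+", 3), ("-", 2), ("+", 7)]  # 5 -> 8 -> 6 -> 13
print("Plain steps:\n" + plain_trace(5, moves))
print("\nRelational steps:\n" + relational_trace(5, moves))
```

In a few-shot setting, traces like the second would be concatenated into the prompt (or used as fine-tuning targets); the idea is that naming each transition makes the structure of the solution explicit to the model.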
Related papers
- Neuro-symbolic Training for Reasoning over Spatial Language [17.901249830817882]
We propose training language models with neuro-symbolic techniques that can exploit the logical rules of reasoning as constraints.
We focus on a challenging problem of spatial reasoning over text.
arXiv Detail & Related papers (2024-06-19T20:47:36Z)
- Generative Models as a Complex Systems Science: How can we make sense of large language model behavior? [75.79305790453654]
Coaxing desired behaviors out of pretrained models, while avoiding undesirable ones, has redefined NLP.
We argue for a systematic effort to decompose language model behavior into categories that explain cross-task performance.
arXiv Detail & Related papers (2023-07-31T22:58:41Z)
- A Hybrid System for Systematic Generalization in Simple Arithmetic Problems [70.91780996370326]
We propose a hybrid system capable of solving arithmetic problems that require compositional and systematic reasoning over sequences of symbols.
We show that the proposed system can accurately solve nested arithmetical expressions even when trained only on a subset including the simplest cases.
arXiv Detail & Related papers (2023-06-29T18:35:41Z)
- Opening the Black Box: Analyzing Attention Weights and Hidden States in Pre-trained Language Models for Non-language Tasks [0.8889304968879164]
We apply a pre-trained language model to constrained arithmetic problems with hierarchical structure and analyze its attention weights and hidden states.
The investigation reveals promising results, with the model addressing hierarchical problems in a moderately structured manner, similar to human problem-solving strategies.
The attention analysis allows us to hypothesize that the model can generalize to longer sequences in the ListOps dataset, a conclusion later confirmed by testing on sequences longer than those in the training set.
arXiv Detail & Related papers (2023-06-21T11:48:07Z)
- Visual Chain of Thought: Bridging Logical Gaps with Multimodal Infillings [61.04460792203266]
We introduce VCoT, a novel method that leverages chain-of-thought prompting with vision-language grounding to bridge the logical gaps within sequential data.
Our method uses visual guidance to generate synthetic multimodal infillings that add consistent and novel information to reduce the logical gaps for downstream tasks.
arXiv Detail & Related papers (2023-05-03T17:58:29Z)
- Chaining Simultaneous Thoughts for Numerical Reasoning [92.2007997126144]
Numerical reasoning over text should be an essential skill of AI systems.
Previous work has focused on modeling the structure of equations and has proposed various structured decoders.
We propose CANTOR, a numerical reasoner that models reasoning steps using a directed acyclic graph (see the sketch after this list).
arXiv Detail & Related papers (2022-11-29T18:52:06Z)
- Faithful Reasoning Using Large Language Models [12.132449274592668]
We show how LMs can be made to perform faithful multi-step reasoning via a process whose causal structure mirrors the underlying logical structure of the problem.
Our approach works by chaining together reasoning steps, where each step results from calls to two fine-tuned LMs.
We demonstrate the effectiveness of our model on multi-step logical deduction and scientific question-answering, showing that it outperforms baselines on final answer accuracy.
arXiv Detail & Related papers (2022-08-30T13:44:41Z)
- Learning to Reason Deductively: Math Word Problem Solving as Complex Relation Extraction [10.721488421356053]
Solving math word problems requires deductive reasoning over the quantities in the text.
Recent research efforts mostly relied on sequence-to-sequence or sequence-to-tree models to generate expressions.
We propose a novel approach that presents explainable deductive reasoning steps to iteratively construct target expressions.
arXiv Detail & Related papers (2022-03-19T12:37:16Z)
- Chain of Thought Prompting Elicits Reasoning in Large Language Models [56.811278668446825]
This paper explores the ability of language models to generate a coherent chain of thought.
Experiments show that inducing a chain of thought via prompting can enable sufficiently large language models to better perform reasoning tasks.
arXiv Detail & Related papers (2022-01-28T02:33:07Z)
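Several entries above turn on how reasoning steps are represented; the CANTOR entry in particular describes steps as a directed acyclic graph. Below is a minimal sketch of one way such a DAG might look and be evaluated in dependency order. The node layout, operator set, and use of Python's graphlib are assumptions for illustration, not CANTOR's actual design:

```python
# Hypothetical sketch of DAG-structured reasoning steps (not CANTOR's
# actual implementation): each step depends on earlier steps or on
# quantities from the problem text, so independent steps carry no
# artificial ordering.
from graphlib import TopologicalSorter  # Python 3.9+

# Problem: "3 apples at $2 each and 4 pears at $1 each; total cost?"
quantities = {"q0": 3, "q1": 2, "q2": 4, "q3": 1}
steps = {
    "s0": ("mul", ["q0", "q1"]),  # apples cost: 3 * 2
    "s1": ("mul", ["q2", "q3"]),  # pears cost: 4 * 1 (independent of s0)
    "s2": ("add", ["s0", "s1"]),  # total: depends on both
}

# Topologically order the steps by their step-level dependencies,
# then evaluate each node once its inputs are available.
deps = {sid: [a for a in args if a in steps] for sid, (_, args) in steps.items()}
values = dict(quantities)
for sid in TopologicalSorter(deps).static_order():
    op, args = steps[sid]
    x, y = (values[a] for a in args)
    values[sid] = x * y if op == "mul" else x + y

print(values["s2"])  # 10
```

Because s0 and s1 share no dependency, the DAG makes their independence explicit, which is exactly the kind of structure a purely sequential chain of thought leaves implicit.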
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.