PINTO: Faithful Language Reasoning Using Prompt-Generated Rationales
- URL: http://arxiv.org/abs/2211.01562v3
- Date: Thu, 6 Apr 2023 23:49:35 GMT
- Title: PINTO: Faithful Language Reasoning Using Prompt-Generated Rationales
- Authors: Peifeng Wang, Aaron Chan, Filip Ilievski, Muhao Chen, Xiang Ren
- Abstract summary: PINTO is a pipeline that rationalizes via prompt-based learning and learns to faithfully reason over rationales via counterfactual regularization.
We show that PINTO significantly improves the generalization ability of the reasoning LM, yielding higher performance on both in-distribution and out-of-distribution test sets.
- Score: 42.98229290301891
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural language models (LMs) have achieved impressive results on various
language-based reasoning tasks by utilizing latent knowledge encoded in their
own pretrained parameters. To make this reasoning process more explicit, recent
works retrieve a rationalizing LM's internal knowledge by training or prompting
it to generate free-text rationales, which can be used to guide task
predictions made by either the same LM or a separate reasoning LM. However,
rationalizing LMs require expensive rationale annotation and/or computation,
without any assurance that their generated rationales improve LM task
performance or faithfully reflect LM decision-making. In this paper, we propose
PINTO, an LM pipeline that rationalizes via prompt-based learning, and learns
to faithfully reason over rationales via counterfactual regularization. First,
PINTO maps out a suitable reasoning process for the task input by prompting a
frozen rationalizing LM to generate a free-text rationale. Second, PINTO's
reasoning LM is fine-tuned to solve the task using the generated rationale as
context, while regularized to output less confident predictions when the
rationale is perturbed. Across four datasets, we show that PINTO significantly
improves the generalization ability of the reasoning LM, yielding higher
performance on both in-distribution and out-of-distribution test sets. Also, we
find that PINTO's rationales are more faithful to its task predictions than
those generated by competitive baselines.
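A minimal sketch of the counterfactual-regularization idea described in the abstract, written in PyTorch. The loss composition, the perturbation function, and the model/tokenizer names here are illustrative assumptions, not the authors' released code.

```python
# Illustrative sketch (not the authors' code): fine-tune a reasoning LM on
# (input, rationale) pairs while regularizing it to be *less confident*
# when the rationale is corrupted, as the abstract describes.
import torch
import torch.nn.functional as F

def perturb(rationale: str) -> str:
    """Toy rationale corruption (assumption): reverse the word order."""
    return " ".join(rationale.split()[::-1])

def pinto_style_loss(reasoning_lm, tokenizer, question, rationale, label_id, alpha=0.5):
    # Task loss: predict the answer given the question plus the generated rationale.
    enc = tokenizer(question + " " + rationale, return_tensors="pt")
    logits = reasoning_lm(**enc).logits[:, -1, :]          # answer-token logits
    task_loss = F.cross_entropy(logits, torch.tensor([label_id]))

    # Counterfactual regularizer: with a perturbed rationale, push the output
    # distribution toward uniform (i.e. low confidence) via a KL term.
    enc_cf = tokenizer(question + " " + perturb(rationale), return_tensors="pt")
    logits_cf = reasoning_lm(**enc_cf).logits[:, -1, :]
    uniform = torch.full_like(logits_cf, 1.0 / logits_cf.size(-1))
    reg_loss = F.kl_div(F.log_softmax(logits_cf, dim=-1), uniform, reduction="batchmean")

    return task_loss + alpha * reg_loss
```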
Related papers
- Can Small Language Models Help Large Language Models Reason Better?: LM-Guided Chain-of-Thought [51.240387516059535]
We introduce a novel framework, LM-Guided CoT, that leverages a lightweight (i.e., 1B) language model (LM) for guiding a black-box large (i.e., >10B) LM in reasoning tasks.
We optimize the model through 1) knowledge distillation and 2) reinforcement learning from rationale-oriented and task-oriented reward signals.
arXiv Detail & Related papers (2024-04-04T12:46:37Z)
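A rough sketch, under my own assumptions, of the two-model setup summarized above: a small LM drafts a rationale, a larger black-box LM answers conditioned on it, and a simple task-oriented reward (answer correctness) is computed for the small LM. The function names and prompt formats are hypothetical.

```python
# Hypothetical sketch of an LM-guided chain-of-thought step: a lightweight LM
# writes the rationale, a black-box large LM answers, and the rationale is
# scored by whether the final answer is correct.
from typing import Callable

def lm_guided_cot_step(small_lm: Callable[[str], str],
                       large_lm: Callable[[str], str],
                       question: str,
                       gold_answer: str) -> float:
    rationale = small_lm(f"Question: {question}\nExplain step by step:")
    answer = large_lm(f"Question: {question}\nRationale: {rationale}\nAnswer:")
    # Task-oriented reward signal (correctness); rationale-oriented rewards
    # such as faithfulness or coherence scores could be added here as well.
    reward = 1.0 if gold_answer.lower() in answer.lower() else 0.0
    return reward  # fed to an RL update (e.g. policy gradient) on small_lm
```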
- Neuro-Symbolic Integration Brings Causal and Reliable Reasoning Proofs [95.07757789781213]
Two lines of approaches are adopted for complex reasoning with LLMs.
One line of work prompts LLMs with various reasoning structures, and the structured outputs can be naturally regarded as intermediate reasoning steps.
The other line of work adopts LLM-free declarative solvers to do the reasoning task, rendering higher reasoning accuracy but lacking interpretability due to the black-box nature of the solvers.
We present a simple extension to the latter line of work. Specifically, we showcase that the intermediate search logs generated by Prolog interpreters can be accessed and interpreted into human-readable reasoning.
arXiv Detail & Related papers (2023-11-16T11:26:21Z)
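The idea of surfacing a solver's intermediate search log as a readable proof can be pictured with a toy backward-chaining sketch; this is my own illustration, not the paper's Prolog setup.

```python
# Toy illustration (not the paper's code): a tiny backward-chaining prover
# over ground Horn clauses that records each resolution step, so the search
# log itself can be read back as a human-readable reasoning chain.
RULES = {
    "mortal(socrates)": ["human(socrates)"],  # mortal(X) :- human(X), grounded
    "human(socrates)": [],                    # a fact
}

def prove(goal: str, log: list, depth: int = 0) -> bool:
    log.append("  " * depth + f"trying to prove {goal}")
    if goal not in RULES:
        log.append("  " * depth + f"no rule for {goal}: fail")
        return False
    body = RULES[goal]
    if all(prove(sub, log, depth + 1) for sub in body):
        suffix = f" because {', '.join(body)}" if body else " (fact)"
        log.append("  " * depth + f"{goal} holds" + suffix)
        return True
    return False

log: list = []
prove("mortal(socrates)", log)
print("\n".join(log))  # the search log doubles as the reasoning proof
```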
- Are LLMs Rigorous Logical Reasoner? Empowering Natural Language Proof Generation with Contrastive Stepwise Decoding [11.385103498440932]
We introduce contrastive decoding to stepwise proof generation, making use of negative reasoning paths to strengthen the model's capacity for logical deduction.
Experiments on EntailmentBank underscore the success of our method in augmenting the proof planning abilities of language models.
arXiv Detail & Related papers (2023-11-12T05:12:49Z)
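A minimal sketch of what stepwise contrastive decoding could look like for the paper above: each candidate proof step is scored by the gap between its likelihood given a valid reasoning context and its likelihood given a negative (flawed) reasoning path. The scoring interface and names are assumptions on my part.

```python
# Illustrative sketch (assumed interface, not the paper's implementation):
# score each candidate proof step by contrasting its log-likelihood given a
# positive reasoning context against its likelihood given a negative path.
from typing import Callable, List

def contrastive_step_score(log_prob: Callable[[str, str], float],
                           positive_context: str, negative_context: str,
                           candidate_step: str, alpha: float = 1.0) -> float:
    # log_prob(context, continuation) -> log p(continuation | context)
    return (log_prob(positive_context, candidate_step)
            - alpha * log_prob(negative_context, candidate_step))

def pick_next_step(log_prob, positive_context, negative_context,
                   candidates: List[str]) -> str:
    # Greedily choose the step that is likely under the valid context but
    # unlikely under the negative reasoning path.
    return max(candidates, key=lambda s: contrastive_step_score(
        log_prob, positive_context, negative_context, s))
```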
- Characterizing Large Language Models as Rationalizers of Knowledge-intensive Tasks [6.51301154858045]
Large language models (LLMs) are proficient at generating fluent text with minimal task-specific supervision.
We consider the task of generating knowledge-guided rationalization in natural language by using expert-written examples in a few-shot manner.
Surprisingly, crowd-workers preferred knowledge-grounded rationales over crowdsourced rationalizations, citing their factuality, sufficiency, and comprehensive refutations.
arXiv Detail & Related papers (2023-11-09T01:04:44Z)
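A small sketch of the few-shot, knowledge-guided rationalization setup described above: expert-written (question, knowledge, rationale) examples are packed into a prompt before the new instance. The exact prompt format is my assumption.

```python
# Assumed prompt format (illustrative only): build a few-shot prompt from
# expert-written examples so the LLM produces a knowledge-grounded rationale.
def build_rationalization_prompt(examples, question, knowledge):
    parts = []
    for ex in examples:  # each ex: {"question", "knowledge", "rationale"}
        parts.append(
            f"Question: {ex['question']}\n"
            f"Knowledge: {ex['knowledge']}\n"
            f"Rationale: {ex['rationale']}\n"
        )
    parts.append(f"Question: {question}\nKnowledge: {knowledge}\nRationale:")
    return "\n".join(parts)
```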
- DetermLR: Augmenting LLM-based Logical Reasoning from Indeterminacy to Determinacy [76.58614128865652]
We propose DetermLR, a novel perspective that rethinks the reasoning process as an evolution from indeterminacy to determinacy.
First, we categorize known conditions into two types: determinate and indeterminate premises. This provides an overall direction for the reasoning process and guides LLMs in converting indeterminate data into progressively determinate insights.
We automate the storage and extraction of available premises and reasoning paths with reasoning memory, preserving historical reasoning details for subsequent reasoning steps.
arXiv Detail & Related papers (2023-10-28T10:05:51Z)
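One way to picture the indeterminacy-to-determinacy loop and the reasoning memory mentioned above is sketched below; the classifier, derivation step, and memory structure are placeholders I introduce for illustration, not DetermLR's actual components.

```python
# Illustrative control loop (placeholder components, not DetermLR's code):
# repeatedly promote indeterminate premises to determinate ones, keeping a
# reasoning memory of premises and the reasoning paths that produced them.
def determinacy_loop(premises, is_determinate, derive, max_rounds=5):
    memory = {"determinate": [p for p in premises if is_determinate(p)],
              "indeterminate": [p for p in premises if not is_determinate(p)],
              "paths": []}
    for _ in range(max_rounds):
        progressed = False
        for p in list(memory["indeterminate"]):
            new_fact, path = derive(p, memory["determinate"])  # e.g. an LLM call
            if new_fact is not None:
                memory["indeterminate"].remove(p)
                memory["determinate"].append(new_fact)
                memory["paths"].append(path)
                progressed = True
        if not progressed or not memory["indeterminate"]:
            break
    return memory
```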
- Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models [56.34029644009297]
Large language models (LLMs) have demonstrated the ability to overcome various limitations of formal Knowledge Representation (KR) systems.
LLMs excel most in abductive reasoning, followed by deductive reasoning, while they are least effective at inductive reasoning.
We study single-task training, multi-task training, and "chain-of-thought" knowledge distillation fine-tuning techniques to assess model performance.
arXiv Detail & Related papers (2023-10-02T01:00:50Z)
- SCOTT: Self-Consistent Chain-of-Thought Distillation [68.40232422158569]
Large language models (LMs) generate free-text rationales for their predictions via chain-of-thought prompting.
We propose a faithful knowledge distillation method to learn a small, self-consistent CoT model from a teacher model that is orders of magnitude larger.
To ensure faithful distillation, we use the teacher-generated rationales to learn a student LM with a counterfactual reasoning objective.
arXiv Detail & Related papers (2023-05-03T03:47:00Z)
- KNIFE: Distilling Reasoning Knowledge From Free-Text Rationales [31.28256104334867]
We propose KNIFE, which shows that reasoning knowledge can be effectively distilled from free-text rationales (FTRs) into a small (1B) LM.
First, KNIFE finetunes a teacher LM (given task input and FTR) to predict the task output, transferring reasoning knowledge from the FTRs to the teacher's hidden states.
Second, KNIFE finetunes a student LM (given task input only) such that its hidden states are aligned with the teacher's.
arXiv Detail & Related papers (2022-12-19T18:49:09Z)
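A compact sketch of the hidden-state alignment step described in the KNIFE summary above, using PyTorch; which hidden states are aligned (here, the final layer at the last position) and the MSE choice are assumptions of mine, and the teacher is assumed to already be finetuned and frozen.

```python
# Illustrative sketch (assumed details, not KNIFE's released code): the teacher
# sees the task input plus the free-text rationale (FTR), the student sees the
# task input only, and the student's hidden state is pulled toward the teacher's.
import torch
import torch.nn.functional as F

def knife_style_alignment_loss(teacher, student, tokenizer, task_input, ftr):
    with torch.no_grad():  # teacher is frozen in the student-training stage
        t_enc = tokenizer(task_input + " " + ftr, return_tensors="pt")
        t_state = teacher(**t_enc, output_hidden_states=True).hidden_states[-1][:, -1, :]

    s_enc = tokenizer(task_input, return_tensors="pt")
    s_state = student(**s_enc, output_hidden_states=True).hidden_states[-1][:, -1, :]

    # Align the student's hidden state (input only) with the teacher's (input + FTR).
    return F.mse_loss(s_state, t_state)
```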
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.