SEER: Facilitating Structured Reasoning and Explanation via Reinforcement Learning
- URL: http://arxiv.org/abs/2401.13246v3
- Date: Tue, 4 Jun 2024 06:14:56 GMT
- Title: SEER: Facilitating Structured Reasoning and Explanation via Reinforcement Learning
- Authors: Guoxin Chen, Kexin Tang, Chao Yang, Fuying Ye, Yu Qiao, Yiming Qian,
- Abstract summary: We propose SEER, a novel method that maximizes a structure-based return to facilitate structured reasoning and explanation.
Our proposed structure-based return precisely describes the hierarchical and branching structure inherent in structured reasoning.
Our experiments show that SEER significantly outperforms state-of-the-art methods.
- Score: 29.514755268807868
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Elucidating the reasoning process with structured explanations from question to answer is crucial, as it significantly enhances the interpretability, traceability, and trustworthiness of question-answering (QA) systems. However, structured explanations demand models to perform intricately structured reasoning, which poses great challenges. Most existing methods focus on single-step reasoning through supervised learning, ignoring logical dependencies between steps. Moreover, existing reinforcement learning (RL) based methods overlook the structured relationships, underutilizing the potential of RL in structured reasoning. In this paper, we propose SEER, a novel method that maximizes a structure-based return to facilitate structured reasoning and explanation. Our proposed structure-based return precisely describes the hierarchical and branching structure inherent in structured reasoning, effectively capturing the intricate relationships between different reasoning steps. In addition, we introduce a fine-grained reward function to meticulously delineate diverse reasoning steps. Extensive experiments show that SEER significantly outperforms state-of-the-art methods, achieving an absolute improvement of 6.9% over RL-based methods on EntailmentBank, a 4.4% average improvement on STREET benchmark, and exhibiting outstanding efficiency and cross-dataset generalization performance. Our code is available at https://github.com/Chen-GX/SEER.
Related papers
- Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer Selection in Large Language Models [84.15513004135576]
Current research enhances the reasoning performance of Large Language Models (LLMs) by sampling multiple reasoning chains and ensembling based on the answer frequency.
This approach fails in scenarios where the correct answers are in the minority.
We introduce a hierarchical reasoning aggregation framework AoR, which selects answers based on the evaluation of reasoning chains.
arXiv Detail & Related papers (2024-05-21T17:12:19Z) - Enhancing Systematic Decompositional Natural Language Inference Using
Informal Logic [53.363888563647976]
We develop a consistent and theoretically grounded approach to annotating decompositional entailment datasets.
We find that our resulting dataset, RDTE, has a substantially higher internal consistency (+9%) than prior decompositional entailment datasets.
We also find that training an RDTE-oriented entailment classifier via knowledge distillation and employing it in a modern neuro-symbolic reasoning engine significantly improves results.
arXiv Detail & Related papers (2024-02-22T18:55:17Z) - Learning by Doing: An Online Causal Reinforcement Learning Framework
with Causal-Aware Policy [40.33036146207819]
We consider explicitly modeling the generation process of states with the graphical causal model.
We formulate the causal structure updating into the RL interaction process with active intervention learning of the environment.
arXiv Detail & Related papers (2024-02-07T14:09:34Z) - Self-Discover: Large Language Models Self-Compose Reasoning Structures [136.48389510481758]
We introduce SELF-DISCOVER, a framework for self-discovering task-intrinsic reasoning structures.
SELF-DISCOVER substantially improves GPT-4 and PaLM 2's performance on challenging reasoning benchmarks.
We show that the self-discovered reasoning structures are universally applicable across model families.
arXiv Detail & Related papers (2024-02-06T01:13:53Z) - Learning Planning-based Reasoning by Trajectories Collection and Process Reward Synthesizing [61.98556945939045]
We propose a framework to learn planning-based reasoning through Direct Preference Optimization (DPO) on collected trajectories.
Our results on challenging logical reasoning benchmarks demonstrate the effectiveness of our learning framework.
arXiv Detail & Related papers (2024-02-01T15:18:33Z) - Leveraging Structured Information for Explainable Multi-hop Question
Answering and Reasoning [14.219239732584368]
In this work, we investigate constructing and leveraging extracted semantic structures (graphs) for multi-hop question answering.
Empirical results and human evaluations show that our framework: generates more faithful reasoning chains and substantially improves the QA performance on two benchmark datasets.
arXiv Detail & Related papers (2023-11-07T05:32:39Z) - Parrot Mind: Towards Explaining the Complex Task Reasoning of Pretrained Large Language Models with Template-Content Structure [66.33623392497599]
We show that a structure called template-content structure (T-C structure) can reduce the possible space from exponential level to linear level.
We demonstrate that models can achieve task composition, further reducing the space needed to learn from linear to logarithmic.
arXiv Detail & Related papers (2023-10-09T06:57:45Z) - Modeling Hierarchical Reasoning Chains by Linking Discourse Units and
Key Phrases for Reading Comprehension [80.99865844249106]
We propose a holistic graph network (HGN) which deals with context at both discourse level and word level, as the basis for logical reasoning.
Specifically, node-level and type-level relations, which can be interpreted as bridges in the reasoning process, are modeled by a hierarchical interaction mechanism.
arXiv Detail & Related papers (2023-06-21T07:34:27Z) - Unifying Structure Reasoning and Language Model Pre-training for Complex
Reasoning [26.811507121199323]
This paper proposes a unified learning framework that combines explicit structure reasoning and language pre-training to endow PLMs with the structure reasoning skill.
It first identifies several elementary structures within contexts to construct structured queries and performs step-by-step reasoning along the queries to identify the answer entity.
Experimental results on four datasets demonstrate that the proposed model achieves significant improvements in complex reasoning tasks involving diverse structures.
arXiv Detail & Related papers (2023-01-21T08:18:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.