Large Language Models as an Indirect Reasoner: Contrapositive and
Contradiction for Automated Reasoning
- URL: http://arxiv.org/abs/2402.03667v1
- Date: Tue, 6 Feb 2024 03:41:12 GMT
- Title: Large Language Models as an Indirect Reasoner: Contrapositive and
Contradiction for Automated Reasoning
- Authors: Yanfang Zhang, Yiliu Sun, Yibing Zhan, Dapeng Tao, Dacheng Tao, Chen
Gong
- Abstract summary: This paper proposes a novel Indirect Reasoning (IR) method that employs the logic of contrapositives and contradictions to tackle IR tasks such as factual reasoning and mathematical proof.
The experimental results on popular LLMs, such as GPT-3.5-turbo and Gemini-pro, show that our IR method enhances the overall accuracy of factual reasoning by 27.33% and mathematical proof by 31.43%.
- Score: 79.37150041259066
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, increasing attention has been drawn to improving the
ability of Large Language Models (LLMs) to perform complex reasoning. However,
previous methods, such as Chain-of-Thought and Self-Consistency, mainly follow
Direct Reasoning (DR) frameworks, so they struggle with the many real-world
tasks that can hardly be solved via DR. Therefore, to
strengthen the reasoning power of LLMs, this paper proposes a novel Indirect
Reasoning (IR) method that employs the logic of contrapositives and
contradictions to tackle IR tasks such as factual reasoning and mathematical
proof. Specifically, our methodology comprises two steps. Firstly, we leverage
the logical equivalence of the contrapositive to augment the data and rules to
enhance the comprehensibility of LLMs. Secondly, we design a set of prompt
templates to trigger LLMs to conduct IR based on proof by contradiction, which is
logically equivalent to the original DR process. Our IR method is simple yet
effective and can be straightforwardly integrated with existing DR methods to
further boost the reasoning abilities of LLMs. The experimental results on
popular LLMs, such as GPT-3.5-turbo and Gemini-pro, show that our IR method
enhances the overall accuracy of factual reasoning by 27.33% and mathematical
proof by 31.43%, when compared with traditional DR methods. Moreover, the
methods combining IR and DR significantly outperform the methods solely using
IR or DR, further demonstrating the effectiveness of our strategy.
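To make the two steps concrete, here is a minimal Python sketch of the pipeline the abstract describes. The rule encoding, helper names, and prompt wording are our own illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of the two IR steps described in the abstract. The rule
# encoding, helper names, and prompt wording are illustrative assumptions,
# not the authors' released implementation.

def contrapositive(rule):
    """Step 1: 'if P then Q' is logically equivalent to 'if not Q then not P'."""
    premise, conclusion = rule
    return (f"it is not the case that {conclusion}",
            f"it is not the case that {premise}")

def augment_rules(rules):
    """Augment the rule set with contrapositives to aid LLM comprehension."""
    return rules + [contrapositive(r) for r in rules]

def contradiction_prompt(facts, rules, goal):
    """Step 2: a prompt template that triggers proof by contradiction:
    assume the negation of the goal and ask the model to derive a conflict."""
    rule_text = "\n".join(f"- If {p}, then {q}." for p, q in rules)
    fact_text = "\n".join(f"- {f}" for f in facts)
    return (
        f"Facts:\n{fact_text}\n\nRules:\n{rule_text}\n\n"
        f"Assume, for contradiction, that '{goal}' is false.\n"
        "Reason step by step from this assumption. If you derive a conclusion "
        "that conflicts with a given fact, the assumption is untenable, so "
        f"'{goal}' must be true. Report the contradiction you find."
    )

# Example: the contrapositive of 'if x is a square, then x is a rectangle'
# lets a model conclude 'not a square' directly from 'not a rectangle'.
rules = [("x is a square", "x is a rectangle")]
print(contradiction_prompt(["s is not a rectangle"],
                           augment_rules(rules),
                           "s is not a square"))
```

Because both steps operate purely at the prompt level, this kind of augmentation can be layered on top of existing DR prompting schemes such as Chain-of-Thought, which is how the paper reports its combined IR+DR gains.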
Related papers
- Evaluating Human Alignment and Model Faithfulness of LLM Rationale [66.75309523854476]
We show that prompting-based rationales align better with human-annotated rationales than attribution-based rationales.
We additionally find that the faithfulness limitations of prompting-based methods, which are identified in previous work, may be linked to their collapsed predictions.
arXiv Detail & Related papers (2024-06-28T20:06:30Z)
- Break the Chain: Large Language Models Can be Shortcut Reasoners [18.047917626825548]
Chain-of-Thought (CoT) reasoning methods utilize complex modules but are hampered by high token consumption, limited applicability, and challenges in thinking.
This paper conducts a critical evaluation of CoT prompting, extending beyond arithmetic to include complex logical and commonsense reasoning tasks.
We propose the integration of human-like heuristics and shortcuts into language models (LMs) through "break the chain" strategies.
arXiv Detail & Related papers (2024-06-04T14:02:53Z)
- FiDeLiS: Faithful Reasoning in Large Language Model for Knowledge Graph Question Answering [46.41364317172677]
We propose a retrieval-exploration interactive method, FiDeLiS, to handle intermediate steps of reasoning grounded by external knowledge graphs.
We incorporate the logical and commonsense reasoning of LLMs into the knowledge retrieval process, which yields more accurate recall performance.
arXiv Detail & Related papers (2024-05-22T17:56:53Z)
- Generating Chain-of-Thoughts with a Pairwise-Comparison Approach to Searching for the Most Promising Intermediate Thought [70.30423016640749]
Chain-of-thoughts (CoT) methods were proposed to guide large language models to reason step-by-step, enabling problem solving from simple to complex.
Evaluations from the large language model (LLM) are typically noisy and unreliable, potentially misleading the generation process when selecting promising intermediate thoughts.
In this paper, motivated by Vapnik's principle, we use pairwise-comparison evaluation instead of point-wise scoring to search for promising intermediate thoughts.
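As a rough illustration of pairwise (rather than pointwise) selection, consider the sketch below; the compare() oracle stands in for an LLM judge, and all names here are our assumptions, not the paper's implementation.

```python
# Illustrative sketch of selecting the most promising intermediate thought
# by pairwise comparison; compare() stands in for an LLM judge and is an
# assumption, not the paper's implementation.
from typing import Callable

def select_best_thought(thoughts: list[str],
                        compare: Callable[[str, str], str]) -> str:
    """Sequential knockout: compare(a, b) returns the more promising of the
    two thoughts, so only relative judgments are needed, never absolute
    scores -- the point of pairwise over pointwise evaluation."""
    best = thoughts[0]
    for challenger in thoughts[1:]:
        best = compare(best, challenger)
    return best

# Toy judge that prefers the longer thought, standing in for an LLM call.
winner = select_best_thought(
    ["try factoring", "apply the quadratic formula and check both roots"],
    compare=lambda a, b: a if len(a) >= len(b) else b,
)
print(winner)  # -> "apply the quadratic formula and check both roots"
```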
arXiv Detail & Related papers (2024-02-10T09:51:03Z)
- LLMs for Relational Reasoning: How Far are We? [8.840750655261251]
Large language models (LLMs) have revolutionized many areas by achieving state-of-the-art performance on downstream tasks.
Recent efforts have demonstrated that LLMs are poor at solving sequential decision-making problems.
arXiv Detail & Related papers (2024-01-17T08:22:52Z)
- DetermLR: Augmenting LLM-based Logical Reasoning from Indeterminacy to Determinacy [76.58614128865652]
We propose DetermLR, a novel perspective that rethinks the reasoning process as an evolution from indeterminacy to determinacy.
First, we categorize known conditions into two types: determinate and indeterminate premises. This provides an overall direction for the reasoning process and guides LLMs in converting indeterminate data into progressively determinate insights.
We automate the storage and extraction of available premises and reasoning paths with reasoning memory, preserving historical reasoning details for subsequent reasoning steps.
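A minimal data-structure sketch of this setup follows; the field names and structure are assumptions inspired by the summary, not the DetermLR codebase.

```python
# Sketch of premises tagged by determinacy plus a reasoning memory that
# preserves intermediate steps; names and structure are assumptions
# inspired by the summary, not the DetermLR codebase.
from dataclasses import dataclass, field

@dataclass
class Premise:
    text: str
    determinate: bool  # True once the condition is fully resolved

@dataclass
class ReasoningMemory:
    premises: list = field(default_factory=list)
    paths: list = field(default_factory=list)  # historical reasoning steps

    def promote(self, premise: Premise, derivation: str) -> None:
        """Record how an indeterminate premise became determinate."""
        premise.determinate = True
        self.paths.append(derivation)

memory = ReasoningMemory(premises=[Premise("x > 0", determinate=True),
                                   Premise("x is even", determinate=False)])
memory.promote(memory.premises[1], "x = 2 follows from the earlier steps")
print([p.text for p in memory.premises if p.determinate])
```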
arXiv Detail & Related papers (2023-10-28T10:05:51Z)
- Concise and Organized Perception Facilitates Reasoning in Large Language Models [32.71672086718057]
We show that large language models (LLMs) exhibit failure patterns akin to human-like cognitive biases when dealing with disordered and irrelevant content in reasoning tasks.
We propose a novel reasoning approach named Concise and Organized Perception (COP)
COP carefully analyzes the given statements to identify the most pertinent information while eliminating redundancy efficiently.
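A toy sketch of that filtering step is given below; a real system would use an LLM or a retriever for relevance judgments, so the crude word-overlap heuristic here is purely our assumption.

```python
# Toy sketch of the "concise and organized" idea: keep only statements
# that share vocabulary with the question and drop duplicates. A real
# system would use an LLM or retriever; this heuristic is our assumption.
def organize_context(statements, question):
    q_terms = set(question.lower().split())
    kept, seen = [], set()
    for s in statements:
        normalized = s.lower().strip()
        if normalized in seen:
            continue  # eliminate redundancy
        if q_terms & set(normalized.split()):  # crude relevance check
            kept.append(s)
            seen.add(normalized)
    return kept

print(organize_context(
    ["Cats are mammals.", "Cats are mammals.", "The sky is blue."],
    "Are cats mammals?",
))  # -> ['Cats are mammals.']
```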
arXiv Detail & Related papers (2023-10-05T04:47:49Z)
- Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models [56.34029644009297]
Large language models (LLMs) have demonstrated the ability to overcome various limitations of formal Knowledge Representation (KR) systems.
LLMs excel most in abductive reasoning, followed by deductive reasoning, while they are least effective at inductive reasoning.
We study single-task training, multi-task training, and "chain-of-thought" knowledge distillation fine-tuning techniques to assess the models' performance.
arXiv Detail & Related papers (2023-10-02T01:00:50Z)
- Exploring Self-supervised Logic-enhanced Training for Large Language Models [59.227222647741094]
In this paper, we make the first attempt to investigate the feasibility of incorporating logical knowledge through self-supervised post-training.
We devise an auto-regressive objective variant of MERIt and integrate it with two LLM series, i.e., FLAN-T5 and LLaMA, with parameter sizes ranging from 3 billion to 13 billion.
The results on two challenging logical reasoning benchmarks demonstrate the effectiveness of LogicLLM.
arXiv Detail & Related papers (2023-05-23T06:13:10Z)