From Complex to Simple: Unraveling the Cognitive Tree for Reasoning with
Small Language Models
- URL: http://arxiv.org/abs/2311.06754v1
- Date: Sun, 12 Nov 2023 06:56:21 GMT
- Title: From Complex to Simple: Unraveling the Cognitive Tree for Reasoning with
Small Language Models
- Authors: Junbing Yan, Chengyu Wang, Taolin Zhang, Xiaofeng He, Jun Huang, Wei
Zhang
- Abstract summary: Based on the dual process theory in cognitive science, we are the first to unravel the cognitive reasoning abilities of language models.
- Score: 25.628569338856934
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reasoning is a distinctive human capacity, enabling us to address complex
problems by breaking them down into a series of manageable cognitive steps.
Yet, complex logical reasoning is still cumbersome for language models. Based
on the dual process theory in cognitive science, we are the first to unravel
the cognitive reasoning abilities of language models. Our framework employs an
iterative methodology to construct a Cognitive Tree (CogTree). The root node of
this tree represents the initial query, while the leaf nodes consist of
straightforward questions that can be answered directly. This construction
involves two main components: the implicit extraction module (referred to as
the intuitive system) and the explicit reasoning module (referred to as the
reflective system). The intuitive system rapidly generates multiple responses
by utilizing in-context examples, while the reflective system scores these
responses using comparative learning. The scores guide the intuitive system in
its subsequent generation step. Our experimental results on two popular and
challenging reasoning tasks indicate that it is possible to achieve a
performance level comparable to that of GPT-3.5 (175B parameters), using a
significantly smaller language model with at most 7B parameters, i.e., fewer
than 5% of the parameters of GPT-3.5.
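Below is a minimal sketch of the CogTree loop described in the abstract. It is illustrative only: the propose, score, and is_simple callables stand in for the intuitive system, the reflective system, and the leaf test, and every name here is an assumption rather than the authors' implementation.

```python
# Minimal, hypothetical sketch of CogTree construction.
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Node:
    question: str
    children: List["Node"] = field(default_factory=list)

def build_cogtree(
    query: str,
    propose: Callable[[str], List[str]],   # intuitive system: fast, in-context candidates
    score: Callable[[str, str], float],    # reflective system: scores each candidate
    is_simple: Callable[[str], bool],      # leaf test: directly answerable?
    beam: int = 2,
    max_depth: int = 4,
) -> Node:
    """Iteratively decompose `query` into a tree whose leaves are simple questions."""
    root = Node(query)
    frontier = [(root, 0)]
    while frontier:
        node, depth = frontier.pop()
        if depth >= max_depth or is_simple(node.question):
            continue  # leaf reached: answer directly
        candidates = propose(node.question)
        # Reflective scores guide which decompositions get expanded next.
        ranked = sorted(candidates, key=lambda q: score(node.question, q), reverse=True)
        for sub in ranked[:beam]:
            child = Node(sub)
            node.children.append(child)
            frontier.append((child, depth + 1))
    return root
```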
Related papers
- Improving Arithmetic Reasoning Ability of Large Language Models through Relation Tuples, Verification and Dynamic Feedback [14.938401898546553]
We propose to use a semi-structured form to represent reasoning steps of large language models.
Specifically, we use relations, which are not only human-readable but also machine-friendly and easier to verify than natural language.
arXiv Detail & Related papers (2024-06-25T18:21:00Z) - Improving Complex Reasoning over Knowledge Graph with Logic-Aware Curriculum Tuning [89.89857766491475]
We propose a complex reasoning schema over knowledge graphs (KGs) built upon large language models (LLMs).
We augment arbitrary first-order logical queries via binary tree decomposition to stimulate the reasoning capability of LLMs (a toy sketch of this style of decomposition appears after this list).
Experiments across widely used datasets demonstrate that LACT achieves substantial improvements (an average gain of +5.5% MRR) over advanced methods.
arXiv Detail & Related papers (2024-05-02T18:12:08Z) - Probabilistic Tree-of-thought Reasoning for Answering
Knowledge-intensive Complex Questions [93.40614719648386]
Large language models (LLMs) are capable of answering knowledge-intensive complex questions with chain-of-thought (CoT) reasoning.
Recent works turn to retrieving external knowledge to augment CoT reasoning.
We propose a novel approach: Probabilistic Tree-of-thought Reasoning (ProbTree).
arXiv Detail & Related papers (2023-11-23T12:52:37Z) - Language Models can be Logical Solvers [99.40649402395725]
We introduce LoGiPT, a novel language model that directly emulates the reasoning processes of logical solvers.
LoGiPT is fine-tuned on a newly constructed instruction-tuning dataset derived from revealing and refining the invisible reasoning process of deductive solvers.
arXiv Detail & Related papers (2023-11-10T16:23:50Z) - STREET: A Multi-Task Structured Reasoning and Explanation Benchmark [56.555662318619135]
We introduce a unified multi-task and multi-domain natural language reasoning and explanation benchmark.
We expect models to not only answer questions, but also produce step-by-step structured explanations describing how premises in the question are used to produce intermediate conclusions that can prove the correctness of a certain answer.
arXiv Detail & Related papers (2023-02-13T22:34:02Z) - Saliency Map Verbalization: Comparing Feature Importance Representations
from Model-free and Instruction-based Methods [6.018950511093273]
Saliency maps can explain a neural model's predictions by identifying important input features.
We formalize the underexplored task of translating saliency maps into natural language.
We compare two novel methods (search-based and instruction-based verbalizations) against conventional feature importance representations; an illustrative verbalization sketch appears after this list.
arXiv Detail & Related papers (2022-10-13T17:48:15Z) - Machine Reading, Fast and Slow: When Do Models "Understand" Language? [59.897515617661874]
We investigate the behavior of reading comprehension models with respect to two linguistic 'skills': coreference resolution and comparison.
We find that for comparison (but not coreference) the systems based on larger encoders are more likely to rely on the 'right' information.
arXiv Detail & Related papers (2022-09-15T16:25:44Z) - A Minimalist Dataset for Systematic Generalization of Perception,
Syntax, and Semantics [131.93113552146195]
We present a new dataset, Handwritten arithmetic with INTegers (HINT), to examine machines' capability of learning generalizable concepts.
In HINT, machines are tasked with learning how concepts are perceived from raw signals such as images.
We undertake extensive experiments with various sequence-to-sequence models, including RNNs, Transformers, and GPT-3.
arXiv Detail & Related papers (2021-03-02T01:32:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.