Reasoning on a Spectrum: Aligning LLMs to System 1 and System 2 Thinking
- URL: http://arxiv.org/abs/2502.12470v1
- Date: Tue, 18 Feb 2025 02:58:37 GMT
- Title: Reasoning on a Spectrum: Aligning LLMs to System 1 and System 2 Thinking
- Authors: Alireza S. Ziabari, Nona Ghazizadeh, Zhivar Sourati, Farzan Karimi-Malekabadi, Payam Piray, Morteza Dehghani,
- Abstract summary: Large Language Models (LLMs) exhibit impressive reasoning abilities, yet their reliance on structured step-by-step reasoning reveals a critical limitation.
This work challenges the assumption that step-by-step reasoning is always optimal and highlights the need for adapting reasoning strategies based on task demands.
- Score: 0.9709444454602557
- License:
- Abstract: Large Language Models (LLMs) exhibit impressive reasoning abilities, yet their reliance on structured step-by-step processing reveals a critical limitation. While human cognition fluidly adapts between intuitive, heuristic (System 1) and analytical, deliberative (System 2) reasoning depending on the context, LLMs lack this dynamic flexibility. This rigidity can lead to brittle and unreliable performance when faced with tasks that deviate from their trained patterns. To address this, we create a dataset of 2,000 samples with valid System 1 and System 2 answers, explicitly align LLMs with these reasoning styles, and evaluate their performance across reasoning benchmarks. Our results reveal an accuracy-efficiency trade-off: System 2-aligned models excel in arithmetic and symbolic reasoning, while System 1-aligned models perform better in commonsense tasks. A mechanistic analysis of model responses shows that System 1 models employ more definitive answers, whereas System 2 models demonstrate greater uncertainty. Interpolating between these extremes produces a monotonic transition in reasoning accuracy, preserving coherence. This work challenges the assumption that step-by-step reasoning is always optimal and highlights the need for adapting reasoning strategies based on task demands.
Related papers
- Teaching LLMs According to Their Aptitude: Adaptive Reasoning for Mathematical Problem Solving [55.895917967408586]
Existing approaches to mathematical reasoning with large language models rely on Chain-of-Thought (CoT) for generalizability or Tool-Integrated Reasoning (TIR) for precise computation.
We propose TATA (Teaching LLMs According to Their Aptitude), an adaptive framework that enables LLMs to personalize their reasoning strategy spontaneously.
arXiv Detail & Related papers (2025-02-17T16:56:23Z) - LLM2: Let Large Language Models Harness System 2 Reasoning [65.89293674479907]
Large language models (LLMs) have exhibited impressive capabilities across a myriad of tasks, yet they occasionally yield undesirable outputs.
We introduce LLM2, a novel framework that combines an LLM with a process-based verifier.
LLMs2 is responsible for generating plausible candidates, while the verifier provides timely process-based feedback to distinguish desirable and undesirable outputs.
arXiv Detail & Related papers (2024-12-29T06:32:36Z) - Make LLMs better zero-shot reasoners: Structure-orientated autonomous reasoning [52.83539473110143]
We introduce a novel structure-oriented analysis method to help Large Language Models (LLMs) better understand a question.
To further improve the reliability in complex question-answering tasks, we propose a multi-agent reasoning system, Structure-oriented Autonomous Reasoning Agents (SARA)
Extensive experiments verify the effectiveness of the proposed reasoning system. Surprisingly, in some cases, the system even surpasses few-shot methods.
arXiv Detail & Related papers (2024-10-18T05:30:33Z) - Deconfounded Causality-aware Parameter-Efficient Fine-Tuning for Problem-Solving Improvement of LLMs [12.48241058167222]
Large Language Models (LLMs) have demonstrated remarkable efficiency in tackling various tasks based on human instructions.
But studies reveal that they often struggle with tasks requiring reasoning, such as math or physics limitation.
This raises questions about whether LLMs truly comprehend embedded knowledge or merely learn to replicate the token distribution without a true understanding of the content.
We propose Decon Causal Adaptation (DCA), a novel parameter-efficient fine-tuning (PEFT) method to enhance the model's reasoning capabilities.
arXiv Detail & Related papers (2024-09-04T13:17:09Z) - MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs [55.20845457594977]
Large language models (LLMs) have shown increasing capability in problem-solving and decision-making.
We present a process-based benchmark MR-Ben that demands a meta-reasoning skill.
Our meta-reasoning paradigm is especially suited for system-2 slow thinking.
arXiv Detail & Related papers (2024-06-20T03:50:23Z) - LLM Reasoners: New Evaluation, Library, and Analysis of Step-by-Step Reasoning with Large Language Models [25.537725151112387]
We introduce AutoRace for fully automated reasoning chain evaluation.
We also develop LLM Reasoners, a library for standardized modular implementation of existing and new reasoning algorithms.
arXiv Detail & Related papers (2024-04-08T06:35:09Z) - On the Self-Verification Limitations of Large Language Models on Reasoning and Planning Tasks [17.329365493094542]
We present a principled empirical study of the performance of GPT-4 in three domains: Game of 24, Graph Coloring, and STRIPS planning.
We observe significant performance collapse with self-critique and significant performance gains with sound external verification.
arXiv Detail & Related papers (2024-02-12T23:11:01Z) - A Closer Look at the Self-Verification Abilities of Large Language Models in Logical Reasoning [73.77088902676306]
We take a closer look at the self-verification abilities of large language models (LLMs) in the context of logical reasoning.
Our main findings suggest that existing LLMs could struggle to identify fallacious reasoning steps accurately and may fall short of guaranteeing the validity of self-verification methods.
arXiv Detail & Related papers (2023-11-14T07:13:10Z) - Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models [61.28463542324576]
Vision-language models (VLMs) have recently demonstrated strong efficacy as visual assistants that can generate human-like outputs.
We evaluate existing state-of-the-art VLMs and find that even the best-performing model is unable to demonstrate strong visual reasoning capabilities and consistency.
We propose a two-stage training framework aimed at improving both the reasoning performance and consistency of VLMs.
arXiv Detail & Related papers (2023-09-08T17:49:44Z) - Faithful Reasoning Using Large Language Models [12.132449274592668]
We show how LMs can be made to perform faithful multi-step reasoning via a process whose causal structure mirrors the underlying logical structure of the problem.
Our approach works by chaining together reasoning steps, where each step results from calls to two fine-tuned LMs.
We demonstrate the effectiveness of our model on multi-step logical deduction and scientific question-answering, showing that it outperforms baselines on final answer accuracy.
arXiv Detail & Related papers (2022-08-30T13:44:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.