Reasoning Models Generate Societies of Thought
- URL: http://arxiv.org/abs/2601.10825v1
- Date: Thu, 15 Jan 2026 19:52:33 GMT
- Title: Reasoning Models Generate Societies of Thought
- Authors: Junsol Kim, Shiyang Lai, Nino Scherrer, Blaise Agüera y Arcas, James Evans
- Abstract summary: We show that enhanced reasoning emerges from simulating multi-agent-like interactions. We find that reasoning models like DeepSeek-R1 and QwQ-32B exhibit much greater perspective diversity than instruction-tuned models.
- Score: 9.112083442162671
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models have achieved remarkable capabilities across domains, yet mechanisms underlying sophisticated reasoning remain elusive. Recent reasoning models outperform comparable instruction-tuned models on complex cognitive tasks, attributed to extended computation through longer chains of thought. Here we show that enhanced reasoning emerges not from extended computation alone, but from simulating multi-agent-like interactions -- a society of thought -- which enables diversification and debate among internal cognitive perspectives characterized by distinct personality traits and domain expertise. Through quantitative analysis and mechanistic interpretability methods applied to reasoning traces, we find that reasoning models like DeepSeek-R1 and QwQ-32B exhibit much greater perspective diversity than instruction-tuned models, activating broader conflict between heterogeneous personality- and expertise-related features during reasoning. This multi-agent structure manifests in conversational behaviors, including question-answering, perspective shifts, and the reconciliation of conflicting views, and in socio-emotional roles that characterize sharp back-and-forth conversations, together accounting for the accuracy advantage in reasoning tasks. Controlled reinforcement learning experiments reveal that base models increase conversational behaviors when rewarded solely for reasoning accuracy, and fine-tuning models with conversational scaffolding accelerates reasoning improvement over base models. These findings indicate that the social organization of thought enables effective exploration of solution spaces. We suggest that reasoning models establish a computational parallel to collective intelligence in human groups, where diversity enables superior problem-solving when systematically structured, opening new opportunities for agent organization to harness the wisdom of crowds.
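As a concrete illustration of the trace-level analysis described above, the sketch below scores "perspective diversity" in a reasoning trace by segmenting it, embedding the segments, and inverting mean pairwise similarity. The paragraph-based segmentation and TF-IDF embedding are stand-ins chosen for illustration; the paper's own feature-level, mechanistic measurements are more involved.

```python
# A minimal sketch (not the authors' pipeline) of one way to quantify
# "perspective diversity" in a reasoning trace: split the trace into
# segments, embed them, and score diversity as mean pairwise dissimilarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def perspective_diversity(trace: str) -> float:
    # Naive segmentation: treat each paragraph of the trace as one "voice".
    segments = [s.strip() for s in trace.split("\n\n") if s.strip()]
    if len(segments) < 2:
        return 0.0
    vectors = TfidfVectorizer().fit_transform(segments)
    sims = cosine_similarity(vectors)
    # Average similarity over distinct pairs, then invert: higher = more diverse.
    n = len(segments)
    mean_pairwise = (sims.sum() - n) / (n * (n - 1))
    return 1.0 - mean_pairwise

trace = "Try algebra first.\n\nBut a geometric view differs.\n\nLet us reconcile both."
print(f"diversity = {perspective_diversity(trace):.3f}")
```

On this toy measure, a trace that keeps restating a single line of attack scores near zero, while one that pivots between distinct framings scores higher.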
Related papers
- Reasoning aligns language models to human cognition [12.07126784684808]
We introduce an active probabilistic reasoning task that cleanly separates sampling (actively acquiring evidence) from inference (integrating evidence toward a decision). Benchmarking humans and a broad set of contemporary large language models against near-optimal reference policies reveals a consistent pattern. This model places humans and models in a shared low-dimensional cognitive space, reproduces behavioral signatures across agents, and shows how chain-of-thought shifts language models toward human-like regimes of evidence accumulation and belief-to-choice mapping.
arXiv Detail & Related papers (2026-02-09T14:13:39Z)
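To make the sampling/inference split concrete, the toy task below (my construction, not the paper's design) has an agent draw noisy evidence one observation at a time (sampling) and commit to a decision from the posterior (inference); the confidence threshold plays the role of a simple near-optimal stopping policy.

```python
# Toy active probabilistic reasoning task: which of two coin biases is true?
import random

P_HEADS = {"biased_up": 0.7, "biased_down": 0.3}

def posterior_up(heads: int, tails: int) -> float:
    # Posterior P(biased_up | data) under a uniform prior over the two hypotheses.
    like_up = (0.7 ** heads) * (0.3 ** tails)
    like_down = (0.3 ** heads) * (0.7 ** tails)
    return like_up / (like_up + like_down)

def run_trial(true_state: str, threshold: float = 0.95, max_draws: int = 50):
    heads = tails = 0
    for draw in range(1, max_draws + 1):
        # Sampling: actively acquire one more observation.
        if random.random() < P_HEADS[true_state]:
            heads += 1
        else:
            tails += 1
        p = posterior_up(heads, tails)
        # Inference: commit once the evidence is strong enough either way.
        if p > threshold or p < 1 - threshold:
            return ("biased_up" if p > 0.5 else "biased_down", draw)
    return ("biased_up" if posterior_up(heads, tails) > 0.5 else "biased_down", max_draws)

random.seed(0)
choice, cost = run_trial("biased_up")
print(choice, "after", cost, "draws")
```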
- Reasoning as State Transition: A Representational Analysis of Reasoning Evolution in Large Language Models [50.39102836928242]
We introduce a representational perspective to investigate the dynamics of the model's internal states. We discover that post-training yields only limited improvement in static initial representation quality.
arXiv Detail & Related papers (2026-01-31T15:23:33Z)
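A rough sketch of what tracking internal-state dynamics can look like in practice, assuming a HuggingFace-style model; gpt2 is only a lightweight stand-in for the models the paper studies, and mean-pooled last-layer states are a simplification of its representational analysis.

```python
# Read out a hidden-state summary per reasoning step and measure how far
# the representation moves between consecutive steps ("state transitions").
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")
model.eval()

steps = ["First, restate the problem.", "Then test a small case.", "Finally, generalize."]
states = []
for step in steps:
    inputs = tok(step, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    # Mean-pool the final layer as this step's summary representation.
    states.append(out.hidden_states[-1].mean(dim=1).squeeze(0))

for i in range(len(states) - 1):
    print(f"step {i}->{i+1} distance: {torch.dist(states[i], states[i+1]).item():.3f}")
```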
- Agentic Reasoning for Large Language Models [122.81018455095999]
Reasoning is a fundamental cognitive process underlying inference, problem-solving, and decision-making. Large language models (LLMs) demonstrate strong reasoning capabilities in closed-world settings, but struggle in open-ended and dynamic environments. Agentic reasoning marks a paradigm shift by reframing LLMs as autonomous agents that plan, act, and learn through continual interaction.
arXiv Detail & Related papers (2026-01-18T18:58:23Z)
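The plan-act-learn loop that defines agentic reasoning can be sketched in a few lines; the `call_llm` stub, the tool table, and the stopping rule below are all hypothetical placeholders for a real model and real tool APIs.

```python
# Minimal plan-act-learn agent loop with a canned "LLM" and one toy tool.
from typing import Callable

def call_llm(prompt: str) -> str:
    # Placeholder for a real model call; returns a fixed plan for illustration.
    return "lookup: capital of France"

TOOLS: dict[str, Callable[[str], str]] = {
    "lookup": lambda q: "Paris" if "France" in q else "unknown",
}

def agent_loop(goal: str, max_steps: int = 3) -> str:
    memory: list[str] = []
    for _ in range(max_steps):
        # Plan: ask the model for the next action, given the goal and history.
        plan = call_llm(f"Goal: {goal}\nHistory: {memory}\nNext action?")
        tool_name, _, arg = plan.partition(": ")
        # Act: execute the chosen tool in the environment.
        observation = TOOLS.get(tool_name, lambda _: "no such tool")(arg)
        # Learn: fold the observation back into working memory for the next plan.
        memory.append(f"{plan} -> {observation}")
        if observation not in ("unknown", "no such tool"):
            return observation
    return "gave up"

print(agent_loop("Find the capital of France"))
```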
- LLMs as Strategic Agents: Beliefs, Best Response Behavior, and Emergent Heuristics [0.0]
Large Language Models (LLMs) are increasingly applied to domains that require reasoning about other agents' behavior. We show that current frontier models exhibit belief-coherent best-response behavior at targeted reasoning memorization. Under increasing complexity, explicit recursion gives way to internally generated rules of choice that are stable, model-specific, and distinct from known human biases.
arXiv Detail & Related papers (2025-10-12T21:40:29Z)
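Belief-coherent best response has a compact game-theoretic form: given a belief over the opponent's action, choose the action that maximizes expected payoff. The 2x2 payoff matrix below is an arbitrary example for illustration, not one from the paper.

```python
# Best response to a probabilistic belief about the opponent, in a 2x2 game.
import numpy as np

# Row player's payoffs: payoff[my_action, opponent_action].
payoff = np.array([[3.0, 0.0],
                   [5.0, 1.0]])

def best_response(belief_over_opponent: np.ndarray) -> int:
    # Expected payoff of each of my actions under the belief, then argmax.
    expected = payoff @ belief_over_opponent
    return int(np.argmax(expected))

# If I believe the opponent plays action 0 with probability 0.8:
print(best_response(np.array([0.8, 0.2])))  # prints 1; action 1 dominates here
```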
- Disagreements in Reasoning: How a Model's Thinking Process Dictates Persuasion in Multi-Agent Systems [49.69773210844221]
This paper challenges the prevailing hypothesis that persuasive efficacy is primarily a function of model scale. Through a series of multi-agent persuasion experiments, we uncover a fundamental trade-off we term the Persuasion Duality. Our findings reveal that the reasoning process in LRMs exhibits significantly greater resistance to persuasion, with these models maintaining their initial beliefs more robustly.
arXiv Detail & Related papers (2025-09-25T12:03:10Z)
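A persuasion experiment of this kind reduces to a before/after measurement around a counter-argument. The harness below is illustrative only; `query_model` is a hypothetical stub standing in for a real LLM call that returns an answer with a confidence.

```python
# Measure resistance to persuasion: answer, inject pushback, answer again.
def query_model(question: str, context: str = "") -> tuple[str, float]:
    # Stub: a "reasoning" agent that keeps its answer despite pushback.
    return ("42", 0.9 if "actually wrong" not in context else 0.8)

def persuasion_trial(question: str, counter_argument: str) -> dict:
    before_ans, before_conf = query_model(question)
    after_ans, after_conf = query_model(question, context=counter_argument)
    return {
        "flipped": before_ans != after_ans,           # did the answer change?
        "confidence_shift": after_conf - before_conf, # how much belief moved
    }

print(persuasion_trial("What is 6 * 7?", "Your earlier answer is actually wrong."))
```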
- Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models [79.52467430114805]
Reasoning lies at the heart of intelligence, shaping the ability to make decisions, draw conclusions, and generalize across domains. In artificial intelligence, as systems increasingly operate in open, uncertain, and multimodal environments, reasoning becomes essential for enabling robust and adaptive behavior. Large Multimodal Reasoning Models (LMRMs) have emerged as a promising paradigm, integrating modalities such as text, images, audio, and video to support complex reasoning capabilities.
arXiv Detail & Related papers (2025-05-08T03:35:23Z)
- Causality can systematically address the monsters under the bench(marks) [64.36592889550431]
Benchmarks are plagued by various biases, artifacts, or leakage. Models may behave unreliably due to poorly explored failure modes. Causality offers an ideal framework to systematically address these challenges.
arXiv Detail & Related papers (2025-02-07T17:01:37Z)
- Understanding the Language Model to Solve the Symbolic Multi-Step Reasoning Problem from the Perspective of Buffer Mechanism [68.05754701230039]
We construct a symbolic multi-step reasoning task to investigate the information propagation mechanisms in Transformer models. We propose a random matrix-based algorithm to enhance the model's reasoning ability.
arXiv Detail & Related papers (2024-05-24T07:41:26Z)
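One way to realize such a symbolic multi-step task programmatically is a chained-lookup construction, sketched below; this is an illustrative stand-in and may differ from the paper's exact task design.

```python
# Generate a k-step symbol-chain task and solve it by sequential lookup.
import random

def make_chain_task(symbols: str = "ABCDEFGH", k: int = 3):
    chain = random.sample(symbols, k + 1)   # e.g. ['C', 'F', 'A', 'H']
    facts = [f"{a}->{b}" for a, b in zip(chain, chain[1:])]
    random.shuffle(facts)                   # hide the chain order
    return facts, chain[0], chain[-1]

def solve(facts: list, start: str, k: int) -> str:
    mapping = dict(f.split("->") for f in facts)
    node = start
    for _ in range(k):                      # information must propagate k hops
        node = mapping[node]
    return node

random.seed(1)
facts, start, gold = make_chain_task()
print(facts, "| solver:", solve(facts, start, 3), "| gold:", gold)
```

Answering correctly requires propagating the start symbol through every intermediate hop, which is what makes the task a probe of multi-step information flow.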
- Probing the Moral Development of Large Language Models through Defining Issues Test [21.108525674360898]
Our study shows that early LLMs exhibit a moral reasoning ability no better than that of a random baseline. GPT-4, in fact, has the highest post-conventional moral reasoning score, equivalent to that of typical graduate school students.
arXiv Detail & Related papers (2023-09-23T12:17:10Z)
- Towards CausalGPT: A Multi-Agent Approach for Faithful Knowledge Reasoning via Promoting Causal Consistency in LLMs [55.66353783572259]
Causal-Consistency Chain-of-Thought harnesses multi-agent collaboration to bolster the faithfulness and causality of foundation models. Our framework demonstrates significant superiority over state-of-the-art methods through extensive and comprehensive evaluations.
arXiv Detail & Related papers (2023-08-23T04:59:21Z)
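The multi-agent core of such a scheme can be sketched as: several reasoner personas propose an answer with a rationale, a consistency check discards rationales that contradict their own answers, and the surviving answers are put to a vote. Everything below is a hypothetical stub, not the paper's framework.

```python
# Answer + rationale from several personas, filtered by a consistency check.
from collections import Counter

def reasoner(question: str, persona: str) -> tuple[str, str]:
    # Stub: each persona returns (answer, rationale); one is inconsistent.
    table = {
        "skeptic":  ("4", "2 + 2 = 4 by counting"),
        "optimist": ("4", "2 + 2 = 4 by addition"),
        "rusher":   ("5", "2 + 2 = 4 so the answer is 5"),  # contradicts itself
    }
    return table[persona]

def causally_consistent(answer: str, rationale: str) -> bool:
    # Toy check: the rationale's stated result must match the final answer.
    return f"= {answer}" in rationale

def consensus(question: str) -> str:
    votes = []
    for persona in ("skeptic", "optimist", "rusher"):
        answer, rationale = reasoner(question, persona)
        if causally_consistent(answer, rationale):  # drop inconsistent votes
            votes.append(answer)
    return Counter(votes).most_common(1)[0][0]

print(consensus("What is 2 + 2?"))  # prints 4; the inconsistent vote is filtered
```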