Predictive Minds: LLMs As Atypical Active Inference Agents
- URL: http://arxiv.org/abs/2311.10215v1
- Date: Thu, 16 Nov 2023 22:11:12 GMT
- Title: Predictive Minds: LLMs As Atypical Active Inference Agents
- Authors: Jan Kulveit, Clem von Stengel and Roman Leventov
- Abstract summary: Large language models (LLMs) like GPT are often conceptualized as passive predictors, simulators, or even parrots.
We conceptualize LLMs by drawing on the theory of active inference originating in cognitive science and neuroscience.
- Score: 0.276240219662896
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) like GPT are often conceptualized as passive
predictors, simulators, or even stochastic parrots. We instead conceptualize
LLMs by drawing on the theory of active inference originating in cognitive
science and neuroscience. We examine similarities and differences between
traditional active inference systems and LLMs, leading to the conclusion that,
currently, LLMs lack a tight feedback loop between acting in the world and
perceiving the impacts of their actions, but otherwise fit in the active
inference paradigm. We list reasons why this loop may soon be closed, and
possible consequences of this including enhanced model self-awareness and the
drive to minimize prediction error by changing the world.
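As a rough illustration of the loop the abstract argues is currently missing, the sketch below (my own, not code from the paper) wires an LLM-style predictor into a closed action-perception cycle: the model's output is treated as an action, the world's response is observed, and the observation is fed back into the context the model must predict next. `llm_predict` and `act_in_world` are hypothetical placeholders, not a real API.

```python
# Minimal sketch of a closed action-perception loop around an LLM.
# `llm_predict` and `act_in_world` are hypothetical callables, not a real API.
from typing import Callable, List, Tuple


def active_inference_loop(
    llm_predict: Callable[[str], str],   # context -> generated action/utterance
    act_in_world: Callable[[str], str],  # action -> observed consequence
    context: str,
    steps: int = 5,
) -> List[Tuple[str, str]]:
    history = []
    for _ in range(steps):
        action = llm_predict(context)       # generating text counts as acting
        observation = act_in_world(action)  # the world pushes back
        # Closing the loop: the consequences of the model's own action become
        # part of the stream it has to predict, so prediction error can be
        # reduced either by predicting better or by changing the world.
        context += f"\n[action] {action}\n[observation] {observation}"
        history.append((action, observation))
    return history
```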
Related papers
- Do LLMs and VLMs Share Neurons for Inference? Evidence and Mechanisms of Cross-Modal Transfer [65.72553715508691]
We show that large vision-language models (LVLMs) lag behind strong text-only large language models (LLMs) on tasks that require multi-step inference and compositional decision-making.
We propose Shared Neuron Low-Rank Fusion (SNRF), a parameter-efficient framework that transfers mature inference circuitry from LLMs to LVLMs.
Our results demonstrate that shared neurons form an interpretable bridge between LLMs and LVLMs, enabling low-cost transfer of inference ability into multimodal models.
arXiv Detail & Related papers (2026-02-22T06:04:05Z)
- Inference-Time Reasoning Selectively Reduces Implicit Social Bias in Large Language Models [0.0]
We study how reasoning-enabled inference affects implicit bias in large language models (LLMs).
We find that enabling reasoning significantly reduces measured implicit bias on an IAT-style evaluation for some model classes across fifteen stereotype topics.
This work highlights how theory from cognitive science and psychology can complement AI evaluation research by providing methodological and interpretive frameworks.
arXiv Detail & Related papers (2026-02-04T16:44:23Z)
- Belief Dynamics Reveal the Dual Nature of In-Context Learning and Activation Steering [22.666436755894328]
Large language models (LLMs) can be controlled at inference time through prompts (in-context learning) and internal activations (activation steering).
This work offers a unified account of prompt-based and activation-based control of LLM behavior, and a methodology for empirically predicting the effects of these interventions.
arXiv Detail & Related papers (2025-11-01T16:46:03Z)
- Reasoning and Behavioral Equilibria in LLM-Nash Games: From Mindsets to Actions [15.764094200832071]
We introduce the LLM-Nash framework, a game-theoretic model where agents select reasoning prompts to guide decision-making via Large Language Models (LLMs).
Unlike classical games that assume utility-maximizing agents with full rationality, this framework captures bounded rationality by modeling the reasoning process explicitly.
arXiv Detail & Related papers (2025-07-10T22:43:00Z)
- ICLShield: Exploring and Mitigating In-Context Learning Backdoor Attacks [61.06621533874629]
In-context learning (ICL) has demonstrated remarkable success in large language models (LLMs).
In this paper, we propose, for the first time, the dual-learning hypothesis, which posits that LLMs simultaneously learn both the task-relevant latent concepts and backdoor latent concepts.
Motivated by these findings, we propose ICLShield, a defense mechanism that dynamically adjusts the concept preference ratio.
arXiv Detail & Related papers (2025-07-02T03:09:20Z)
- Waking Up an AI: A Quantitative Framework for Prompt-Induced Phase Transition in Large Language Models [0.0]
We propose a two-part framework to investigate what underlies intuitive human thinking.
When probed for a form of conceptual fusion, current LLMs showed no significant difference in responsiveness between semantically fused and non-fused prompts.
Our method may help illuminate key differences in how intuition and conceptual leaps emerge in artificial versus human minds.
arXiv Detail & Related papers (2025-04-16T06:49:45Z)
- Hypothesis-Driven Theory-of-Mind Reasoning for Large Language Models [76.6028674686018]
We introduce thought-tracing, an inference-time reasoning algorithm to trace the mental states of agents.
Our algorithm is modeled after the Bayesian theory-of-mind framework.
We evaluate thought-tracing on diverse theory-of-mind benchmarks, demonstrating significant performance improvements.
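For readers unfamiliar with the underlying framework, the snippet below shows a generic Bayesian belief update over candidate mental-state hypotheses; it illustrates the kind of inference a Bayesian theory-of-mind approach performs and is my own illustration, not the paper's thought-tracing algorithm.

```python
# Generic Bayesian belief update over mental-state hypotheses (illustration only).
def update_beliefs(prior: dict, likelihood: dict) -> dict:
    """prior: hypothesis -> P(h); likelihood: hypothesis -> P(observation | h)."""
    unnormalized = {h: prior[h] * likelihood.get(h, 0.0) for h in prior}
    total = sum(unnormalized.values()) or 1.0
    return {h: p / total for h, p in unnormalized.items()}


# Hypothetical example: did the character see where the object was moved?
beliefs = {"saw_the_move": 0.5, "did_not_see": 0.5}
# Observing them search the old location is unlikely if they saw the move.
beliefs = update_beliefs(beliefs, {"saw_the_move": 0.1, "did_not_see": 0.9})
print(beliefs)  # posterior now favors "did_not_see"
```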
arXiv Detail & Related papers (2025-02-17T15:08:50Z)
- IDEA: Enhancing the Rule Learning Ability of Large Language Model Agent through Induction, Deduction, and Abduction [3.961279440272764]
We introduce RULEARN, a novel benchmark designed to assess the rule-learning abilities of large language models in interactive settings.
We propose IDEA, a novel reasoning framework that integrates the process of Induction, Deduction, and Abduction.
Our evaluation of the IDEA framework, which involves five representative LLMs, demonstrates significant improvements over the baseline.
arXiv Detail & Related papers (2024-08-19T23:37:07Z)
- Metacognitive Myopia in Large Language Models [0.0]
Large Language Models (LLMs) exhibit potentially harmful biases that reinforce culturally inherent stereotypes, cloud moral judgments, or amplify positive evaluations of majority groups.
We propose metacognitive myopia as a cognitive-ecological framework that can account for a conglomerate of established and emerging LLM biases.
Our theoretical framework posits that a lack of the two components of metacognition, monitoring and control, causes five symptoms of metacognitive myopia in LLMs.
arXiv Detail & Related papers (2024-08-10T14:43:57Z)
- Perceptions to Beliefs: Exploring Precursory Inferences for Theory of Mind in Large Language Models [51.91448005607405]
We evaluate key human ToM precursors by annotating characters' perceptions on ToMi and FANToM.
We present PercepToM, a novel ToM method leveraging LLMs' strong perception inference capability while supplementing their limited perception-to-belief inference.
arXiv Detail & Related papers (2024-07-08T14:58:29Z)
- Quantifying In-Context Reasoning Effects and Memorization Effects in LLMs [101.51435599249234]
We propose an axiomatic system to define and quantify the precise memorization and in-context reasoning effects used by the large language model (LLM).
Specifically, the axiomatic system enables us to categorize the memorization effects into foundational memorization effects and chaotic memorization effects.
Experiments show that the clear disentanglement of memorization effects and in-context reasoning effects enables a straightforward examination of detailed inference patterns encoded by LLMs.
arXiv Detail & Related papers (2024-05-20T08:51:03Z)
- Bias Amplification in Language Model Evolution: An Iterated Learning Perspective [27.63295869974611]
We draw parallels between the behavior of Large Language Models (LLMs) and the evolution of human culture.
Our approach involves leveraging Iterated Learning (IL), a Bayesian framework that elucidates how subtle biases are magnified during human cultural evolution.
This paper outlines key characteristics of agents' behavior in the Bayesian-IL framework, including predictions that are supported by experimental verification.
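As a toy illustration of the Bayesian iterated-learning dynamic the summary refers to (my own simulation, not the paper's setup), the chain below has each generation infer a Bernoulli parameter from the previous generation's output and sample a new hypothesis from the posterior; over generations behaviour drifts toward the prior, so even a mild prior bias is amplified.

```python
# Toy Bayesian iterated-learning chain (illustration only, not the paper's code).
import random


def iterated_learning(prior_alpha=2.0, prior_beta=1.0, n_items=10, generations=200):
    theta = 0.5                 # the first "teacher" is unbiased
    trajectory = [theta]
    for _ in range(generations):
        # The next learner only sees n_items samples of the current behaviour...
        heads = sum(random.random() < theta for _ in range(n_items))
        # ...and samples its own hypothesis from the Beta posterior.
        theta = random.betavariate(prior_alpha + heads, prior_beta + n_items - heads)
        trajectory.append(theta)
    return trajectory


# With this prior, the chain's long-run behaviour centers on
# alpha / (alpha + beta) = 2/3 rather than the initial 0.5.
```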
arXiv Detail & Related papers (2024-04-04T02:01:25Z)
- Explaining Large Language Models Decisions Using Shapley Values [1.223779595809275]
Large language models (LLMs) have opened up exciting possibilities for simulating human behavior and cognitive processes.
However, the validity of utilizing LLMs as stand-ins for human subjects remains uncertain.
This paper presents a novel approach based on Shapley values to interpret LLM behavior and quantify the relative contribution of each prompt component to the model's output.
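Because Shapley values have a standard definition, a brute-force version is easy to sketch. The snippet below is my own illustration, not the paper's implementation; `score` is a hypothetical callable returning the model's output metric for a prompt assembled from a subset of components, and exhaustive coalition enumeration is only practical for a handful of components.

```python
# Exact Shapley values over prompt components by coalition enumeration.
# `score(components)` is a hypothetical callable, e.g. a model's log-probability
# of some target answer given a prompt built from the chosen components.
from itertools import combinations
from math import factorial
from typing import Callable, Dict, Sequence, Tuple


def shapley_values(parts: Sequence[str],
                   score: Callable[[Tuple[str, ...]], float]) -> Dict[str, float]:
    n = len(parts)
    values = {p: 0.0 for p in parts}
    for p in parts:
        others = [q for q in parts if q != p]
        for k in range(len(others) + 1):
            for coalition in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                # Marginal contribution of component p to this coalition.
                values[p] += weight * (score(coalition + (p,)) - score(coalition))
    return values
```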
arXiv Detail & Related papers (2024-03-29T22:49:43Z)
- What if...?: Thinking Counterfactual Keywords Helps to Mitigate Hallucination in Large Multi-modal Models [50.97705264224828]
We propose Counterfactual Inception, a novel method that implants counterfactual thinking into Large Multi-modal Models.
We aim for the models to engage with and generate responses that span a wider contextual scene understanding.
Comprehensive analyses across various LMMs, including both open-source and proprietary models, corroborate that counterfactual thinking significantly reduces hallucination.
arXiv Detail & Related papers (2024-03-20T11:27:20Z)
- Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension [63.330262740414646]
We study how to characterize and predict the truthfulness of texts generated from large language models (LLMs).
We suggest investigating internal activations and quantifying LLM's truthfulness using the local intrinsic dimension (LID) of model activations.
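As a rough sketch of what a local intrinsic dimension estimate of activations can look like, the function below implements the standard Levina-Bickel maximum-likelihood estimator around a single query activation; the paper may use a different estimator, so treat this purely as an illustration.

```python
# Levina-Bickel MLE estimate of local intrinsic dimension (illustration only).
import numpy as np


def local_intrinsic_dimension(activations: np.ndarray, query: np.ndarray, k: int = 20) -> float:
    """activations: (N, d) array of hidden states; query: (d,) activation vector."""
    dists = np.sort(np.linalg.norm(activations - query, axis=1))
    dists = dists[dists > 0][:k]  # drop the zero distance if `query` is in `activations`
    k_eff = len(dists)
    # Inverse mean log-ratio of the k-th neighbour distance to the closer ones.
    return (k_eff - 1) / np.sum(np.log(dists[-1] / dists[:-1]))
```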
arXiv Detail & Related papers (2024-02-28T04:56:21Z)
- Sparsity-Guided Holistic Explanation for LLMs with Interpretable Inference-Time Intervention [53.896974148579346]
Large Language Models (LLMs) have achieved unprecedented breakthroughs in various natural language processing domains.
The enigmatic "black-box" nature of LLMs remains a significant challenge for interpretability, hampering transparent and accountable applications.
We propose a novel methodology anchored in sparsity-guided techniques, aiming to provide a holistic interpretation of LLMs.
arXiv Detail & Related papers (2023-12-22T19:55:58Z)
- Improving Language Models Meaning Understanding and Consistency by Learning Conceptual Roles from Dictionary [65.268245109828]
Non-human-like behaviour of contemporary pre-trained language models (PLMs) is a leading cause undermining their trustworthiness.
A striking phenomenon is the generation of inconsistent predictions, which produces contradictory results.
We propose a practical approach that alleviates the inconsistent behaviour issue by improving PLM awareness.
arXiv Detail & Related papers (2023-10-24T06:15:15Z)
- An Empirical Study of Catastrophic Forgetting in Large Language Models During Continual Fine-tuning [70.48605869773814]
Catastrophic forgetting (CF) is a phenomenon that occurs in machine learning when a model forgets previously learned information.
This study empirically evaluates the forgetting phenomenon in large language models during continual instruction tuning.
arXiv Detail & Related papers (2023-08-17T02:53:23Z)
- Deception Abilities Emerged in Large Language Models [0.0]
Large language models (LLMs) are currently at the forefront of intertwining artificial intelligence (AI) systems with human communication and everyday life.
This study reveals that deception strategies emerged in state-of-the-art LLMs, such as GPT-4, but were non-existent in earlier LLMs.
We conduct a series of experiments showing that state-of-the-art LLMs are able to understand and induce false beliefs in other agents.
arXiv Detail & Related papers (2023-07-31T09:27:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.