Towards Machine Theory of Mind with Large Language Model-Augmented Inverse Planning
- URL: http://arxiv.org/abs/2507.03682v1
- Date: Fri, 04 Jul 2025 16:01:27 GMT
- Title: Towards Machine Theory of Mind with Large Language Model-Augmented Inverse Planning
- Authors: Rebekah A. Gelpí, Eric Xue, William A. Cunningham
- Abstract summary: We propose a hybrid approach to machine Theory of Mind (ToM) that uses large language models (LLMs) as a mechanism for generating hypotheses and likelihood functions. We also demonstrate the model's potential to predict mental states in open-ended tasks.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We propose a hybrid approach to machine Theory of Mind (ToM) that uses large language models (LLMs) to generate hypotheses and likelihood functions for a Bayesian inverse planning model, which computes posterior probabilities over an agent's likely mental states given its actions. Bayesian inverse planning models can accurately predict human reasoning on a variety of ToM tasks, but they struggle to scale these predictions to scenarios with a large number of possible hypotheses and actions. Conversely, LLM-based approaches have recently shown promise on ToM benchmarks, but they can be brittle, failing on reasoning tasks even when they pass structurally identical versions. By combining the two methods, our approach leverages the strengths of each component: it closely matches optimal results on a task inspired by prior inverse planning models and outperforms models that use LLMs alone or with chain-of-thought prompting, even with smaller LLMs that typically perform poorly on ToM tasks. We also demonstrate the model's potential to predict mental states in open-ended tasks, offering a promising direction for the future development of ToM models and the creation of socially intelligent generative agents.
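As a concrete illustration of the hybrid recipe the abstract describes, the sketch below places an LLM-style likelihood query inside a Bayesian inverse planning loop: candidate mental states are scored against each observed action and renormalized into a posterior. The goals, priors, and toy likelihood table are illustrative assumptions, not the paper's actual prompts or tasks.

```python
def infer_mental_states(goals, priors, observed_actions, likelihood_fn):
    """Bayesian inverse planning: P(goal | actions) ~ P(goal) * prod_t P(action_t | goal).

    `likelihood_fn(action, goal) -> float` is the slot where an LLM would be
    queried to score how probable an action is under a hypothesized goal.
    """
    posterior = dict(priors)
    for action in observed_actions:
        for g in goals:
            posterior[g] *= likelihood_fn(action, g)  # evidence update
        z = sum(posterior.values())
        posterior = {g: p / z for g, p in posterior.items()}  # renormalize
    return posterior


# Toy stand-in for the LLM likelihood query: an agent that wants coffee is
# more likely to walk toward the cafe than toward the library.
toy_likelihoods = {("walk north", "cafe"): 0.8, ("walk north", "library"): 0.2}
posterior = infer_mental_states(
    goals=["cafe", "library"],
    priors={"cafe": 0.5, "library": 0.5},
    observed_actions=["walk north"],
    likelihood_fn=lambda a, g: toy_likelihoods[(a, g)],
)
print(posterior)  # {'cafe': 0.8, 'library': 0.2}
```

In the full method, the hypothesis set itself would also come from an LLM query rather than being fixed in advance, which is what lets the approach scale beyond hand-enumerated hypothesis spaces.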
Related papers
- SimuRA: Towards General Goal-Oriented Agent via Simulative Reasoning Architecture with LLM-Based World Model [88.04128601981145]
We introduce SimuRA, a goal-oriented architecture for generalized agentic reasoning. SimuRA overcomes the limitations of autoregressive reasoning by introducing a world model for planning via simulation. World-model-based planning, in particular, shows a consistent advantage of up to 124% over autoregressive planning.
arXiv Detail & Related papers (2025-07-31T17:57:20Z)
- Learning What Matters: Probabilistic Task Selection via Mutual Information for Model Finetuning [20.93518809718398]
We introduce TASKPGM, a principled and scalable framework for mixture optimization. TASKPGM selects continuous task proportions by minimizing an energy function over a Markov Random Field (MRF). Our method yields a closed-form solution under simplex constraints and provably balances representativeness and diversity among tasks (a generic version of this closed-form step is sketched after this entry).
arXiv Detail & Related papers (2025-07-16T20:14:55Z)
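The "closed-form solution under simplex constraints" in the TASKPGM summary can be illustrated with a generic quadratic energy: with only the sum-to-one equality constraint active, a single Lagrange multiplier gives the minimizer directly. The energy below is a stand-in for illustration, not TASKPGM's actual MRF objective, and the nonnegativity part of the simplex is ignored for simplicity.

```python
import numpy as np

def min_quadratic_energy_on_simplex(A, b):
    """Minimize E(w) = 0.5 * w^T A w - b^T w subject to sum(w) = 1.

    Solving the KKT system with one equality constraint gives the
    closed form below; A encodes pairwise redundancy (diversity) and
    b encodes per-task utility (representativeness).
    """
    ones = np.ones(A.shape[0])
    A_inv_b = np.linalg.solve(A, b)
    A_inv_1 = np.linalg.solve(A, ones)
    nu = (1.0 - ones @ A_inv_b) / (ones @ A_inv_1)  # Lagrange multiplier
    return A_inv_b + nu * A_inv_1

# Example: two tasks with a mild redundancy penalty between them.
A = np.array([[2.0, 0.5], [0.5, 2.0]])  # diversity term
b = np.array([1.0, 0.6])                # representativeness term
w = min_quadratic_energy_on_simplex(A, b)
print(w, w.sum())  # proportions summing to 1
```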
- Overcoming Multi-step Complexity in Multimodal Theory-of-Mind Reasoning: A Scalable Bayesian Planner [32.33827730707331]
We propose a scalable Bayesian ToM planner that decomposes ToM reasoning into stepwise Bayesian updates. Our framework introduces weak-to-strong control, allowing smaller language models to specialize in ToM-specific likelihood estimation. Our method achieves a 4.6% accuracy improvement over state-of-the-art techniques on multimodal ToM benchmarks.
arXiv Detail & Related papers (2025-06-02T04:23:45Z)
- AutoToM: Scaling Model-based Mental Inference via Automated Agent Modeling [8.034600950988535]
AutoToM is an automated agent modeling method for scalable, robust, and interpretable mental inference. We show that AutoToM can produce human-like confidence estimates and enable online mental inference for embodied decision-making.
arXiv Detail & Related papers (2025-02-21T18:57:52Z)
- BRiTE: Bootstrapping Reinforced Thinking Process to Enhance Language Model Reasoning [78.63421517563056]
Large Language Models (LLMs) have demonstrated remarkable capabilities in complex reasoning tasks. We present a unified probabilistic framework that formalizes LLM reasoning through a novel graphical model. We introduce the Bootstrapping Reinforced Thinking Process (BRiTE) algorithm, which works in two steps.
arXiv Detail & Related papers (2025-01-31T02:39:07Z)
- On the Modeling Capabilities of Large Language Models for Sequential Decision Making [52.128546842746246]
Large pretrained models show increasingly strong performance in reasoning and planning tasks.
We evaluate their ability to produce decision-making policies, either directly, by generating actions, or indirectly, via reward modeling.
In environments with unfamiliar dynamics, we explore how fine-tuning LLMs with synthetic data can significantly improve their reward modeling capabilities.
arXiv Detail & Related papers (2024-10-08T03:12:57Z)
- Deliberate Reasoning in Language Models as Structure-Aware Planning with an Accurate World Model [14.480267340831542]
We propose Structure-aware Planning with an Accurate World Model (SWAP). SWAP integrates structured knowledge representation with learned planning. We evaluate SWAP across diverse reasoning-intensive benchmarks, including math reasoning, logical reasoning, and coding tasks.
arXiv Detail & Related papers (2024-10-04T04:23:36Z)
- Investigating the Impact of Model Complexity in Large Language Models [3.7919508292745676]
Large Language Models (LLMs) based on the pre-training and fine-tuning paradigm have become pivotal in solving natural language processing tasks.
In this paper, we focus on autoregressive LLMs and propose to employ Hidden Markov Models (HMMs) to model them (a minimal HMM scoring sketch follows this entry).
arXiv Detail & Related papers (2024-10-01T13:53:44Z)
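The summary above does not say how the HMM is fit to an LLM, but the core computation in any HMM-based analysis of token sequences is scoring them with the forward algorithm; a minimal scaled implementation is sketched below. The toy parameters are illustrative assumptions.

```python
import numpy as np

def hmm_forward_loglik(obs, pi, A, B):
    """Log-likelihood of a token sequence under an HMM (scaled forward algorithm).

    obs: sequence of token ids; pi: (S,) initial state distribution;
    A: (S, S) state transition matrix; B: (S, V) emission matrix.
    """
    alpha = pi * B[:, obs[0]]      # joint prob of state and first token
    loglik = np.log(alpha.sum())
    alpha /= alpha.sum()           # rescale to avoid underflow
    for tok in obs[1:]:
        alpha = (alpha @ A) * B[:, tok]
        c = alpha.sum()
        loglik += np.log(c)
        alpha /= c
    return loglik

# Tiny example: 2 hidden states, 3-token vocabulary.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.2, 0.8]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
print(hmm_forward_loglik([0, 2, 1], pi, A, B))
```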
- MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation [80.47072100963017]
We introduce a novel, low-compute algorithm, Model Merging with Amortized Pareto Front (MAP). MAP efficiently identifies a set of scaling coefficients for merging multiple models, reflecting the trade-offs involved (a minimal coefficient-application sketch follows this entry). We also introduce Bayesian MAP for scenarios with a relatively low number of tasks and Nested MAP for situations with a high number of tasks, further reducing the computational cost of evaluation.
arXiv Detail & Related papers (2024-06-11T17:55:25Z)
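For orientation, scaling coefficients like those MAP optimizes are typically applied in task-arithmetic fashion, adding scaled parameter differences to a shared base model, as in the sketch below. Choosing the coefficients via an amortized Pareto front and quadratic approximation is the paper's contribution and is not implemented here; the dict-of-arrays model format is an assumption.

```python
import numpy as np

def merge_models(base, finetuned, coeffs):
    """Task-arithmetic-style merge: theta = base + sum_i c_i * (theta_i - base).

    base / finetuned[i]: dicts mapping parameter names to arrays;
    coeffs: one scaling coefficient per fine-tuned model.
    """
    merged = {name: w.copy() for name, w in base.items()}
    for c, model in zip(coeffs, finetuned):
        for name in merged:
            merged[name] += c * (model[name] - base[name])
    return merged

# Toy example: two single-layer "models" fine-tuned from the same base.
base = {"w": np.zeros(3)}
ft = [{"w": np.array([1.0, 0.0, 0.0])}, {"w": np.array([0.0, 2.0, 0.0])}]
print(merge_models(base, ft, coeffs=[0.5, 0.25]))  # {'w': array([0.5, 0.5, 0. ])}
```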
- Limits of Theory of Mind Modelling in Dialogue-Based Collaborative Plan Acquisition [8.919069368217594]
Theory of Mind (ToM) modelling can improve missing knowledge prediction in settings with asymmetric skill-sets and knowledge.
We show that while predicting one's own missing knowledge nearly doubles performance on collaborative plan acquisition (CPA), the improvements due to ToM modelling diminish.
arXiv Detail & Related papers (2024-05-21T09:23:39Z)
- Predictable MDP Abstraction for Unsupervised Model-Based RL [93.91375268580806]
We propose Predictable MDP Abstraction (PMA).
Instead of training a predictive model on the original MDP, we train a model on a transformed MDP with a learned action space.
We theoretically analyze PMA and empirically demonstrate that PMA leads to significant improvements over prior unsupervised model-based RL approaches.
arXiv Detail & Related papers (2023-02-08T07:37:51Z)
- Forethought and Hindsight in Credit Assignment [62.05690959741223]
We seek to understand the gains and peculiarities of planning employed as forethought, via forward models, or as hindsight, operating with backward models.
We investigate the best use of models in planning, primarily focusing on the selection of states in which predictions should be (re)-evaluated.
arXiv Detail & Related papers (2020-10-26T16:00:47Z)