Modeling Boundedly Rational Agents with Latent Inference Budgets
- URL: http://arxiv.org/abs/2312.04030v1
- Date: Thu, 7 Dec 2023 03:55:51 GMT
- Title: Modeling Boundedly Rational Agents with Latent Inference Budgets
- Authors: Athul Paul Jacob, Abhishek Gupta, Jacob Andreas
- Abstract summary: We introduce a latent inference budget model (L-IBM) that models agents' computational constraints explicitly.
L-IBMs make it possible to learn agent models using data from diverse populations of suboptimal actors.
We show that L-IBMs match or outperform Boltzmann models of decision-making under uncertainty.
- Score: 56.24971011281947
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the problem of modeling a population of agents pursuing unknown
goals subject to unknown computational constraints. In standard models of
bounded rationality, sub-optimal decision-making is simulated by adding
homoscedastic noise to optimal decisions rather than explicitly simulating
constrained inference. In this work, we introduce a latent inference budget
model (L-IBM) that models agents' computational constraints explicitly, via a
latent variable (inferred jointly with a model of agents' goals) that controls
the runtime of an iterative inference algorithm. L-IBMs make it possible to
learn agent models using data from diverse populations of suboptimal actors. In
three modeling tasks -- inferring navigation goals from routes, inferring
communicative intents from human utterances, and predicting next moves in human
chess games -- we show that L-IBMs match or outperform Boltzmann models of
decision-making under uncertainty. Inferred inference budgets are themselves
meaningful, efficient to compute, and correlated with measures of player skill,
partner skill and task difficulty.
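The mechanism is simple enough to sketch. Below is a minimal, hypothetical numpy illustration (not the authors' code): the agent's policy is produced by running value iteration for only `budget` steps, and the latent budget is then recovered from an observed trajectory by Bayesian inversion. Goal inference, which the paper performs jointly with budget inference, is omitted for brevity.

```python
import numpy as np

def truncated_q(R, P, gamma, budget):
    """Run `budget` rounds of value iteration on rewards R (S, A) and
    transitions P (S, A, S); small budgets give systematically
    suboptimal Q estimates rather than merely noisy ones."""
    Q = np.zeros_like(R)
    for _ in range(budget):
        Q = R + gamma * P @ Q.max(axis=1)
    return Q

def policy_given_budget(R, P, gamma, budget):
    """Greedy policy (ties broken uniformly) from budget-limited Q values."""
    Q = truncated_q(R, P, gamma, budget)
    greedy = (Q == Q.max(axis=1, keepdims=True)).astype(float)
    return greedy / greedy.sum(axis=1, keepdims=True)

def budget_posterior(trajectory, R, P, gamma, max_budget):
    """p(budget | trajectory) is proportional to
    p(budget) * prod_t pi_budget(a_t | s_t), with a uniform prior."""
    budgets = np.arange(1, max_budget + 1)
    log_post = np.zeros(len(budgets))
    for i, b in enumerate(budgets):
        pi = policy_given_budget(R, P, gamma, b)
        for s, a in trajectory:
            log_post[i] += np.log(pi[s, a] + 1e-12)
    post = np.exp(log_post - log_post.max())
    return budgets, post / post.sum()
```

Note the contrast with a Boltzmann model, which softens the optimal policy with a temperature parameter: here suboptimality comes from truncated computation, which is why the inferred budgets can track skill and task difficulty as the abstract reports.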
Related papers
- Learning-to-Defer for Extractive Question Answering [3.6787328174619254]
We introduce an adapted two-stage Learning-to-Defer mechanism for question answering that enables selective deference to human experts or larger models without retraining the underlying language models.
Our results demonstrate that deferring a minimal number of queries allows the smaller model to achieve performance comparable to its larger counterparts while preserving computational efficiency.
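As a rough illustration of the selective-deference idea (a fixed confidence threshold standing in for the paper's learned two-stage deferral rule; all names below are hypothetical):

```python
from typing import Callable

def answer_with_deferral(question: str,
                         context: str,
                         small_model: Callable[[str, str], tuple],
                         expert: Callable[[str, str], str],
                         threshold: float = 0.8) -> tuple:
    """Return (answer, deferred). `small_model` yields (span, confidence);
    low-confidence queries are routed to the stronger `expert`."""
    span, confidence = small_model(question, context)
    if confidence >= threshold:
        return span, False                   # keep the cheap prediction
    return expert(question, context), True   # defer the hard query
```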
arXiv Detail & Related papers (2024-10-21T08:21:00Z)
- Revisiting SMoE Language Models by Evaluating Inefficiencies with Task Specific Expert Pruning [78.72226641279863]
Sparse Mixture of Experts (SMoE) models have emerged as a scalable alternative to dense models in language modeling.
Our research explores task-specific model pruning to inform decisions about designing SMoE architectures.
We introduce an adaptive task-aware pruning technique UNCURL to reduce the number of experts per MoE layer in an offline manner post-training.
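A generic usage-frequency heuristic conveys the flavor of offline, task-aware expert pruning; this is an illustrative stand-in, not the UNCURL algorithm itself:

```python
import torch

def prune_experts(router_logits: torch.Tensor, keep: int) -> torch.Tensor:
    """router_logits: (num_tokens, num_experts), collected offline on
    task-specific calibration data. Returns indices of experts to retain."""
    top1 = router_logits.argmax(dim=-1)        # expert chosen per token
    usage = torch.bincount(top1, minlength=router_logits.shape[-1])
    return usage.float().topk(keep).indices    # most-used experts survive
```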
arXiv Detail & Related papers (2024-09-02T22:35:03Z)
- Kullback-Leibler Barycentre of Stochastic Processes [0.0]
We consider the problem where an agent aims to combine the views and insights of different experts' models.
We show existence and uniqueness of the barycentre model and prove an explicit representation of the Radon-Nikodym derivative.
Two deep learning algorithms are proposed to find the optimal drift of the combined model.
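For intuition, here is the static-measure analogue of the objective (the paper establishes the stochastic-process version, where the explicit representation is a Radon-Nikodym derivative). Assuming the convention of minimizing over the first KL argument, the barycentre density is a normalized weighted geometric mean of the experts' densities:

```latex
\mathbb{Q}^{*} \;=\; \operatorname*{arg\,min}_{\mathbb{Q}}
  \sum_{i=1}^{n} w_i \, D_{\mathrm{KL}}\!\left(\mathbb{Q} \,\middle\|\, \mathbb{P}_i\right),
\qquad
\frac{d\mathbb{Q}^{*}}{d\mu} \;\propto\; \prod_{i=1}^{n}
  \left(\frac{d\mathbb{P}_i}{d\mu}\right)^{\!w_i},
\quad w_i \ge 0,\ \sum_{i} w_i = 1 .
```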
arXiv Detail & Related papers (2024-07-05T20:45:27Z)
- MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation [80.47072100963017]
We introduce a novel and low-compute algorithm, Model Merging with Amortized Pareto Front (MAP).
MAP efficiently identifies a set of scaling coefficients for merging multiple models, reflecting the trade-offs involved.
We also introduce Bayesian MAP for scenarios with a relatively low number of tasks and Nested MAP for situations with a high number of tasks, further reducing the computational cost of evaluation.
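The merging step itself reduces to scaling task vectors; a minimal sketch, assuming task vectors are fine-tuned weights minus base weights, with MAP's quadratic surrogate for choosing the coefficients omitted:

```python
import torch

def merge(base: dict, task_vectors: list, coeffs: list) -> dict:
    """theta_merged = theta_base + sum_i c_i * (theta_i - theta_base).
    MAP searches over `coeffs` via an amortized quadratic approximation
    of the per-task metrics instead of re-evaluating every candidate."""
    merged = {k: v.clone() for k, v in base.items()}
    for c, tv in zip(coeffs, task_vectors):
        for k in merged:
            merged[k] += c * tv[k]
    return merged
```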
arXiv Detail & Related papers (2024-06-11T17:55:25Z)
- LoRA-Ensemble: Efficient Uncertainty Modelling for Self-attention Networks [52.46420522934253]
We introduce LoRA-Ensemble, a parameter-efficient deep ensemble method for self-attention networks.
By employing a single pre-trained self-attention network with weights shared across all members, we train member-specific low-rank matrices for the attention projections.
Our method exhibits superior calibration compared to explicit ensembles and achieves similar or better accuracy across various prediction tasks and datasets.
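A hedged sketch of the core layer (dimensions and initialization are illustrative): one frozen, shared projection plus member-specific low-rank updates, with member outputs averaged for calibrated uncertainty:

```python
import torch
import torch.nn as nn

class LoRAEnsembleLinear(nn.Module):
    """Shared pre-trained projection W plus per-member low-rank update B_m A_m."""
    def __init__(self, d_in: int, d_out: int, members: int, rank: int = 4):
        super().__init__()
        self.shared = nn.Linear(d_in, d_out, bias=False)  # frozen backbone weight
        self.shared.weight.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(members, rank, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(members, d_out, rank))  # zero init: start at W

    def forward(self, x: torch.Tensor, m: int) -> torch.Tensor:
        # Member m computes (W + B_m A_m) x; averaging member outputs
        # (or softmax probabilities) gives the ensemble prediction.
        return self.shared(x) + x @ (self.B[m] @ self.A[m]).T
```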
arXiv Detail & Related papers (2024-05-23T11:10:32Z)
- Deciding What to Model: Value-Equivalent Sampling for Reinforcement Learning [21.931580762349096]
We introduce an algorithm that computes an approximately-value-equivalent, lossy compression of the environment which an agent may feasibly target in lieu of the true model.
We prove an information-theoretic, Bayesian regret bound for our algorithm that holds for any finite-horizon, episodic sequential decision-making problem.
arXiv Detail & Related papers (2022-06-04T23:36:38Z)
- Reinforcement Learning with a Terminator [80.34572413850186]
We learn the parameters of the TerMDP (termination Markov decision process) and leverage the structure of the estimation problem to provide state-wise confidence bounds.
We use these to construct a provably-efficient algorithm, which accounts for termination, and bound its regret.
arXiv Detail & Related papers (2022-05-30T18:40:28Z)
- Sample Complexity of Robust Reinforcement Learning with a Generative Model [0.0]
We propose a model-based reinforcement learning (RL) algorithm for learning an $\epsilon$-optimal robust policy.
We consider three different forms of uncertainty sets, characterized by the total variation distance, chi-square divergence, and KL divergence.
In addition to the sample complexity results, we also present a formal analytical argument on the benefit of using robust policies.
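For the total-variation case, the inner worst-case expectation inside a robust Bellman backup has a simple greedy solution; a hedged numpy sketch (the chi-square and KL sets require their dual formulations instead):

```python
import numpy as np

def worst_case_tv(p: np.ndarray, V: np.ndarray, delta: float) -> float:
    """min_q E_q[V] over distributions q with TV(p, q) <= delta:
    the adversary moves up to `delta` probability mass from the
    highest-value next states onto the lowest-value one."""
    q, budget = p.copy(), delta
    worst = int(np.argmin(V))
    for s in np.argsort(V)[::-1]:          # highest value first
        if s == worst or budget <= 0:
            continue
        move = min(q[s], budget)
        q[s] -= move
        q[worst] += move
        budget -= move
    return float(q @ V)
```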
arXiv Detail & Related papers (2021-12-02T18:55:51Z)
- Modeling the Second Player in Distributionally Robust Optimization [90.25995710696425]
We argue for the use of neural generative models to characterize the worst-case distribution.
This approach poses a number of implementation and optimization challenges.
We find that the proposed approach yields models that are more robust than comparable baselines.
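Schematically, the approach alternates between a model minimizing loss and a generative adversary pushing toward harder distributions; everything below (architecture, penalty, names) is a placeholder sketch rather than the paper's construction:

```python
import torch
import torch.nn as nn

def dro_step(model, generator, x, y, opt_model, opt_gen, lam=1.0):
    """One alternating update: the generator perturbs inputs to raise the
    model's loss (penalized for drifting from the data), then the model
    trains against the resulting worst-case samples."""
    loss_fn = nn.CrossEntropyLoss()
    adv_x = generator(x)
    gen_loss = -loss_fn(model(adv_x), y) + lam * (adv_x - x).pow(2).mean()
    opt_gen.zero_grad(); gen_loss.backward(); opt_gen.step()

    model_loss = loss_fn(model(generator(x).detach()), y)
    opt_model.zero_grad(); model_loss.backward(); opt_model.step()
```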
arXiv Detail & Related papers (2021-03-18T14:26:26Z)
- MASSIVE: Tractable and Robust Bayesian Learning of Many-Dimensional Instrumental Variable Models [8.271859911016719]
We propose a general and efficient causal inference algorithm that accounts for model uncertainty.
We show that, as long as some of the candidates are (close to) valid, without knowing a priori which ones, they collectively still pose enough restrictions on the target interaction to obtain a reliable causal effect estimate.
arXiv Detail & Related papers (2020-12-18T10:06:55Z)