Control when confidence is costly
- URL: http://arxiv.org/abs/2406.14427v2
- Date: Tue, 29 Oct 2024 18:52:56 GMT
- Title: Control when confidence is costly
- Authors: Itzel Olivos-Castillo, Paul Schrater, Xaq Pitkow
- Abstract summary: We develop a version of control that accounts for computational costs of inference.
We study Linear Quadratic Gaussian (LQG) control with an added internal cost on the relative precision of the posterior probability over the world state.
We discover that the rational strategy that solves the joint inference and control problem goes through phase transitions depending on the task demands.
- Score: 4.683612295430957
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We develop a version of stochastic control that accounts for computational costs of inference. Past studies identified efficient coding without control, or efficient control that neglects the cost of synthesizing information. Here we combine these concepts into a framework where agents rationally approximate inference for efficient control. Specifically, we study Linear Quadratic Gaussian (LQG) control with an added internal cost on the relative precision of the posterior probability over the world state. This creates a trade-off: an agent can obtain more utility overall by sacrificing some task performance, if doing so saves enough bits during inference. We discover that the rational strategy that solves the joint inference and control problem goes through phase transitions depending on the task demands, switching from a costly but optimal inference to a family of suboptimal inferences related by rotation transformations, each of which misestimates the stability of the world. In all cases, the agent moves more to think less. This work provides a foundation for a new type of rational computation that could be used by both brains and machines for efficient but computationally constrained control.
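For readers unfamiliar with the LQG baseline the paper extends, the following is a minimal scalar sketch of standard LQG control: a Kalman filter performs the inference whose precision the paper prices, and an LQR gain acts on the posterior mean. The internal precision cost introduced by the paper is not implemented here, only noted in a comment; all constants are illustrative assumptions.

```python
import numpy as np

# Scalar LQG sketch: dynamics x' = a*x + b*u + w, noisy observation y = x + v.
# The paper adds an internal cost on posterior precision; this is only the
# classical baseline (Kalman inference + LQR control) that such a cost would modify.
a, b = 0.9, 1.0          # dynamics
q, r = 1.0, 0.1          # state and control costs
w_var, v_var = 0.1, 0.5  # process and observation noise variances

# Solve the scalar discrete-time Riccati equation for the LQR gain by iteration.
p = q
for _ in range(1000):
    p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
gain = a * b * p / (r + b * b * p)   # control law: u = -gain * posterior mean

rng = np.random.default_rng(0)
x, mu, sigma = 1.0, 0.0, 1.0         # true state, posterior mean, posterior variance
for t in range(50):
    u = -gain * mu                   # certainty-equivalent control
    x = a * x + b * u + rng.normal(0.0, np.sqrt(w_var))
    y = x + rng.normal(0.0, np.sqrt(v_var))
    # Kalman predict + update (the inference step whose precision the paper prices)
    mu_pred = a * mu + b * u
    sigma_pred = a * a * sigma + w_var
    k = sigma_pred / (sigma_pred + v_var)
    mu = mu_pred + k * (y - mu_pred)
    sigma = (1.0 - k) * sigma_pred

print(round(sigma, 4))  # posterior variance settles near its steady state
```

In the paper's framework, an extra cost on the posterior's relative precision would make maintaining a small `sigma` expensive, shifting the rational strategy away from this exact filter.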
Related papers
- Decoding Rewards in Competitive Games: Inverse Game Theory with Entropy Regularization [52.74762030521324]
We propose a novel algorithm to learn reward functions from observed actions.
We provide strong theoretical guarantees for the reliability and sample efficiency of our algorithm.
arXiv Detail & Related papers (2026-01-19T04:12:51Z)
- AI Progress Should Be Measured by Capability-Per-Resource, Not Scale Alone: A Framework for Gradient-Guided Resource Allocation in LLMs [7.850805629833066]
We argue that AI development should be fundamentally reoriented toward capability-per-resource rather than capability alone.
We present a theoretical framework demonstrating that resource-allocation decisions guided by gradient influence patterns can dramatically improve efficiency throughout the AI lifecycle.
arXiv Detail & Related papers (2025-11-02T20:59:51Z)
- MORSE: Multi-Objective Reinforcement Learning via Strategy Evolution for Supply Chain Optimization [0.0]
In supply chain management, decision-making often involves balancing multiple objectives, such as cost reduction, service level improvement, and environmental sustainability.
Traditional multi-objective optimization methods, such as linear programming and evolutionary algorithms, struggle to adapt in real time to the dynamic nature of supply chains.
We propose an approach that combines Reinforcement Learning (RL) and Multi-Objective Evolutionary Algorithms (MOEAs) to address these challenges for dynamic multi-objective optimization under uncertainty.
arXiv Detail & Related papers (2025-09-08T09:51:24Z)
- Cost-aware Stopping for Bayesian Optimization [53.34052774820105]
We propose a cost-aware stopping rule for Bayesian optimization that adapts to varying evaluation costs and is free of tuning.
We prove a theoretical guarantee bounding the expected cumulative evaluation cost incurred by our stopping rule when paired with state-of-the-art acquisition functions.
arXiv Detail & Related papers (2025-07-16T17:54:14Z)
- Towards Principled Unsupervised Multi-Agent Reinforcement Learning [49.533774397707056]
We present a scalable, decentralized, trust-region policy search algorithm to address the problem in practical settings.
We show that optimizing for a specific objective, namely mixture entropy, provides an excellent trade-off between tractability and performance.
arXiv Detail & Related papers (2025-02-12T12:51:36Z)
- Communication Efficient Decentralization for Smoothed Online Convex Optimization [9.449153668916098]
We study the multi-agent Smoothed Online Convex Optimization (SOCO) problem, where $N$ agents interact through a communication graph.
In each round, each agent $i$ receives a strongly convex hitting cost function $f_t^i$ in an online fashion.
Our results hold even when the communication graph changes arbitrarily and adaptively over time.
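The per-round structure of SOCO can be illustrated with a minimal single-agent sketch: each round pays a hitting cost plus a movement (switching) cost for changing the decision. The quadratic costs, target values, and the greedy decision rule below are illustrative assumptions, not the paper's multi-agent algorithm.

```python
import numpy as np

# Smoothed online convex optimization sketch (single agent, hypothetical costs):
# each round pays a hitting cost f_t(x_t) plus a movement cost |x_t - x_{t-1}|.
# A simple baseline rule minimizes the round's combined cost greedily.
targets = [0.0, 4.0, 4.0, 1.0, 1.0]    # minimizers of the quadratic hitting costs

def hitting(x, theta):
    return 0.5 * (x - theta) ** 2       # strongly convex hitting cost

x_prev, total = 0.0, 0.0
for theta in targets:
    # Greedy step: argmin_x 0.5*(x - theta)^2 + |x - x_prev| has a closed form
    # (soft-thresholding of the gap toward the target).
    gap = theta - x_prev
    x = x_prev + np.sign(gap) * max(abs(gap) - 1.0, 0.0)
    total += hitting(x, theta) + abs(x - x_prev)
    x_prev = x

print(round(total, 2))  # → 6.0
```

The soft-threshold shows the defining tension of SOCO: the movement cost makes the learner stop short of each round's minimizer, trading hitting cost against switching cost.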
arXiv Detail & Related papers (2024-11-13T05:59:04Z)
- Provably Efficient Exploration in Inverse Constrained Reinforcement Learning [12.178081346315523]
Inverse Constrained Reinforcement Learning is a common solver for recovering feasible constraints in complex environments.
We propose a strategic exploration framework for sampling with guaranteed efficiency to bridge this gap.
We introduce two exploratory algorithms to achieve efficient constraint inference.
arXiv Detail & Related papers (2024-09-24T10:48:13Z)
- Bridging Domains with Approximately Shared Features [26.096779584142986]
Multi-source domain adaptation aims to reduce performance degradation when applying machine learning models to unseen domains.
Some advocate for learning invariant features from source domains, while others favor more diverse features.
We propose a statistical framework that distinguishes the utilities of features based on the variance of their correlation to label $y$ across domains.
arXiv Detail & Related papers (2024-03-11T04:25:41Z)
- REBEL: Reward Regularization-Based Approach for Robotic Reinforcement Learning from Human Feedback [61.54791065013767]
A misalignment between the reward function and human preferences can lead to catastrophic outcomes in the real world.
Recent methods aim to mitigate misalignment by learning reward functions from human preferences.
We propose a novel concept of reward regularization within the robotic RLHF framework.
arXiv Detail & Related papers (2023-12-22T04:56:37Z)
- Stochastic Optimal Control Matching [53.156277491861985]
Our work introduces Stochastic Optimal Control Matching (SOCM), a novel Iterative Diffusion Optimization (IDO) technique for stochastic optimal control.
The control is learned via a least squares problem by trying to fit a matching vector field.
Experimentally, our algorithm achieves lower error than all the existing IDO techniques for optimal control.
arXiv Detail & Related papers (2023-12-04T16:49:43Z)
- Quantization for decentralized learning under subspace constraints [61.59416703323886]
We consider decentralized optimization problems where agents have individual cost functions to minimize subject to subspace constraints.
We propose and study an adaptive decentralized strategy where the agents employ differential randomized quantizers to compress their estimates.
The analysis shows that, under some general conditions on the quantization noise, the strategy is stable both in terms of mean-square error and average bit rate.
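The differential randomized quantization described above can be sketched with unbiased dithered quantization of the difference between an agent's current estimate and the last transmitted value. The function name, step size, and vectors below are hypothetical; this is the generic idea, not the paper's exact scheme.

```python
import numpy as np

# Differential randomized quantization sketch: an agent transmits a dithered,
# uniformly quantized version of the *difference* between its current estimate
# and the last transmitted value, so the quantizer stays unbiased in expectation.
def dithered_quantize(x, step, rng):
    # Subtractive-dither style: add uniform noise, round to the grid, remove dither.
    u = rng.uniform(-0.5, 0.5, size=x.shape)
    return step * np.round(x / step + u) - step * u   # unbiased: E[output] = x

rng = np.random.default_rng(1)
step = 0.25
estimate = np.array([1.30, -0.70, 2.10])
last_sent = np.zeros_like(estimate)

for _ in range(5):
    diff = estimate - last_sent           # only the innovation is compressed
    last_sent = last_sent + dithered_quantize(diff, step, rng)

err = np.max(np.abs(last_sent - estimate))
print(err < step)  # reconstruction error stays within the quantization step
```

Quantizing differences rather than raw estimates keeps the transmitted magnitudes small as the iterates converge, which is what makes a low average bit rate compatible with mean-square stability.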
arXiv Detail & Related papers (2022-09-16T09:38:38Z)
- Should All Proposals be Treated Equally in Object Detection? [110.27485090952385]
The complexity-precision trade-off of an object detector is a critical problem for resource constrained vision tasks.
It is hypothesized that improved detection efficiency requires a paradigm shift, towards the unequal processing of proposals.
This results in better utilization of available computational budget, enabling higher accuracy for the same FLOPS.
arXiv Detail & Related papers (2022-07-07T18:26:32Z)
- Planning with Dynamically Estimated Action Costs [2.8326418377665346]
Information about action costs is critical for real-world AI planning applications.
Recent approaches use black-box external action cost estimators, often learned from data, that are applied during the planning phase.
We suggest a generalization of deterministic planning with action costs that allows selecting between multiple estimators for action cost.
arXiv Detail & Related papers (2022-06-08T21:10:37Z)
- Efficient Online Linear Control with Stochastic Convex Costs and Unknown Dynamics [0.0]
We present a computationally efficient algorithm that attains an optimal $\sqrt{T}$ regret rate against the best stabilizing linear controller.
In contrast to previous work, our algorithm is based on the Optimism in the Face of Uncertainty paradigm.
arXiv Detail & Related papers (2022-03-02T15:19:20Z)
- An Experimental Design Perspective on Model-Based Reinforcement Learning [73.37942845983417]
In practical applications of RL, it is expensive to observe state transitions from the environment.
We propose an acquisition function that quantifies how much information a state-action pair would provide about the optimal solution to a Markov decision process.
arXiv Detail & Related papers (2021-12-09T23:13:57Z)
- Utilizing Redundancy in Cost Functions for Resilience in Distributed Optimization and Learning [1.8414221462731502]
This paper considers the problem of resilient distributed optimization and machine learning in a server-based architecture.
The system comprises a server and multiple agents, where each agent has a local cost function.
We consider the case when some of the agents may be asynchronous and/or Byzantine faulty.
arXiv Detail & Related papers (2021-10-21T02:41:19Z)
- Instance-optimality in optimal value estimation: Adaptivity via variance-reduced Q-learning [99.34907092347733]
We analyze the problem of estimating optimal $Q$-value functions for a discounted Markov decision process with discrete states and actions.
Using a local minimax framework, we show that this functional arises in lower bounds on the accuracy of any estimation procedure.
In the other direction, we establish the sharpness of our lower bounds, up to factors logarithmic in the state and action spaces, by analyzing a variance-reduced version of $Q$-learning.
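The quantity being estimated in this entry is the optimal $Q$-value function of a discounted MDP. As a hedged illustration (not the paper's variance-reduced algorithm), the toy MDP below computes that fixed point by exact Bellman iteration; sample-based variance-reduced Q-learning converges to this same function. All numbers are illustrative.

```python
import numpy as np

# Minimal sketch of the estimation target: the optimal Q-values of a tiny
# deterministic MDP (2 states x 2 actions), found by exact Q-value iteration.
# Variance-reduced Q-learning estimates this same fixed point from samples.
gamma = 0.9
transitions = np.array([[0, 1], [1, 0]])   # transitions[s, a] -> next state
rewards = np.array([[1.0, 0.0], [0.0, 2.0]])

Q = np.zeros((2, 2))
for _ in range(500):
    V = Q.max(axis=1)                      # greedy value of each state
    Q = rewards + gamma * V[transitions]   # Bellman optimality update

print(np.round(Q, 2))
```

With these toy numbers the iteration converges to $Q^*(0,0)=10$ and $Q^*(1,1)=11$: staying in state 0 for reward 1 per step beats cycling through state 1, except that state 1 itself should jump back for reward 2.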
arXiv Detail & Related papers (2021-06-28T00:38:54Z)
- Contingency-Aware Influence Maximization: A Reinforcement Learning Approach [52.109536198330126]
The influence maximization (IM) problem aims at finding a subset of seed nodes in a social network that maximizes the spread of influence.
In this study, we focus on a sub-class of IM problems, where whether the nodes are willing to be the seeds when being invited is uncertain, called contingency-aware IM.
Despite the initial success, a major practical obstacle in promoting the solutions to more communities is the tremendous runtime of the greedy algorithms.
arXiv Detail & Related papers (2021-06-13T16:42:22Z)
- Resource Allocation via Model-Free Deep Learning in Free Space Optical Communications [119.81868223344173]
The paper investigates the general problem of resource allocation for mitigating channel fading effects in Free Space Optical (FSO) communications.
Under this framework, we propose two algorithms that solve FSO resource allocation problems.
arXiv Detail & Related papers (2020-07-27T17:38:51Z)
- Efficient Empowerment Estimation for Unsupervised Stabilization [75.32013242448151]
The empowerment principle enables unsupervised stabilization of dynamical systems at upright positions.
We propose an alternative solution based on a trainable representation of a dynamical system as a Gaussian channel.
We show that our method has a lower sample complexity, is more stable in training, possesses the essential properties of the empowerment function, and allows estimation of empowerment from images.
arXiv Detail & Related papers (2020-07-14T21:10:16Z)
- Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions [80.49176924360499]
We establish a framework for directing a society of simple, specialized, self-interested agents to solve sequential decision problems.
We derive a class of decentralized reinforcement learning algorithms.
We demonstrate the potential advantages of a society's inherent modular structure for more efficient transfer learning.
arXiv Detail & Related papers (2020-07-05T16:41:09Z)
- An Information Bottleneck Approach for Controlling Conciseness in Rationale Extraction [84.49035467829819]
We show that it is possible to better manage this trade-off by optimizing a bound on the Information Bottleneck (IB) objective.
Our fully unsupervised approach jointly learns an explainer that predicts sparse binary masks over sentences, and an end-task predictor that considers only the extracted rationale.
arXiv Detail & Related papers (2020-05-01T23:26:41Z)
- Controlling Computation versus Quality for Neural Sequence Models [42.525463454120256]
Conditional computation makes neural sequence models (Transformers) more efficient and computation-aware during inference.
We evaluate our approach on two tasks: (i) WMT English-French translation and (ii) unsupervised representation learning (BERT).
arXiv Detail & Related papers (2020-02-17T17:54:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.