Active Inference for Stochastic Control
- URL: http://arxiv.org/abs/2108.12245v1
- Date: Fri, 27 Aug 2021 12:51:42 GMT
- Title: Active Inference for Stochastic Control
- Authors: Aswin Paul, Noor Sajid, Manoj Gopalkrishnan, and Adeel Razi
- Abstract summary: Active inference has emerged as an alternative approach to control problems given its intuitive (probabilistic) formalism.
We build on recent advances in planning over finite temporal horizons to assess the utility of active inference in a stochastic control setting.
Our results demonstrate the advantage of using active inference, compared to reinforcement learning, in both deterministic and stochastic settings.
- Score: 1.3124513975412255
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Active inference has emerged as an alternative approach to control problems
given its intuitive (probabilistic) formalism. However, despite its theoretical
utility, computational implementations have largely been restricted to
low-dimensional, deterministic settings. This paper highlights that this is a
consequence of the inability to adequately model stochastic transition
dynamics, particularly when an extensive policy (i.e., action trajectory) space
must be evaluated during planning. Fortunately, recent advancements propose a
modified planning algorithm for finite temporal horizons. We build upon this
work to assess the utility of active inference for a stochastic control
setting. For this, we simulate the classic windy grid-world task with
additional complexities, namely: 1) environment stochasticity; 2) learning of
transition dynamics; and 3) partial observability. Our results demonstrate the
advantage of using active inference, compared to reinforcement learning, in
both deterministic and stochastic settings.
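For intuition only, the following is a minimal Python sketch of the kind of discrete active-inference loop the abstract describes: stochastic transition dynamics are learned from counts, and candidate action sequences are scored by expected free energy under partial observability. All names, dimensions, and the brute-force policy enumeration are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from itertools import product

# Illustrative dimensions for a small windy grid-world (assumed, not from the paper).
NUM_STATES, NUM_OBS, NUM_ACTIONS, HORIZON = 70, 70, 4, 3

# Generative model (all arrays are illustrative assumptions):
# A[o, s]  : likelihood P(o | s); a noisy identity models partial observability.
# b_counts : Dirichlet counts over transitions, updated from experience.
# C[o]     : log-preferences over observations (the goal observation is preferred).
A = 0.9 * np.eye(NUM_OBS, NUM_STATES) + 0.1 / NUM_OBS
b_counts = np.ones((NUM_STATES, NUM_STATES, NUM_ACTIONS))   # flat prior over dynamics
C = np.full(NUM_OBS, -1.0)
C[NUM_OBS - 1] = 0.0                                        # prefer the goal observation

def update_transition_model(b_counts, s, a, s_next):
    """Learn stochastic dynamics by accumulating transition counts."""
    b_counts[s_next, s, a] += 1.0
    return b_counts

def transition_matrix(b_counts):
    """Normalise counts into B, where B[:, :, a] = P(s' | s, a)."""
    return b_counts / b_counts.sum(axis=0, keepdims=True)

def expected_free_energy(policy, qs, A, B, C):
    """Score one action sequence by accumulated risk plus ambiguity."""
    G = 0.0
    for a in policy:
        qs = B[:, :, a] @ qs                      # predicted state belief
        qo = A @ qs                               # predicted observation distribution
        risk = qo @ (np.log(qo + 1e-16) - C)      # divergence from preferred outcomes
        entropy_A = -(A * np.log(A + 1e-16)).sum(axis=0)
        G += risk + entropy_A @ qs                # ambiguity: expected observation entropy
    return G

def plan(qs, A, B, C):
    """Brute-force evaluation of every action trajectory up to HORIZON.

    This exhaustive enumeration is exactly what becomes intractable as the
    policy space grows, which is the bottleneck the abstract points to."""
    policies = list(product(range(NUM_ACTIONS), repeat=HORIZON))
    G = np.array([expected_free_energy(p, qs, A, B, C) for p in policies])
    q_pi = np.exp(-(G - G.min()))
    q_pi /= q_pi.sum()                            # softmax over negative EFE
    return policies[int(np.argmax(q_pi))][0]      # first action of the best policy

# Example: plan from a belief concentrated on the start state.
qs0 = np.zeros(NUM_STATES)
qs0[0] = 1.0
first_action = plan(qs0, A, transition_matrix(b_counts), C)
```

In such a loop the agent would interleave update_transition_model with planning to handle learned dynamics. The related entry "On efficient computation in active inference" below describes a planning algorithm with drastically lower computational complexity for exactly this kind of finite-horizon problem, in place of the exponential enumeration sketched in plan.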
Related papers
- Learning Optimal Deterministic Policies with Stochastic Policy Gradients [62.81324245896716]
Policy gradient (PG) methods are successful approaches for dealing with continuous reinforcement learning (RL) problems.
In common practice, convergent (hyper)policies are learned only to deploy their deterministic version.
We show how to tune the exploration level used for learning to optimize the trade-off between the sample complexity and the performance of the deployed deterministic policy.
arXiv Detail & Related papers (2024-05-03T16:45:15Z)
- Deep hybrid models: infer and plan in the real world [0.0]
We present an effective solution, based on active inference, to complex control tasks.
The proposed architecture exploits hybrid (discrete and continuous) processing to construct a hierarchical and dynamic representation of the self and the environment.
We evaluate this deep hybrid model on a non-trivial task: reaching a moving object after having picked a moving tool.
arXiv Detail & Related papers (2024-02-01T15:15:25Z)
- Learning From Scenarios for Stochastic Repairable Scheduling [3.9948520633731026]
We show how decision-focused learning techniques based on smoothing can be adapted to a scheduling problem.
We include an experimental evaluation to investigate in which situations decision-focused learning outperforms the state of the art, namely scenario-based optimization.
arXiv Detail & Related papers (2023-12-06T13:32:17Z)
- Actively Learning Reinforcement Learning: A Stochastic Optimal Control Approach [3.453622106101339]
We propose a framework for achieving two intertwined objectives: (i) equipping reinforcement learning with active exploration and deliberate information gathering, and (ii) overcoming the computational intractability of the optimal control law.
We approach both objectives by using reinforcement learning to compute the optimal control law.
Unlike a fixed exploration-exploitation balance, caution and probing are employed automatically by the controller in real time, even after the learning process has terminated.
arXiv Detail & Related papers (2023-09-18T18:05:35Z)
- Provable Guarantees for Generative Behavior Cloning: Bridging Low-Level Stability and High-Level Behavior [51.60683890503293]
We propose a theoretical framework for studying behavior cloning of complex expert demonstrations using generative modeling.
We show that pure supervised cloning can generate trajectories matching the per-time step distribution of arbitrary expert trajectories.
arXiv Detail & Related papers (2023-07-27T04:27:26Z)
- On efficient computation in active inference [1.1470070927586016]
We present a novel planning algorithm for finite temporal horizons with drastically lower computational complexity.
We also simplify the process of setting an appropriate target distribution for new and existing active inference planning schemes.
arXiv Detail & Related papers (2023-07-02T07:38:56Z)
- Resilient Constrained Learning [94.27081585149836]
This paper presents a constrained learning approach that adapts the requirements while simultaneously solving the learning task.
We call this approach resilient constrained learning after the term used to describe ecological systems that adapt to disruptions by modifying their operation.
arXiv Detail & Related papers (2023-06-04T18:14:18Z)
- Robust Value Iteration for Continuous Control Tasks [99.00362538261972]
When transferring a control policy from simulation to a physical system, the policy needs to be robust to variations in the dynamics to perform well.
We present Robust Fitted Value Iteration, which uses dynamic programming to compute the optimal value function on the compact state domain.
We show that robust value iteration is more robust than both deep reinforcement learning algorithms and the non-robust version of the algorithm.
arXiv Detail & Related papers (2021-05-25T19:48:35Z)
- Reinforcement Learning for Low-Thrust Trajectory Design of Interplanetary Missions [77.34726150561087]
This paper investigates the use of reinforcement learning for the robust design of interplanetary trajectories in the presence of severe disturbances.
An open-source implementation of the state-of-the-art algorithm Proximal Policy Optimization is adopted.
The resulting Guidance and Control Network provides both a robust nominal trajectory and the associated closed-loop guidance law.
arXiv Detail & Related papers (2020-08-19T15:22:15Z)
- Optimizing for the Future in Non-Stationary MDPs [52.373873622008944]
We present a policy gradient algorithm that maximizes a forecast of future performance.
We show that our algorithm, called Prognosticator, is more robust to non-stationarity than two online adaptation techniques.
arXiv Detail & Related papers (2020-05-17T03:41:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.