Optimal Cost Design for Model Predictive Control
- URL: http://arxiv.org/abs/2104.11353v1
- Date: Fri, 23 Apr 2021 00:00:58 GMT
- Title: Optimal Cost Design for Model Predictive Control
- Authors: Avik Jain, Lawrence Chan, Daniel S. Brown, and Anca D. Dragan
- Abstract summary: Many robotics domains use nonconvex model predictive control (MPC) for planning, which sets a reduced time horizon, performs trajectory optimization, and replans at every step.
In this work, we challenge the common assumption that the cost we optimize using MPC should be the same as the ground truth cost for the task (plus a terminal cost).
We propose a zeroth-order optimization-based approach that enables us to design optimal costs for an MPC planning robot in continuous MDPs.
- Score: 30.86835688868485
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many robotics domains use some form of nonconvex model predictive control
(MPC) for planning, which sets a reduced time horizon, performs trajectory
optimization, and replans at every step. The actual task typically requires a
much longer horizon than is computationally tractable, and is specified via a
cost function that cumulates over that full horizon. For instance, an
autonomous car may have a cost function that makes a desired trade-off between
efficiency, safety, and obeying traffic laws. In this work, we challenge the
common assumption that the cost we optimize using MPC should be the same as the
ground truth cost for the task (plus a terminal cost). MPC solvers can suffer
from short planning horizons, local optima, incorrect dynamics models, and,
importantly, fail to account for future replanning ability. Thus, we propose
that in many tasks it could be beneficial to purposefully choose a different
cost function for MPC to optimize: one that results in the MPC rollout having
low ground truth cost, rather than the MPC planned trajectory. We formalize
this as an optimal cost design problem, and propose a zeroth-order
optimization-based approach that enables us to design optimal costs for an MPC
planning robot in continuous MDPs. We test our approach in an autonomous
driving domain where we find costs different from the ground truth that
implicitly compensate for replanning, short horizon, incorrect dynamics models,
and local minima issues. As an example, the learned cost incentivizes MPC to
delay its decision until later, implicitly accounting for the fact that it will
get more information in the future and be able to make a better decision. Code
and videos available at https://sites.google.com/berkeley.edu/ocd-mpc/.
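To make the optimal cost design idea concrete, below is a minimal sketch on a toy 1-D point-mass task. The dynamics, the greedy short-horizon planner, and the simple random-search outer loop are illustrative assumptions rather than the authors' implementation (linked above); the key point is that candidate planning costs are scored by the ground-truth cost of the executed MPC rollout, not of the planned trajectory.

```python
import numpy as np

def mpc_rollout(cost_weights, horizon=5, task_len=50):
    """Toy MPC loop: replan a short-horizon action at every step on 1-D
    point-mass dynamics, execute the first action, and accumulate the
    ground-truth task cost of the executed rollout."""
    x, v = -1.0, 0.0                       # start left of the goal at x = 0
    ground_truth_cost = 0.0
    for _ in range(task_len):
        # Greedy short-horizon "trajectory optimization": search a few
        # constant accelerations under the *designed* planning cost.
        best_u, best_c = 0.0, np.inf
        for u in np.linspace(-1.0, 1.0, 9):
            xs, vs, c = x, v, 0.0
            for _ in range(horizon):
                vs += 0.1 * u
                xs += 0.1 * vs
                c += cost_weights[0] * xs ** 2 + cost_weights[1] * u ** 2
            if c < best_c:
                best_u, best_c = u, c
        # Execute only the first planned action, then replan at the next step.
        v += 0.1 * best_u
        x += 0.1 * v
        ground_truth_cost += x ** 2 + 0.01 * best_u ** 2
    return ground_truth_cost

def design_cost(n_iters=200, sigma=0.2, seed=0):
    """Zeroth-order (random-search) outer loop: perturb the planning-cost
    weights and keep the candidate whose MPC rollout has the lowest
    ground-truth cost."""
    rng = np.random.default_rng(seed)
    w = np.array([1.0, 0.01])              # initialize at the ground-truth weights
    best = mpc_rollout(w)
    for _ in range(n_iters):
        cand = np.maximum(w + sigma * rng.standard_normal(2), 0.0)
        score = mpc_rollout(cand)
        if score < best:
            w, best = cand, score
    return w, best

if __name__ == "__main__":
    weights, rollout_cost = design_cost()
    print("designed planning cost:", weights, "rollout ground-truth cost:", rollout_cost)
```

In this toy setting the designed weights need not match the ground-truth weights; the outer loop is free to pick whatever planning cost makes the short-horizon, replanning controller behave well under the true task cost.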
Related papers
- Goal-Conditioned Terminal Value Estimation for Real-time and Multi-task Model Predictive Control [1.2687745030755995]
We develop an MPC framework with goal-conditioned terminal value learning to achieve multitask policy optimization.
We evaluate the proposed method on a bipedal inverted pendulum robot model and confirm that combining goal-conditioned terminal value learning with an upper-level trajectory planner enables real-time control.
arXiv Detail & Related papers (2024-10-07T11:19:23Z)
- Cost-Sensitive Multi-Fidelity Bayesian Optimization with Transfer of Learning Curve Extrapolation [55.75188191403343]
We introduce a utility function, predefined by each user, that describes the trade-off between the cost and performance of BO.
We validate our algorithm on various learning curve datasets and find that it outperforms all of the previous multi-fidelity BO and transfer-BO baselines we consider.
arXiv Detail & Related papers (2024-05-28T07:38:39Z)
- Deep Model Predictive Optimization [21.22047409735362]
A major challenge in robotics is to design robust policies which enable complex and agile behaviors in the real world.
We propose Deep Model Predictive Optimization (DMPO), which learns the inner-loop of an MPC optimization algorithm directly via experience.
DMPO can outperform the best MPC algorithm by up to 27% with fewer samples, and an end-to-end policy trained with model-free RL (MFRL) by 19%.
arXiv Detail & Related papers (2023-10-06T21:11:52Z)
- Stochastic Bridges as Effective Regularizers for Parameter-Efficient Tuning [98.27893964124829]
We propose regularizing PETs by using stochastic bridges as the regularizers (running costs) for their intermediate states.
In view of the great potential and capacity, we believe more sophisticated regularizers can be designed for PETs.
arXiv Detail & Related papers (2023-05-28T09:22:44Z)
- Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes [80.89852729380425]
We propose the first computationally efficient algorithm that achieves the nearly minimax optimal regret $\tilde{O}(d\sqrt{H^3 K})$.
Our work provides a complete answer to optimal RL with linear MDPs, and the developed algorithm and theoretical tools may be of independent interest.
arXiv Detail & Related papers (2022-12-12T18:58:59Z)
- Towards Optimal VPU Compiler Cost Modeling by using Neural Networks to Infer Hardware Performances [58.720142291102135]
'VPUNN' is a neural network-based cost model trained on low-level task profiling.
It consistently outperforms the state-of-the-art cost modeling in Intel's line of VPU processors.
arXiv Detail & Related papers (2022-05-09T22:48:39Z)
- Learning Model Predictive Controllers for Real-Time Ride-Hailing Vehicle Relocation and Pricing Decisions [15.80796896560034]
Large-scale ride-hailing systems often combine real-time routing at the individual request level with a macroscopic Model Predictive Control (MPC) optimization for dynamic pricing and vehicle relocation.
This paper addresses these computational challenges by learning the MPC optimization.
The resulting machine-learning model then serves as the optimization proxy and predicts its optimal solutions.
arXiv Detail & Related papers (2021-11-05T00:52:15Z)
- Evaluating model-based planning and planner amortization for continuous control [79.49319308600228]
We take a hybrid approach, combining model predictive control (MPC) with a learned model and model-free policy learning.
We find that well-tuned model-free agents are strong baselines even for high DoF control problems.
We show that it is possible to distil a model-based planner into a policy that amortizes the planning without any loss of performance.
arXiv Detail & Related papers (2021-10-07T12:00:40Z)
- The Value of Planning for Infinite-Horizon Model Predictive Control [0.0]
We show how the intermediate data structures used by modern planners can be interpreted as an approximate value function.
We show that this value function can be used by MPC directly, resulting in more efficient and resilient behavior at runtime.
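To make the role of that value function concrete, here is a minimal hypothetical sketch (the names `running_cost` and `V_hat` are illustrative, not from the paper): the approximate value function is simply added as a terminal cost, so the finite-horizon plan approximates the infinite-horizon objective.

```python
# Hypothetical sketch: a short-horizon MPC objective where an approximate
# value function V_hat stands in for the truncated tail beyond the horizon.
def mpc_objective(states, actions, running_cost, V_hat):
    # Sum the running cost over the planned short horizon...
    cost = sum(running_cost(x, u) for x, u in zip(states[:-1], actions))
    # ...and close the horizon with the approximate value of the terminal state.
    return cost + V_hat(states[-1])
```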
arXiv Detail & Related papers (2021-04-07T02:21:55Z)
- Blending MPC & Value Function Approximation for Efficient Reinforcement Learning [42.429730406277315]
Model-Predictive Control (MPC) is a powerful tool for controlling complex, real-world systems.
We present a framework for improving on MPC with model-free reinforcement learning (RL).
We show that our approach can obtain performance comparable with MPC with access to true dynamics.
arXiv Detail & Related papers (2020-12-10T11:32:01Z)
- Exploiting Submodular Value Functions For Scaling Up Active Perception [60.81276437097671]
In active perception tasks, an agent aims to select sensory actions that reduce uncertainty about one or more hidden variables.
Partially observable Markov decision processes (POMDPs) provide a natural model for such problems.
As the number of sensors available to the agent grows, the computational cost of POMDP planning grows exponentially.
arXiv Detail & Related papers (2020-09-21T09:11:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.