Model Adaptation for Time Constrained Embodied Control
- URL: http://arxiv.org/abs/2406.11128v1
- Date: Mon, 17 Jun 2024 01:07:30 GMT
- Title: Model Adaptation for Time Constrained Embodied Control
- Authors: Jaehyun Song, Minjong Yoo, Honguk Woo
- Abstract summary: We present MoDeC, a time constraint-aware embodied control framework using modular model adaptation.
We formulate model adaptation to varying operational conditions on resource and time restrictions as dynamic routing on a modular network, incorporating these conditions as part of multi-task objectives.
Our evaluation across several vision-based embodied environments demonstrates the robustness of MoDeC, showing that it outperforms other model adaptation methods in both performance and adherence to time constraints in robotic manipulation and autonomous driving applications.
- Score: 6.876580618014666
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: When adopting a deep learning model for embodied agents, the model structure must be optimized for specific tasks and operational conditions. Such optimization can be static, as in model compression, or dynamic, as in adaptive inference. Yet these techniques have not been fully investigated for embodied control systems subject to time constraints, which require sequential decision-making across multiple tasks, each with distinct inference latency limitations. In this paper, we present MoDeC, a time constraint-aware embodied control framework using modular model adaptation. We formulate model adaptation to varying operational conditions on resource and time restrictions as dynamic routing on a modular network, incorporating these conditions as part of multi-task objectives. Our evaluation across several vision-based embodied environments demonstrates the robustness of MoDeC, showing that it outperforms other model adaptation methods in both performance and adherence to time constraints in robotic manipulation and autonomous driving applications.
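As a rough illustration of the routing idea in the abstract, the sketch below builds a small modular network whose router conditions on both the observation features and a per-decision time budget, so tighter budgets can induce sparser routes. Module counts, shapes, and the gating rule are illustrative assumptions, not MoDeC's actual architecture.
```python
# Minimal sketch of time-constraint-aware dynamic routing over a modular
# network (illustrative shapes and gating; not MoDeC's actual architecture).
import torch
import torch.nn as nn


class ModularPolicy(nn.Module):
    def __init__(self, obs_dim=64, act_dim=8, n_modules=6, hidden=128):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, hidden)
        # A pool of interchangeable modules; the router decides which to run.
        self.module_pool = nn.ModuleList(
            nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU()) for _ in range(n_modules)
        )
        # The router conditions on the observation *and* the latency budget,
        # so the same network can adapt to different time constraints.
        self.router = nn.Linear(hidden + 1, n_modules)
        self.head = nn.Linear(hidden, act_dim)

    def forward(self, obs, time_budget):
        h = torch.relu(self.encoder(obs))
        gates = torch.sigmoid(self.router(torch.cat([h, time_budget], dim=-1)))
        keep = (gates > 0.5).float()              # hard routing at inference time
        keep = keep + gates - gates.detach()      # straight-through for training
        for i, module in enumerate(self.module_pool):
            h = h + keep[:, i:i + 1] * module(h)  # skip modules the router drops
        return self.head(h), keep.sum(dim=-1)     # action plus number of modules used


obs = torch.randn(4, 64)
budget = torch.tensor([[0.02], [0.02], [0.10], [0.10]])   # seconds per decision
action, n_used = ModularPolicy()(obs, budget)
# A penalty on n_used, scaled against the budget, can be folded into the
# multi-task objective so that tighter budgets favor cheaper routes.
```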
Related papers
- Automated Conversion of Static to Dynamic Scheduler via Natural Language [3.4748713192043876]
We propose a Retrieval-Augmented Generation (RAG) based LLM to automate the process of implementing constraints for Dynamic Scheduling (RAGDyS).
Our framework aims to minimize technical complexities related to mathematical modelling and computational workload for end-users.
arXiv Detail & Related papers (2024-05-08T04:07:38Z)
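The entry above describes the retrieval-augmented pipeline only at a high level; a very loose sketch of that flow follows, using a toy word-overlap retriever and a placeholder call_llm function (both hypothetical, not part of RAGDyS).
```python
# Toy retrieval-augmented loop for turning a natural-language scheduling
# constraint into code. `call_llm` is a placeholder, not a real LLM API.
from collections import Counter

EXAMPLES = [
    ("limit each machine to 8 hours per day", "model.add(sum(hours[m]) <= 8)"),
    ("job B must start after job A ends", "model.add(start['B'] >= end['A'])"),
]

def overlap(a, b):
    wa, wb = Counter(a.lower().split()), Counter(b.lower().split())
    return sum((wa & wb).values())

def call_llm(prompt):
    # Stand-in for an actual LLM call.
    return "# generated constraint for: " + prompt.splitlines()[-1]

def constraint_to_code(request, k=1):
    ranked = sorted(EXAMPLES, key=lambda ex: overlap(request, ex[0]), reverse=True)
    context = "\n".join(f"NL: {nl}\nCode: {code}" for nl, code in ranked[:k])
    return call_llm(f"Examples:\n{context}\nTranslate to scheduler code:\n{request}")

print(constraint_to_code("each worker can take at most 2 night shifts"))
```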
- Correct-by-Construction Control for Stochastic and Uncertain Dynamical Models via Formal Abstractions [44.99833362998488]
We develop an abstraction framework that can be used to solve this problem under various modeling assumptions.
We use state-of-the-art verification techniques to compute an optimal policy on the iMDP with guarantees for satisfying the given specification.
We then show that, by construction, we can refine this policy into a feedback controller for which these guarantees carry over to the dynamical model.
arXiv Detail & Related papers (2023-11-16T11:03:54Z)
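To make the interval-MDP (iMDP) step above concrete, here is a minimal sketch of a robust, worst-case Bellman backup where transition probabilities are only known up to intervals; it is illustrative and does not reproduce the paper's abstraction or verification pipeline.
```python
# Hedged sketch of a robust (worst-case) backup over an interval MDP:
# transition probabilities lie in [lo, hi], and the backup assumes an
# adversary picks the feasible distribution that minimizes the value.
import numpy as np

def worst_case_expectation(lo, hi, values):
    """Minimize sum(p * values) s.t. lo <= p <= hi and sum(p) == 1."""
    p = lo.copy()
    remaining = 1.0 - p.sum()
    for i in np.argsort(values):               # fill low-value successors first
        give = min(hi[i] - lo[i], remaining)
        p[i] += give
        remaining -= give
    return float(p @ values)

# Two successor states with uncertain transition probabilities.
lo = np.array([0.3, 0.4])
hi = np.array([0.6, 0.7])
values = np.array([0.0, 1.0])                  # e.g., probability of reaching a goal
print(worst_case_expectation(lo, hi, values))  # pessimistic backup value
```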
- Sim-to-Real Transfer of Adaptive Control Parameters for AUV Stabilization under Current Disturbance [1.099532646524593]
This paper presents a novel approach, merging the Maximum Entropy Deep Reinforcement Learning framework with a classic model-based control architecture, to formulate an adaptive controller.
Within this framework, we introduce a Sim-to-Real transfer strategy comprising the following components: a bio-inspired experience replay mechanism, an enhanced domain randomisation technique, and an evaluation protocol executed on a physical platform.
Our experimental assessments demonstrate that this method effectively learns proficient policies from suboptimal simulated models of the AUV, resulting in control performance 3 times higher when transferred to a real-world vehicle.
arXiv Detail & Related papers (2023-10-17T08:46:56Z)
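A minimal sketch of the general "learned policy tunes a classic controller" pattern suggested above: a stub policy outputs PD gains and the classic loop computes the command. The gains, dynamics, and training loop are assumptions; the paper's maximum-entropy framework is not reproduced.
```python
# Illustrative only: an RL policy (here a stub) outputs adaptive PD gains,
# and the classic controller computes the actual thrust command.
import numpy as np

def policy(observation):
    # Stand-in for a trained stochastic policy; returns adaptive PD gains.
    return np.array([2.0, 0.5])          # [kp, kd]

def pd_control(error, d_error, gains):
    kp, kd = gains
    return kp * error + kd * d_error

# One control step, with the disturbance effect folded into the tracking error.
obs = np.array([0.3, -0.1])              # e.g., depth error and its derivative
u = pd_control(obs[0], obs[1], policy(obs))
print(u)                                  # command sent to the vehicle
```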
- When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee of model-based RL (MBRL).
Our follow-up derived bounds reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z)
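A bound relating model shift and performance improvement suggests a simple decision rule of the form "update the model only when the estimated improvement outweighs the shift". The toy check below illustrates that shape; the bound form and coefficient are assumptions, not the paper's derived bounds.
```python
# Hedged sketch: accept a model update only when a lower bound on the
# policy-performance change (improvement minus a model-shift penalty)
# is non-negative. Coefficient and bound form are illustrative.
def should_update_model(est_improvement, model_shift, penalty_coef=1.0):
    lower_bound = est_improvement - penalty_coef * model_shift
    return lower_bound >= 0.0

print(should_update_model(est_improvement=0.5, model_shift=0.2))  # True
print(should_update_model(est_improvement=0.1, model_shift=0.4))  # False
```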
- Fully Decentralized Model-based Policy Optimization for Networked Systems [23.46407780093797]
This work aims to improve data efficiency of multi-agent control by model-based learning.
We consider networked systems where agents are cooperative and communicate only locally with their neighbors.
In our method, each agent learns a dynamics model to predict future states and broadcasts its predictions via communication, and the policies are then trained on the model rollouts.
arXiv Detail & Related papers (2022-07-13T23:52:14Z)
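A minimal sketch of the neighbor-communication idea above: each agent keeps a local dynamics model, broadcasts its one-step prediction to its neighbors only, and fuses what it receives. The linear models and averaging rule are placeholders for illustration.
```python
# Decentralized, neighbor-communicating model rollouts (illustrative only).
import numpy as np

class Agent:
    def __init__(self, idx, neighbors, state_dim=4):
        self.idx, self.neighbors = idx, neighbors
        self.A = np.eye(state_dim) * 0.95        # learned local dynamics (placeholder)
        self.state = np.zeros(state_dim)

    def predict(self):
        return self.A @ self.state               # local one-step prediction

def rollout_step(agents):
    # Each agent broadcasts its prediction only to its neighbors,
    # then combines the received predictions with its own.
    preds = {a.idx: a.predict() for a in agents}
    for a in agents:
        msgs = [preds[j] for j in a.neighbors] + [preds[a.idx]]
        a.state = np.mean(msgs, axis=0)

agents = [Agent(0, [1]), Agent(1, [0, 2]), Agent(2, [1])]
agents[0].state = np.ones(4)
rollout_step(agents)
print([a.state for a in agents])
```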
- Controllable Dynamic Multi-Task Architectures [92.74372912009127]
We propose a controllable multi-task network that dynamically adjusts its architecture and weights to match the desired task preference as well as the resource constraints.
We propose a disentangled training of two hypernetworks, by exploiting task affinity and a novel branching regularized loss, to take input preferences and accordingly predict tree-structured models with adapted weights.
arXiv Detail & Related papers (2022-03-28T17:56:40Z)
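As a loose illustration of predicting weights from a task preference, the sketch below uses a single hypernetwork that maps a preference vector plus a resource budget to the weights of a small task head; the paper's tree-structured branching and disentangled training are not reproduced.
```python
# Hedged sketch: a hypernetwork maps (task preference, resource budget) to the
# weights of a small task head. Shapes and names are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HyperHead(nn.Module):
    def __init__(self, n_tasks=3, feat_dim=32, out_dim=10):
        super().__init__()
        self.feat_dim, self.out_dim = feat_dim, out_dim
        # Hypernetwork: preference + budget -> flattened head weights and bias.
        self.hyper = nn.Sequential(
            nn.Linear(n_tasks + 1, 64), nn.ReLU(),
            nn.Linear(64, feat_dim * out_dim + out_dim),
        )

    def forward(self, features, preference, budget):
        params = self.hyper(torch.cat([preference, budget], dim=-1))
        W = params[: self.feat_dim * self.out_dim].view(self.out_dim, self.feat_dim)
        b = params[self.feat_dim * self.out_dim:]
        return F.linear(features, W, b)

head = HyperHead()
feats = torch.randn(5, 32)
pref = torch.tensor([0.7, 0.2, 0.1])        # desired task trade-off
budget = torch.tensor([0.5])                # normalized resource budget
out = head(feats, pref, budget)             # predictions from the adapted head
```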
- Evaluating model-based planning and planner amortization for continuous control [79.49319308600228]
We take a hybrid approach, combining model predictive control (MPC) with a learned model and model-free policy learning.
We find that well-tuned model-free agents are strong baselines even for high DoF control problems.
We show that it is possible to distil a model-based planner into a policy that amortizes the planning without any loss of performance.
arXiv Detail & Related papers (2021-10-07T12:00:40Z)
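A compact sketch of the two ingredients above: random-shooting MPC against a (here hand-written) model, and collecting (state, planner action) pairs so a feed-forward policy can later be fit by regression to amortize the planner. Dynamics, reward, and sample sizes are illustrative.
```python
# Random-shooting MPC with a stand-in model, plus a distillation dataset.
import numpy as np

def dynamics(s, a):                      # stand-in for a learned model
    return s + 0.1 * a

def reward(s, a):
    return -np.sum(s ** 2) - 0.01 * np.sum(a ** 2)

def mpc_action(s, horizon=10, n_samples=64, act_dim=2, rng=np.random.default_rng(0)):
    best_ret, best_a0 = -np.inf, None
    for _ in range(n_samples):
        plan = rng.uniform(-1, 1, size=(horizon, act_dim))
        s_sim, ret = s.copy(), 0.0
        for a in plan:
            ret += reward(s_sim, a)
            s_sim = dynamics(s_sim, a)
        if ret > best_ret:
            best_ret, best_a0 = ret, plan[0]
    return best_a0

# Distillation: collect (state, planner action) pairs so a feed-forward policy
# can be fit by regression, removing the planner at test time.
data_rng = np.random.default_rng(1)
states = [data_rng.normal(size=2) for _ in range(20)]
dataset = [(s, mpc_action(s)) for s in states]
print(dataset[0])
```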
- Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization [60.73540999409032]
We show that expressive autoregressive dynamics models generate the dimensions of the next state and reward sequentially, each conditioned on the previously generated dimensions.
We also show that autoregressive dynamics models are useful for offline policy optimization by serving as a way to enrich the replay buffer.
arXiv Detail & Related papers (2021-04-28T16:48:44Z)
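A minimal sketch of the autoregressive factorization described above: each dimension of the next state is generated in turn, conditioned on the state, action, and the dimensions generated so far. The per-dimension linear predictors are placeholders.
```python
# Autoregressive next-state generation, one dimension at a time (illustrative).
import numpy as np

def predict_next_state(s, a, per_dim_weights, rng=np.random.default_rng(0)):
    generated = []
    for w in per_dim_weights:                     # one predictor per dimension
        context = np.concatenate([s, a, np.array(generated)])
        mean = float(w[: len(context)] @ context)
        generated.append(mean + 0.01 * rng.standard_normal())
    return np.array(generated)

state_dim, act_dim = 3, 2
weights = [np.ones(state_dim + act_dim + i) for i in range(state_dim)]
s_next = predict_next_state(np.zeros(state_dim), np.ones(act_dim), weights)
print(s_next)
```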
- Meta-Reinforcement Learning for Adaptive Motor Control in Changing Robot Dynamics and Environments [3.5309638744466167]
This work developed a meta-learning approach that adapts the control policy on the fly to different changing conditions for robust locomotion.
The proposed method constantly updates the interaction model, samples feasible sequences of actions from the estimated state-action trajectories, and then applies the optimal actions to maximize the reward.
arXiv Detail & Related papers (2021-01-19T12:57:12Z)
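A hedged sketch of the adapt-then-plan loop described above: a simple interaction model is refit from recent transitions, candidate action sequences are rolled out through it, and the best first action is applied. The linear model class and reward are assumptions for illustration.
```python
# Refit a model online from recent data, then plan by sampling action sequences.
import numpy as np

rng = np.random.default_rng(0)

def fit_model(transitions):
    # Least-squares fit of s' ~= [s, a] @ W from the most recent transitions.
    X = np.array([np.concatenate([s, a]) for s, a, _ in transitions])
    Y = np.array([s_next for _, _, s_next in transitions])
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W

def plan(W, s, horizon=5, n_candidates=64, act_dim=1):
    best_ret, best_a0 = -np.inf, None
    for _ in range(n_candidates):
        seq = rng.uniform(-1, 1, size=(horizon, act_dim))
        s_sim, ret = s.copy(), 0.0
        for a in seq:
            s_sim = np.concatenate([s_sim, a]) @ W
            ret -= np.sum(s_sim ** 2)            # reward: stay near the origin
        if ret > best_ret:
            best_ret, best_a0 = ret, seq[0]
    return best_a0

# Fake recent experience (state_dim=2, act_dim=1), refit, then act.
transitions = [(rng.normal(size=2), rng.uniform(-1, 1, size=1), rng.normal(size=2))
               for _ in range(32)]
action = plan(fit_model(transitions), s=np.array([0.5, -0.2]))
print(action)
```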
- Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
arXiv Detail & Related papers (2020-07-14T16:42:59Z)
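To illustrate directing prediction toward task-relevant quantities, the sketch below trains a goal-conditioned model to predict the residual between the next state and the goal rather than reconstructing the full next state; the architecture and target are illustrative assumptions, not the paper's formulation.
```python
# Goal-conditioned prediction with a task-relevant target (illustrative only).
import torch
import torch.nn as nn

class GoalAwareModel(nn.Module):
    def __init__(self, state_dim=8, act_dim=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim * 2 + act_dim, 64), nn.ReLU(),
            nn.Linear(64, state_dim),
        )

    def forward(self, state, action, goal):
        return self.net(torch.cat([state, action, goal], dim=-1))

model = GoalAwareModel()
s, a = torch.randn(16, 8), torch.randn(16, 2)
g, s_next = torch.randn(16, 8), torch.randn(16, 8)
pred_residual = model(s, a, g)
loss = nn.functional.mse_loss(pred_residual, s_next - g)  # goal-relevant target
```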
- Control as Hybrid Inference [62.997667081978825]
We present an implementation of CHI (control as hybrid inference) which naturally mediates the balance between iterative and amortised inference.
We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z)
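A very loose sketch of mixing amortised and iterative inference for control: an amortised policy proposes an action, which a few gradient steps can refine against a differentiable objective when time permits. This is only an illustration of the general hybrid idea, not the CHI algorithm.
```python
# Amortised proposal optionally refined by iterative optimization (illustrative).
import torch

def amortised_policy(state):
    return -0.5 * state                      # stand-in for a learned policy

def objective(state, action):
    next_state = state + action              # toy known dynamics
    return -(next_state ** 2).sum() - 0.1 * (action ** 2).sum()

def act(state, refine_steps=0, lr=0.1):
    action = amortised_policy(state).clone().requires_grad_(True)
    for _ in range(refine_steps):            # iterative refinement, if time permits
        loss = -objective(state, action)
        grad, = torch.autograd.grad(loss, action)
        action = (action - lr * grad).detach().requires_grad_(True)
    return action.detach()

state = torch.tensor([1.0, -2.0])
print(act(state, refine_steps=0))            # purely amortised
print(act(state, refine_steps=5))            # hybrid: refined by iteration
```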