Stronger Generalization Guarantees for Robot Learning by Combining
Generative Models and Real-World Data
- URL: http://arxiv.org/abs/2111.08761v1
- Date: Tue, 16 Nov 2021 20:13:10 GMT
- Title: Stronger Generalization Guarantees for Robot Learning by Combining
Generative Models and Real-World Data
- Authors: Abhinav Agarwal, Sushant Veer, Allen Z. Ren, Anirudha Majumdar
- Abstract summary: We present a framework for obtaining generalization guarantees by leveraging a finite dataset of real-world environments.
We demonstrate our approach on two simulated systems with nonlinear/hybrid dynamics and rich sensing modalities.
- Score: 5.935761705025763
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We are motivated by the problem of learning policies for robotic systems with
rich sensory inputs (e.g., vision) in a manner that allows us to guarantee
generalization to environments unseen during training. We provide a framework
for obtaining such generalization guarantees by leveraging a finite dataset of
real-world environments in combination with a (potentially inaccurate)
generative model of environments. The key idea behind our approach is to
utilize the generative model in order to implicitly specify a prior over
policies. This prior is updated using the real-world dataset of environments by
minimizing an upper bound on the expected cost across novel environments
derived via Probably Approximately Correct (PAC)-Bayes generalization theory.
We demonstrate our approach on two simulated systems with nonlinear/hybrid
dynamics and rich sensing modalities: (i) quadrotor navigation with an onboard
vision sensor, and (ii) grasping objects using a depth sensor. Comparisons with
prior work demonstrate the ability of our approach to obtain stronger
generalization guarantees by utilizing generative models. We also present
hardware experiments for validating our bounds for the grasping task.
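To make the approach concrete, below is a minimal sketch of the kind of PAC-Bayes bound being minimized. It uses the classical McAllester-style form, expected cost <= empirical cost on the N real-world environments + sqrt((KL(posterior || prior) + ln(2*sqrt(N)/delta)) / (2N)), where the prior over policies is the one implicitly specified by the generative model. The paper may use a tighter variant of the bound; the function and variable names here are illustrative assumptions, not the authors' code.

    import numpy as np

    def pac_bayes_bound(empirical_cost, kl_posterior_prior, num_envs, delta=0.01):
        """McAllester-style PAC-Bayes upper bound on the expected cost over
        novel environments, valid with probability at least 1 - delta.
        Costs are assumed to lie in [0, 1].

        empirical_cost     : average cost of the policy posterior on the
                             num_envs real-world training environments.
        kl_posterior_prior : KL divergence between the learned posterior over
                             policies and the prior implied by the generative model.
        """
        slack = np.sqrt(
            (kl_posterior_prior + np.log(2.0 * np.sqrt(num_envs) / delta))
            / (2.0 * num_envs)
        )
        return empirical_cost + slack

    # Hypothetical usage: the prior is obtained from environments sampled from
    # the (possibly inaccurate) generative model, and the posterior is then fit
    # on the finite real-world dataset by minimizing this bound.
    print(pac_bayes_bound(empirical_cost=0.12, kl_posterior_prior=3.5, num_envs=500))

A smaller KL term (i.e., a posterior that stays close to the generative-model prior) and a larger real-world dataset both tighten the guarantee, which is the intuition behind combining the two data sources.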
Related papers
- Flex: End-to-End Text-Instructed Visual Navigation with Foundation Models [59.892436892964376]
We investigate the minimal data requirements and architectural adaptations necessary to achieve robust closed-loop performance with vision-based control policies.
Our findings are synthesized in Flex (Fly-lexically), a framework that uses pre-trained Vision Language Models (VLMs) as frozen patch-wise feature extractors.
We demonstrate the effectiveness of this approach on quadrotor fly-to-target tasks, where agents trained via behavior cloning successfully generalize to real-world scenes.
arXiv Detail & Related papers (2024-10-16T19:59:31Z)
- Robot Skill Generalization via Keypoint Integrated Soft Actor-Critic Gaussian Mixture Models [21.13906762261418]
A long-standing challenge for a robotic manipulation system is adapting and generalizing its acquired motor skills to unseen environments.
We tackle this challenge by employing hybrid skill models that integrate imitation and reinforcement paradigms.
We show that our method enables a robot to achieve significant zero-shot generalization to novel environments and to refine skills in the target environments faster than learning from scratch.
arXiv Detail & Related papers (2023-10-23T16:03:23Z)
- Transferring Foundation Models for Generalizable Robotic Manipulation [82.12754319808197]
We propose a novel paradigm that effectively leverages language-reasoning segmentation masks generated by internet-scale foundation models.
Our approach can effectively and robustly perceive object pose and enable sample-efficient generalization learning.
Demos can be found in our submitted video, and more comprehensive ones can be found in link1 or link2.
arXiv Detail & Related papers (2023-06-09T07:22:12Z)
- CoDEPS: Online Continual Learning for Depth Estimation and Panoptic Segmentation [28.782231314289174]
We introduce continual learning for deep learning-based monocular depth estimation and panoptic segmentation in new environments in an online manner.
We propose a novel domain-mixing strategy to generate pseudo-labels to adapt panoptic segmentation.
We explicitly address the limited storage capacity of robotic systems by leveraging sampling strategies for constructing a fixed-size replay buffer.
arXiv Detail & Related papers (2023-03-17T17:31:55Z)
- Predictive Experience Replay for Continual Visual Control and Forecasting [62.06183102362871]
We present a new continual learning approach for visual dynamics modeling and explore its efficacy in visual control and forecasting.
We first propose the mixture world model that learns task-specific dynamics priors with a mixture of Gaussians, and then introduce a new training strategy to overcome catastrophic forgetting.
Our model remarkably outperforms the naive combinations of existing continual learning and visual RL algorithms on DeepMind Control and Meta-World benchmarks with continual visual control tasks.
arXiv Detail & Related papers (2023-03-12T05:08:03Z)
- HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
- Dream to Explore: Adaptive Simulations for Autonomous Systems [3.0664963196464448]
We tackle the problem of learning to control dynamical systems by applying Bayesian nonparametric methods.
By employing Gaussian processes to discover latent world dynamics, we mitigate common data efficiency issues observed in reinforcement learning.
Our algorithm jointly learns a world model and policy by optimizing a variational lower bound of a log-likelihood.
arXiv Detail & Related papers (2021-10-27T04:27:28Z)
- Learning to Continuously Optimize Wireless Resource in a Dynamic Environment: A Bilevel Optimization Perspective [52.497514255040514]
This work develops a new approach that enables data-driven methods to continuously learn and optimize resource allocation strategies in a dynamic environment.
We propose to build the notion of continual learning into wireless system design, so that the learning model can incrementally adapt to the new episodes.
Our design is based on a novel bilevel optimization formulation which ensures certain "fairness" across different data samples.
arXiv Detail & Related papers (2021-05-03T07:23:39Z)
- Guided Uncertainty-Aware Policy Optimization: Combining Learning and Model-Based Strategies for Sample-Efficient Policy Learning [75.56839075060819]
Traditional robotic approaches rely on an accurate model of the environment, a detailed description of how to perform the task, and a robust perception system to keep track of the current state.
In contrast, reinforcement learning approaches can operate directly from raw sensory inputs with only a reward signal to describe the task, but they are extremely sample-inefficient and brittle.
In this work, we combine the strengths of model-based methods with the flexibility of learning-based methods to obtain a general method that is able to overcome inaccuracies in the robotics perception/actuation pipeline.
arXiv Detail & Related papers (2020-05-21T19:47:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.