Robust Robotic Control from Pixels using Contrastive Recurrent
State-Space Models
- URL: http://arxiv.org/abs/2112.01163v1
- Date: Thu, 2 Dec 2021 12:15:25 GMT
- Title: Robust Robotic Control from Pixels using Contrastive Recurrent
State-Space Models
- Authors: Nitish Srivastava, Walter Talbott, Martin Bertran Lopez, Shuangfei
Zhai, Josh Susskind
- Abstract summary: We study how to learn world models in unconstrained environments over high-dimensional observation spaces such as images.
One source of difficulty is the presence of irrelevant but hard-to-model background distractions.
We learn a recurrent latent dynamics model which contrastively predicts the next observation.
This simple model leads to surprisingly robust robotic control even with simultaneous camera, background, and color distractions.
- Score: 8.22669535053079
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modeling the world can benefit robot learning by providing a rich training
signal for shaping an agent's latent state space. However, learning world
models in unconstrained environments over high-dimensional observation spaces
such as images is challenging. One source of difficulty is the presence of
irrelevant but hard-to-model background distractions, and unimportant visual
details of task-relevant entities. We address this issue by learning a
recurrent latent dynamics model which contrastively predicts the next
observation. This simple model leads to surprisingly robust robotic control
even with simultaneous camera, background, and color distractions. We
outperform alternatives such as bisimulation methods which impose
state-similarity measures derived from divergence in future reward or future
optimal actions. We obtain state-of-the-art results on the Distracting Control
Suite, a challenging benchmark for pixel-based robotic control.
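The core idea, a recurrent latent dynamics model trained to contrastively predict the next observation, can be sketched with an InfoNCE-style objective. The code below is an illustrative assumption rather than the authors' released implementation: the module names, network sizes, and GRU-based recurrence are placeholders, and a real pixel-based agent would use a convolutional encoder over image observations.

```python
# Minimal sketch (not the paper's exact implementation) of a recurrent latent
# dynamics model trained with a contrastive (InfoNCE-style) next-observation
# prediction loss. All names and sizes here are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContrastiveRSSM(nn.Module):
    def __init__(self, obs_dim, action_dim, latent_dim=128, hidden_dim=256):
        super().__init__()
        # Stand-in for an image encoder (obs -> embedding).
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, latent_dim),
        )
        # Recurrent dynamics over (embedding, action).
        self.rnn = nn.GRUCell(latent_dim + action_dim, hidden_dim)
        # Predicts the embedding of the next observation from the recurrent state.
        self.predictor = nn.Linear(hidden_dim, latent_dim)

    def forward(self, obs_seq, action_seq):
        """obs_seq: (T, B, obs_dim), action_seq: (T, B, action_dim), T >= 2."""
        T, B, _ = obs_seq.shape
        h = obs_seq.new_zeros(B, self.rnn.hidden_size)
        loss = 0.0
        for t in range(T - 1):
            z_t = self.encoder(obs_seq[t])                 # embed current observation
            h = self.rnn(torch.cat([z_t, action_seq[t]], dim=-1), h)
            pred = self.predictor(h)                       # predicted next embedding
            target = self.encoder(obs_seq[t + 1])          # true next embedding
            # InfoNCE: each prediction should match its own next observation,
            # with the other batch elements' next observations as negatives.
            logits = pred @ target.t()                     # (B, B) similarity matrix
            labels = torch.arange(B, device=logits.device)
            loss = loss + F.cross_entropy(logits, labels)
        return loss / (T - 1)

# Usage sketch with random tensors standing in for encoded frames and actions.
model = ContrastiveRSSM(obs_dim=64, action_dim=6)
obs = torch.randn(10, 32, 64)    # (T, B, obs_dim)
acts = torch.randn(10, 32, 6)    # (T, B, action_dim)
print(model(obs, acts))          # scalar contrastive loss
```

Because the loss only requires the predicted latent to identify the correct next-observation embedding among in-batch negatives, the model never has to reconstruct background pixels, which is consistent with the paper's motivation for robustness to hard-to-model distractors.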
Related papers
- Differentiable Robot Rendering [45.23538293501457]
We introduce differentiable robot rendering, a method allowing the visual appearance of a robot body to be directly differentiable with respect to its control parameters.
We demonstrate its capability and usage in applications including reconstruction of robot poses from images and controlling robots through vision language models.
arXiv Detail & Related papers (2024-10-17T17:59:02Z) - Transferring Foundation Models for Generalizable Robotic Manipulation [82.12754319808197]
We propose a novel paradigm that effectively leverages language-reasoning segmentation masks generated by internet-scale foundation models.
Our approach can effectively and robustly perceive object pose and enable sample-efficient generalization learning.
Demos can be found in our submitted video, and more comprehensive ones can be found in link1 or link2.
arXiv Detail & Related papers (2023-06-09T07:22:12Z) - Model-Based Reinforcement Learning with Isolated Imaginations [61.67183143982074]
We propose Iso-Dream++, a model-based reinforcement learning approach.
We perform policy optimization based on the decoupled latent imaginations.
This enables long-horizon visuomotor control tasks to benefit from isolating mixed dynamics sources in the wild.
arXiv Detail & Related papers (2023-03-27T02:55:56Z) - Real-to-Sim: Predicting Residual Errors of Robotic Systems with Sparse
Data using a Learning-based Unscented Kalman Filter [65.93205328894608]
We learn the residual errors between a dynamics and/or simulator model and the real robot.
We show that with the learned residual errors, we can further close the reality gap between dynamics models, simulations, and actual hardware.
arXiv Detail & Related papers (2022-09-07T15:15:12Z) - Masked World Models for Visual Control [90.13638482124567]
We introduce a visual model-based RL framework that decouples visual representation learning and dynamics learning.
We demonstrate that our approach achieves state-of-the-art performance on a variety of visual robotic tasks.
arXiv Detail & Related papers (2022-06-28T18:42:27Z) - Learning Visible Connectivity Dynamics for Cloth Smoothing [17.24004979796887]
We propose to learn a particle-based dynamics model from a partial point cloud observation.
To overcome the challenges of partial observability, we infer which visible points are connected on the underlying cloth mesh.
We show that our method greatly outperforms previous state-of-the-art model-based and model-free reinforcement learning methods in simulation.
arXiv Detail & Related papers (2021-05-21T15:03:29Z) - Learning to Shift Attention for Motion Generation [55.61994201686024]
One challenge in motion generation with robot learning-from-demonstration techniques is that human demonstrations follow a distribution with multiple modes for a single task query.
Previous approaches either fail to capture all modes or average across the modes of the demonstrations, and thus generate invalid trajectories.
We propose a motion generation model with extrapolation ability to overcome this problem.
arXiv Detail & Related papers (2021-02-24T09:07:52Z) - Model-Based Visual Planning with Self-Supervised Functional Distances [104.83979811803466]
We present a self-supervised method for model-based visual goal reaching.
Our approach learns entirely using offline, unlabeled data.
We find that this approach substantially outperforms both model-free and model-based prior methods.
arXiv Detail & Related papers (2020-12-30T23:59:09Z) - CLOUD: Contrastive Learning of Unsupervised Dynamics [19.091886595825947]
We propose to learn forward and inverse dynamics in a fully unsupervised manner via contrastive estimation.
We demonstrate the efficacy of our approach across a variety of tasks including goal-directed planning and imitation from observations.
arXiv Detail & Related papers (2020-10-23T15:42:57Z) - Counterfactual Explanation and Causal Inference in Service of Robustness
in Robot Control [15.104159722499366]
We propose an architecture for training generative models of counterfactual conditionals of the form 'Can we modify event A to cause B instead of C?'
In contrast to conventional control design approaches, where robustness is quantified in terms of the ability to reject noise, we explore the space of counterfactuals that might cause a certain requirement to be violated.
arXiv Detail & Related papers (2020-09-18T14:22:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.