Domain Curiosity: Learning Efficient Data Collection Strategies for
Domain Adaptation
- URL: http://arxiv.org/abs/2103.07223v1
- Date: Fri, 12 Mar 2021 12:02:11 GMT
- Title: Domain Curiosity: Learning Efficient Data Collection Strategies for
Domain Adaptation
- Authors: Karol Arndt, Oliver Struckmeier, Ville Kyrki
- Abstract summary: We present domain curiosity -- a method of training exploratory policies that are explicitly optimized to provide data from which a model can learn about the environment.
In contrast to most curiosity methods, our approach explicitly rewards learning, which makes it robust to environment noise.
We evaluate the proposed method by comparing how much a model can learn about environment dynamics given data collected by the proposed approach.
- Score: 16.539422751949797
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Domain adaptation is a common problem in robotics, with applications such as
transferring policies from simulation to real world and lifelong learning.
Performing such adaptation, however, requires informative data about the
environment to be available during the adaptation. In this paper, we present
domain curiosity -- a method of training exploratory policies that are
explicitly optimized to provide data that allows a model to learn about the
unknown aspects of the environment. In contrast to most curiosity methods, our
approach explicitly rewards learning, which makes it robust to environment
noise without sacrificing its ability to learn. We evaluate the proposed method
by comparing how much a model can learn about environment dynamics given data
collected by the proposed approach, compared to standard curious and random
policies. The evaluation is performed using a toy environment, two simulated
robot setups, and on a real-world haptic exploration task. The results show
that the proposed method allows data-efficient and accurate estimation of
dynamics.
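The abstract states that the method "explicitly rewards learning" but does not spell out the formulation. As a minimal hypothetical sketch (illustrative names, not the paper's implementation), a learning-progress reward can be computed as the drop in a dynamics model's prediction error caused by training on freshly collected transitions:

```python
# Minimal sketch of a learning-progress ("domain curiosity" style) reward.
# DynamicsModel and learning_progress_reward are illustrative names, not
# the paper's actual implementation.
import torch
import torch.nn as nn

class DynamicsModel(nn.Module):
    """Predicts the next state from the current state and action."""
    def __init__(self, state_dim: int, action_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
            nn.Linear(64, state_dim),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

def learning_progress_reward(model, optimizer, states, actions, next_states):
    """Reward = prediction error before a model update minus error after it."""
    loss_fn = nn.MSELoss()
    with torch.no_grad():
        error_before = loss_fn(model(states, actions), next_states)
    # One gradient step on the newly collected batch.
    optimizer.zero_grad()
    loss = loss_fn(model(states, actions), next_states)
    loss.backward()
    optimizer.step()
    with torch.no_grad():
        error_after = loss_fn(model(states, actions), next_states)
    # Positive only while the batch still teaches the model something new.
    return (error_before - error_after).item()
```

A plain prediction-error bonus keeps attracting the policy to irreducibly noisy transitions, since their error never drops; a learning-progress bonus instead decays to zero once the model has absorbed what the data can teach, which matches the robustness claim above.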
Related papers
- Evaluating Real-World Robot Manipulation Policies in Simulation [91.55267186958892]
Control and visual disparities between real and simulated environments are key challenges for reliable simulated evaluation.
We propose approaches for mitigating these gaps without needing to craft full-fidelity digital twins of real-world environments.
We create SIMPLER, a collection of simulated environments for manipulation policy evaluation on common real robot setups.
arXiv Detail & Related papers (2024-05-09T17:30:16Z)
- One-Shot Domain Adaptive and Generalizable Semantic Segmentation with Class-Aware Cross-Domain Transformers [96.51828911883456]
Unsupervised sim-to-real domain adaptation (UDA) for semantic segmentation aims to improve the real-world test performance of a model trained on simulated data.
Traditional UDA often assumes that there are abundant unlabeled real-world data samples available during training for the adaptation.
We explore the one-shot unsupervised sim-to-real domain adaptation (OSUDA) and generalization problem, where only one real-world data sample is available.
arXiv Detail & Related papers (2022-12-14T15:54:15Z)
- Exploration via Planning for Information about the Optimal Trajectory [67.33886176127578]
We develop a method that allows us to plan for exploration while taking the task and the current knowledge into account.
We demonstrate that our method learns strong policies with 2x fewer samples than strong exploration baselines.
arXiv Detail & Related papers (2022-10-06T20:28:55Z)
- Dream to Explore: Adaptive Simulations for Autonomous Systems [3.0664963196464448]
We tackle the problem of learning to control dynamical systems by applying Bayesian nonparametric methods.
By employing Gaussian processes to discover latent world dynamics, we mitigate common data efficiency issues observed in reinforcement learning.
Our algorithm jointly learns a world model and policy by optimizing a variational lower bound on the log-likelihood.
arXiv Detail & Related papers (2021-10-27T04:27:28Z)
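The Dream to Explore entry above leans on Gaussian processes as the world model. As a hedged illustration of that ingredient (not the paper's code, and with a toy linear system assumed purely for the example), one-step dynamics can be fit with scikit-learn's GP regressor, whose predictive uncertainty doubles as an exploration signal:

```python
# Illustrative only: fitting one-step dynamics s' = f(s, a) with a Gaussian
# process, as in GP-based world models for data-efficient model-based RL.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
# Toy 1-D system (assumed for illustration): s' = 0.9*s + 0.5*a + noise.
S = rng.uniform(-1, 1, size=(200, 1))
A = rng.uniform(-1, 1, size=(200, 1))
S_next = 0.9 * S + 0.5 * A + 0.01 * rng.standard_normal((200, 1))

X = np.hstack([S, A])
gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(X, S_next.ravel())

# Predictive std marks (state, action) regions the model is unsure about;
# an exploration policy can be steered toward them.
mean, std = gp.predict(np.array([[0.2, -0.4]]), return_std=True)
```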
- Zero-Shot Reinforcement Learning on Graphs for Autonomous Exploration Under Uncertainty [6.42522897323111]
We present a framework for self-learning a high-performance exploration policy in a single simulation environment.
We propose a novel approach that uses graph neural networks in conjunction with deep reinforcement learning.
arXiv Detail & Related papers (2021-05-11T02:42:17Z)
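To make the entry's pairing of graph neural networks with deep reinforcement learning concrete, here is a deliberately tiny, assumed sketch (plain PyTorch, with no claim to match the paper's architecture) of a message-passing network that scores candidate exploration frontiers on a pose graph:

```python
# Hedged sketch: a small message-passing network scoring frontier nodes.
# Architecture and names are assumptions for illustration only.
import torch
import torch.nn as nn

class FrontierScorer(nn.Module):
    def __init__(self, feat_dim: int, hidden: int = 32):
        super().__init__()
        self.msg = nn.Linear(feat_dim, hidden)
        self.upd = nn.Linear(feat_dim + hidden, hidden)
        self.out = nn.Linear(hidden, 1)

    def forward(self, x, adj):
        # x: (N, feat_dim) node features; adj: (N, N) adjacency matrix.
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        m = (adj @ self.msg(x)) / deg        # mean-aggregate neighbor messages
        h = torch.relu(self.upd(torch.cat([x, m], dim=-1)))
        return self.out(h).squeeze(-1)       # one score (e.g. Q-value) per node

# Greedy frontier selection, as a DQN-style exploration policy might do.
x = torch.randn(5, 4)                        # 5 graph nodes, 4 features each
adj = (torch.rand(5, 5) > 0.5).float()
best_frontier = FrontierScorer(4)(x, adj).argmax()
```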
- Learning to Continuously Optimize Wireless Resource in a Dynamic Environment: A Bilevel Optimization Perspective [52.497514255040514]
This work develops a new approach that enables data-driven methods to continuously learn and optimize resource allocation strategies in a dynamic environment.
We propose to build the notion of continual learning into wireless system design, so that the learning model can incrementally adapt to the new episodes.
Our design is based on a novel bilevel optimization formulation which ensures certain "fairness" across different data samples.
arXiv Detail & Related papers (2021-05-03T07:23:39Z)
- Learning to Continuously Optimize Wireless Resource In Episodically Dynamic Environment [55.91291559442884]
This work develops a methodology that enables data-driven methods to continuously learn and optimize in a dynamic environment.
We propose to build the notion of continual learning into the modeling process of learning wireless systems.
Our design is based on a novel min-max formulation which ensures certain "fairness" across different data samples.
arXiv Detail & Related papers (2020-11-16T08:24:34Z)
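The two wireless-resource entries above share a continual-learning design with a min-max notion of "fairness" across data samples. The sketch below is a loose simplification under stated assumptions (a linear stand-in model and a keep-everything memory); the papers' bilevel and min-max formulations are considerably more refined:

```python
# Loose sketch of continual learning with worst-case ("fair") updates:
# keep a memory of past episodes and descend on the maximum loss over old
# and new data, so adapting to a new episode does not sacrifice old ones.
# The linear model and keep-everything memory are simplifying assumptions.
import torch
import torch.nn as nn

model = nn.Linear(8, 4)        # stand-in for a resource-allocation network
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
memory = []                    # naive memory: keeps every past episode

def continual_update(x_new, y_new):
    losses = [loss_fn(model(x_new), y_new)]
    losses += [loss_fn(model(x_old), y_old) for x_old, y_old in memory]
    worst = torch.stack(losses).max()    # min-max: optimize the worst episode
    opt.zero_grad()
    worst.backward()
    opt.step()
    memory.append((x_new.detach(), y_new.detach()))

for episode in range(3):       # a new data distribution arrives each episode
    x, y = torch.randn(32, 8), torch.randn(32, 4)
    continual_update(x, y)
```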
- Guided Uncertainty-Aware Policy Optimization: Combining Learning and Model-Based Strategies for Sample-Efficient Policy Learning [75.56839075060819]
Traditional robotic approaches rely on an accurate model of the environment, a detailed description of how to perform the task, and a robust perception system to keep track of the current state.
Reinforcement learning approaches can operate directly from raw sensory inputs with only a reward signal to describe the task, but they are extremely sample-inefficient and brittle.
In this work, we combine the strengths of model-based methods with the flexibility of learning-based methods to obtain a general method that is able to overcome inaccuracies in the robotics perception/actuation pipeline.
arXiv Detail & Related papers (2020-05-21T19:47:05Z)
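The combination described above can be caricatured as uncertainty-gated control: trust a classical, model-based controller while perception is confident, and defer to the learned policy otherwise. The threshold and callables below are illustrative assumptions, not the paper's interface:

```python
# Hedged sketch: gate between a model-based controller and a learned policy
# based on perception uncertainty. All names and the threshold are assumed.
import numpy as np

UNCERTAINTY_THRESHOLD = 0.05   # assumed tuning parameter

def select_action(state, pose_estimate, pose_std, model_based_ctrl, rl_policy):
    if np.max(pose_std) < UNCERTAINTY_THRESHOLD:
        # Perception is confident: let a classical controller track the pose.
        return model_based_ctrl(state, pose_estimate)
    # Perception is uncertain: the learned policy acts from raw observations.
    return rl_policy(state)
```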
- Sim-to-Real Transfer with Incremental Environment Complexity for Reinforcement Learning of Depth-Based Robot Navigation [1.290382979353427]
A Soft Actor-Critic (SAC) training strategy using incremental environment complexity is proposed to drastically reduce the need for additional training in the real world.
The application addressed is depth-based mapless navigation, where a mobile robot should reach a given waypoint in a cluttered environment with no prior mapping information.
arXiv Detail & Related papers (2020-04-30T10:47:02Z)
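As a rough illustration of the incremental-complexity idea (with hypothetical helpers `make_env` and `sac.train_one_episode`, not the paper's code), a curriculum loop might promote the agent to a harder environment once its success rate clears a threshold:

```python
# Sketch of curriculum training over environment complexity levels.
# make_env, sac.train_one_episode, and all thresholds are assumptions.
SUCCESS_TARGET = 0.8

def curriculum_training(sac, make_env, levels=(0, 1, 2, 3), episodes=500):
    for level in levels:                     # e.g. more clutter per level
        env = make_env(complexity=level)
        successes = 0
        for ep in range(1, episodes + 1):
            successes += sac.train_one_episode(env)  # returns 1 on success
            if ep >= 50 and successes / ep >= SUCCESS_TARGET:
                break                        # promote to the next level early
    return sac
```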
- Counter-example Guided Learning of Bounds on Environment Behavior [11.357397596759172]
We present a data-driven solution that allows a system to be evaluated for specification conformance without an accurate model of the environment.
Our approach involves learning a conservative reactive bound of the environment's behavior using data and specification of the system's desired behavior.
arXiv Detail & Related papers (2020-01-20T19:58:24Z)
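The counter-example guided idea above reduces to a very small core loop: maintain a conservative bound over the environment's observed responses and widen it whenever data refutes it. The one-dimensional interval below is a drastic simplification of the reactive bounds the paper actually learns:

```python
# Rough sketch of counter-example guided bound learning over a 1-D response.
def learn_bound(responses, margin=0.1):
    lo, hi = float("inf"), float("-inf")
    for r in responses:                  # observed environment behavior
        if not (lo <= r <= hi):          # counter-example to the current bound
            lo, hi = min(lo, r), max(hi, r)
    return lo - margin, hi + margin      # pad the interval for conservatism

print(learn_bound([0.2, -0.1, 0.4, 0.05]))  # -> (-0.2, 0.5)
```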