Specification-Guided Data Aggregation for Semantically Aware Imitation Learning
- URL: http://arxiv.org/abs/2303.17010v1
- Date: Wed, 29 Mar 2023 20:29:26 GMT
- Title: Specification-Guided Data Aggregation for Semantically Aware Imitation Learning
- Authors: Ameesh Shah, Jonathan DeCastro, John Gideon, Beyazit Yalcinkaya, Guy Rosman, Sanjit A. Seshia
- Abstract summary: We introduce a novel method for improving imitation-learned models in a semantically aware fashion.
We create a set of formal specifications as a means of partitioning the space of possible environments into semantically similar regions.
We then aggregate expert data on environments in these identified regions, leading to more accurate imitation of the expert's behavior semantics.
- Score: 11.104747861491703
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Advancements in simulation and formal methods-guided environment sampling
have enabled the rigorous evaluation of machine learning models in a number of
safety-critical scenarios, such as autonomous driving. Application of these
environment sampling techniques towards improving the learned models themselves
has yet to be fully exploited. In this work, we introduce a novel method for
improving imitation-learned models in a semantically aware fashion by
leveraging specification-guided sampling techniques as a means of aggregating
expert data in new environments. Specifically, we create a set of formal
specifications as a means of partitioning the space of possible environments
into semantically similar regions, and identify elements of this partition
where our learned imitation behaves most differently from the expert. We then
aggregate expert data on environments in these identified regions, leading to
more accurate imitation of the expert's behavior semantics. We instantiate our
approach in a series of experiments in the CARLA driving simulator, and
demonstrate that our approach leads to models that are more accurate than those
learned with other environment sampling methods.
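The loop described in the abstract (partition the environment space with formal specifications, find the regions where the learned policy diverges most from the expert, and aggregate expert data there) can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: the environment parameters, specifications, and expert/learner policies are all hypothetical stand-ins, and the actual method operates on CARLA driving scenarios with formal specification checkers.

```python
import random

def aggregate_by_specification(specs, sample_env, expert, learner,
                               rounds=3, envs_per_round=20, top_k=1):
    """Toy sketch: iteratively add expert data from the specification
    regions where the learned policy disagrees most with the expert.

    specs      -- dict mapping region name -> predicate over an environment
    sample_env -- callable producing a random environment
    expert     -- callable: environment -> expert action
    learner    -- callable: environment -> learned-policy action
    """
    dataset = []
    for _ in range(rounds):
        envs = [sample_env() for _ in range(envs_per_round)]

        # Partition sampled environments by the first specification they satisfy.
        regions = {name: [] for name in specs}
        for env in envs:
            for name, holds in specs.items():
                if holds(env):
                    regions[name].append(env)
                    break

        # Score each region by the mean expert/learner disagreement on it.
        def disagreement(region_envs):
            if not region_envs:
                return 0.0
            return sum(abs(expert(e) - learner(e)) for e in region_envs) / len(region_envs)

        worst = sorted(regions, key=lambda n: disagreement(regions[n]),
                       reverse=True)[:top_k]

        # Aggregate expert demonstrations on environments from the worst regions.
        for name in worst:
            dataset.extend((e, expert(e)) for e in regions[name])
    return dataset

# Toy usage: a scalar "environment" in [0, 1), two hypothetical specifications,
# and a learner that imitates the expert poorly only on "curvy" roads.
random.seed(0)
specs = {"curvy_road": lambda e: e > 0.5, "straight_road": lambda e: e <= 0.5}
expert = lambda e: e               # expert's action
learner = lambda e: min(e, 0.5)    # learner errs when e > 0.5
data = aggregate_by_specification(specs, random.random, expert, learner)
# Every aggregated example comes from the region where the learner errs.
assert all(env > 0.5 for env, _ in data)
```

The disagreement score above is a stand-in for the paper's semantic comparison of learner and expert behavior within each partition element; any task-appropriate divergence measure could take its place.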
Related papers
- Building surrogate models using trajectories of agents trained by Reinforcement Learning [34.57352474501273]
We propose a novel method to efficiently sample simulated deterministic environments by using policies trained by Reinforcement Learning.
We provide an extensive analysis of these surrogate-building strategies with respect to Latin-Hypercube sampling or Active Learning and Kriging.
We conclude that the proposed method improves the state-of-the-art and clears the path to enable the application of surrogate-aided Reinforcement Learning policy optimization strategies on complex simulators.
arXiv Detail & Related papers (2025-09-01T09:13:45Z) - Poisson Hierarchical Indian Buffet Processes for Within and Across Group Sharing of Latent Features-With Indications for Microbiome Species Sampling Models [10.189801224583679]
We present a comprehensive Bayesian posterior analysis of Poisson Hierarchical Indian Buffet Processes.
This analysis covers a potentially infinite number of species and unknown parameters.
We aim to express our findings in a language that resonates with experts in microbiome and ecological studies.
arXiv Detail & Related papers (2025-02-04T01:27:16Z) - Generate to Discriminate: Expert Routing for Continual Learning [59.71853576559306]
Generate to Discriminate (G2D) is a continual learning method that leverages synthetic data to train a domain-discriminator.
We observe that G2D outperforms competitive domain-incremental learning methods on tasks in both vision and language modalities.
arXiv Detail & Related papers (2024-12-22T13:16:28Z) - Supervised Fine-Tuning as Inverse Reinforcement Learning [8.044033685073003]
The prevailing approach to aligning Large Language Models (LLMs) typically relies on human or AI feedback.
In our work, we question the efficacy of such datasets and explore various scenarios where alignment with expert demonstrations proves more realistic.
arXiv Detail & Related papers (2024-03-18T17:52:57Z) - Learning minimal representations of stochastic processes with variational autoencoders [52.99137594502433]
We introduce an unsupervised machine learning approach to determine the minimal set of parameters required to describe a process.
Our approach enables the autonomous discovery of unknown parameters describing such processes.
arXiv Detail & Related papers (2023-07-21T14:25:06Z) - Inverse Dynamics Pretraining Learns Good Representations for Multitask Imitation [66.86987509942607]
We evaluate how such a paradigm should be done in imitation learning.
We consider a setting where the pretraining corpus consists of multitask demonstrations.
We argue that inverse dynamics modeling is well-suited to this setting.
arXiv Detail & Related papers (2023-05-26T14:40:46Z) - HaDR: Applying Domain Randomization for Generating Synthetic Multimodal Dataset for Hand Instance Segmentation in Cluttered Industrial Environments [0.0]
This study uses domain randomization to generate a synthetic RGB-D dataset for training multimodal instance segmentation models.
We show that our approach enables the models to outperform corresponding models trained on existing state-of-the-art datasets.
arXiv Detail & Related papers (2023-04-12T13:02:08Z) - Model-Based Deep Learning: On the Intersection of Deep Learning and Optimization [101.32332941117271]
Decision making algorithms are used in a multitude of different applications.
Deep learning approaches that use highly parametric architectures tuned from data without relying on mathematical models are becoming increasingly popular.
Model-based optimization and data-centric deep learning are often considered to be distinct disciplines.
arXiv Detail & Related papers (2022-05-05T13:40:08Z) - Robust Learning from Observation with Model Misspecification [33.92371002674386]
Imitation learning (IL) is a popular paradigm for training policies in robotic systems.
We propose a robust IL algorithm to learn policies that can effectively transfer to the real environment without fine-tuning.
arXiv Detail & Related papers (2022-02-12T07:04:06Z) - Domain Curiosity: Learning Efficient Data Collection Strategies for Domain Adaptation [16.539422751949797]
We present domain curiosity -- a method of training exploratory policies that are explicitly optimized to provide data useful for learning environment dynamics.
In contrast to most curiosity methods, our approach explicitly rewards learning, which makes it robust to environment noise.
We evaluate the proposed method by comparing how much a model can learn about environment dynamics given data collected by the proposed approach.
arXiv Detail & Related papers (2021-03-12T12:02:11Z) - A User's Guide to Calibrating Robotics Simulators [54.85241102329546]
This paper proposes a set of benchmarks and a framework for the study of various algorithms aimed to transfer models and policies learnt in simulation to the real world.
We conduct experiments on a wide range of well known simulated environments to characterize and offer insights into the performance of different algorithms.
Our analysis can be useful for practitioners working in this area and can help make informed choices about the behavior and main properties of sim-to-real algorithms.
arXiv Detail & Related papers (2020-11-17T22:24:26Z) - MeLIME: Meaningful Local Explanation for Machine Learning Models [2.819725769698229]
We show that our approach, MeLIME, produces more meaningful explanations compared to other techniques over different ML models.
MeLIME generalizes the LIME method, allowing more flexible perturbation sampling and the use of different local interpretable models.
arXiv Detail & Related papers (2020-09-12T16:06:58Z) - Evaluating the Disentanglement of Deep Generative Models through Manifold Topology [66.06153115971732]
We present a method for quantifying disentanglement that only uses the generative model.
We empirically evaluate several state-of-the-art models across multiple datasets.
arXiv Detail & Related papers (2020-06-05T20:54:11Z) - Guided Uncertainty-Aware Policy Optimization: Combining Learning and Model-Based Strategies for Sample-Efficient Policy Learning [75.56839075060819]
Traditional robotic approaches rely on an accurate model of the environment, a detailed description of how to perform the task, and a robust perception system to keep track of the current state.
Reinforcement learning approaches can operate directly from raw sensory inputs with only a reward signal to describe the task, but are extremely sample-inefficient and brittle.
In this work, we combine the strengths of model-based methods with the flexibility of learning-based methods to obtain a general method that is able to overcome inaccuracies in the robotics perception/actuation pipeline.
arXiv Detail & Related papers (2020-05-21T19:47:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.