An Adversarial Objective for Scalable Exploration
- URL: http://arxiv.org/abs/2003.06082v4
- Date: Wed, 11 Nov 2020 18:39:43 GMT
- Title: An Adversarial Objective for Scalable Exploration
- Authors: Bernadette Bucher, Karl Schmeckpeper, Nikolai Matni, Kostas Daniilidis
- Abstract summary: Model-based curiosity combines active learning approaches to optimal sampling with the information gain based incentives for exploration.
Existing model-based curiosity methods look to approximate prediction uncertainty with approaches which struggle to scale to many prediction-planning pipelines.
We address these scalability issues with an adversarial curiosity method minimizing a score given by a discriminator network.
- Score: 39.482557864395005
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model-based curiosity combines active learning approaches to optimal sampling
with the information gain based incentives for exploration presented in the
curiosity literature. Existing model-based curiosity methods look to
approximate prediction uncertainty with approaches which struggle to scale to
many prediction-planning pipelines used in robotics tasks. We address these
scalability issues with an adversarial curiosity method minimizing a score
given by a discriminator network. This discriminator is optimized jointly with
a prediction model and enables our active learning approach to sample sequences
of observations and actions which result in predictions considered the least
realistic by the discriminator. In simulated environments, the advantage of our
adversarial curiosity approach over leading model-based exploration strategies
grows progressively as compute is restricted. We
further demonstrate the ability of our adversarial curiosity method to scale to
a robotic manipulation prediction-planning pipeline where we improve sample
efficiency and prediction performance for a domain transfer problem.
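
The selection rule described in the abstract (choose the action sequence whose predicted outcome the discriminator considers least realistic) can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_predict` and `toy_discriminator` are hypothetical hand-written stand-ins for the jointly trained prediction model and discriminator network, and the scalar state, the candidate list, and the function name `select_curious_actions` are assumptions made only for illustration.

```python
from typing import Callable, List, Sequence

State = float   # assumption: a scalar state keeps the sketch small
Action = float  # assumption: scalar actions


def toy_predict(state: State, actions: Sequence[Action]) -> List[State]:
    """Hypothetical prediction model: each action shifts the state additively."""
    rollout = []
    for action in actions:
        state = state + action
        rollout.append(state)
    return rollout


def toy_discriminator(rollout: Sequence[State]) -> float:
    """Hypothetical realism score in (0, 1]: states near 0 count as familiar."""
    return sum(1.0 / (1.0 + abs(s)) for s in rollout) / len(rollout)


def select_curious_actions(
    state: State,
    candidates: Sequence[Sequence[Action]],
    predict: Callable[[State, Sequence[Action]], List[State]],
    discriminator: Callable[[Sequence[State]], float],
) -> Sequence[Action]:
    """Return the candidate whose predicted rollout has the lowest realism score."""
    best_actions = candidates[0]
    best_score = float("inf")
    for actions in candidates:
        score = discriminator(predict(state, actions))
        if score < best_score:  # minimize the discriminator score
            best_actions, best_score = actions, score
    return best_actions


candidates = [[0.1, 0.1], [1.0, 1.0], [-0.2, 0.3]]
chosen = select_curious_actions(0.0, candidates, toy_predict, toy_discriminator)
print(chosen)
```

In this toy setup the sequence that moves the state furthest from the familiar region receives the lowest realism score and is the one selected for execution. In the method summarized above, the data collected this way would then be used to update both the prediction model and the discriminator.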
Related papers
- Motion Forecasting via Model-Based Risk Minimization [8.766024024417316]
We propose a novel sampling method applicable to trajectory prediction based on the predictions of multiple models.
We first show that conventional sampling based on predicted probabilities can degrade performance due to missing alignment between models.
By using state-of-the-art models as base learners, our approach constructs diverse and effective ensembles for optimal trajectory sampling.
arXiv Detail & Related papers (2024-09-16T09:03:28Z)
- Model-Free Active Exploration in Reinforcement Learning [53.786439742572995]
We study the problem of exploration in Reinforcement Learning and present a novel model-free solution.
Our strategy is able to identify efficient policies faster than state-of-the-art exploration approaches.
arXiv Detail & Related papers (2024-06-30T19:00:49Z)
- Uncovering the human motion pattern: Pattern Memory-based Diffusion Model for Trajectory Prediction [45.77348842004666]
Motion Pattern Priors Memory Network is a memory-based method to uncover latent motion patterns in human behavior.
We introduce an addressing mechanism to retrieve the matched pattern and the potential target distributions for each prediction from the memory bank.
Experiments validate the effectiveness of our approach, achieving state-of-the-art trajectory prediction accuracy.
arXiv Detail & Related papers (2024-01-05T17:39:52Z)
- Automated Deception Detection from Videos: Using End-to-End Learning Based High-Level Features and Classification Approaches [0.0]
We propose a multimodal approach combining deep learning and discriminative models for deception detection.
We employ convolutional end-to-end learning to analyze gaze, head pose, and facial expressions.
Our approach is evaluated on five datasets, including a new Rolling-Dice Experiment motivated by economic factors.
arXiv Detail & Related papers (2023-07-13T08:45:15Z)
- Bayesian Graph Contrastive Learning [55.36652660268726]
We propose a novel perspective on graph contrastive learning methods, showing that random augmentations lead to stochastic encoders.
Our proposed method represents each node by a distribution in the latent space in contrast to existing techniques which embed each node to a deterministic vector.
We show a considerable improvement in performance compared to existing state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2021-12-15T01:45:32Z)
- HYPER: Learned Hybrid Trajectory Prediction via Factored Inference and Adaptive Sampling [27.194900145235007]
We introduce HYPER, a general and expressive hybrid prediction framework.
By modeling traffic agents as a hybrid discrete-continuous system, our approach is capable of predicting discrete intent changes over time.
We train and validate our model on the Argoverse dataset, and demonstrate its effectiveness through comprehensive ablation studies and comparisons with state-of-the-art models.
arXiv Detail & Related papers (2021-10-05T20:20:10Z)
- Deceptive Decision-Making Under Uncertainty [25.197098169762356]
We study the design of autonomous agents that are capable of deceiving outside observers about their intentions while carrying out tasks.
By modeling the agent's behavior as a Markov decision process, we consider a setting where the agent aims to reach one of multiple potential goals.
We propose a novel approach to model observer predictions based on the principle of maximum entropy and to efficiently generate deceptive strategies.
arXiv Detail & Related papers (2021-09-14T14:56:23Z)
- Learning Bias-Invariant Representation by Cross-Sample Mutual Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task.
The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator.
We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-11T21:17:02Z)
- Probabilistic Human Motion Prediction via A Bayesian Neural Network [71.16277790708529]
We propose a probabilistic model for human motion prediction in this paper.
Our model can generate several future motions when given an observed motion sequence.
We extensively validate our approach on the large-scale benchmark dataset Human3.6M.
arXiv Detail & Related papers (2021-07-14T09:05:33Z)
- Counterfactual Predictions under Runtime Confounding [74.90756694584839]
We study the counterfactual prediction task in the setting where all relevant factors are captured in the historical data.
We propose a doubly-robust procedure for learning counterfactual prediction models in this setting.
arXiv Detail & Related papers (2020-06-30T15:49:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.