Related papers: An Adversarial Objective for Scalable Exploration

URL: http://arxiv.org/abs/2003.06082v4
Date: Wed, 11 Nov 2020 18:39:43 GMT
Title: An Adversarial Objective for Scalable Exploration
Authors: Bernadette Bucher, Karl Schmeckpeper, Nikolai Matni, Kostas Daniilidis
Abstract summary: Model-based curiosity combines active learning approaches to optimal sampling with the information gain based incentives for exploration. Existing model-based curiosity methods look to approximate prediction uncertainty with approaches which struggle to scale to many prediction-planning pipelines. We address these scalability issues with an adversarial curiosity method minimizing a score given by a discriminator network.
Score: 39.482557864395005
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Model-based curiosity combines active learning approaches to optimal sampling with the information gain based incentives for exploration presented in the curiosity literature. Existing model-based curiosity methods look to approximate prediction uncertainty with approaches which struggle to scale to many prediction-planning pipelines used in robotics tasks. We address these scalability issues with an adversarial curiosity method minimizing a score given by a discriminator network. This discriminator is optimized jointly with a prediction model and enables our active learning approach to sample sequences of observations and actions which result in predictions considered the least realistic by the discriminator. In simulated environments, we demonstrate that the advantages of our adversarial curiosity approach over leading model-based exploration strategies increase progressively as compute is restricted. We further demonstrate the ability of our adversarial curiosity method to scale to a robotic manipulation prediction-planning pipeline where we improve sample efficiency and prediction performance for a domain transfer problem.
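The selection rule described in the abstract (choose the action sequence whose predicted outcome the discriminator considers least realistic) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the scalar observations, and the toy prediction model and discriminator below are all hypothetical stand-ins, and the joint training of the discriminator and prediction model is not shown.

```python
from typing import Callable, List, Sequence

# Hypothetical types: scalar observations and actions keep the sketch small.
Observation = float
Action = float


def rollout(
    predict: Callable[[Observation, Action], Observation],
    obs: Observation,
    actions: Sequence[Action],
) -> List[Observation]:
    """Predict the observations that follow from applying `actions` to `obs`."""
    predictions = []
    for action in actions:
        obs = predict(obs, action)
        predictions.append(obs)
    return predictions


def select_curious_actions(
    obs: Observation,
    candidates: Sequence[Sequence[Action]],
    predict: Callable[[Observation, Action], Observation],
    discriminate: Callable[[Sequence[Observation]], float],
) -> Sequence[Action]:
    """Return the candidate whose predicted rollout gets the lowest realism score."""
    return min(
        candidates,
        key=lambda actions: discriminate(rollout(predict, obs, actions)),
    )


# Toy stand-ins (assumptions for illustration only).
def toy_predict(obs: Observation, action: Action) -> Observation:
    # Pretend dynamics: the action is simply added to the observation.
    return obs + action


def toy_discriminate(predictions: Sequence[Observation]) -> float:
    # Pretend realism score: predictions far from 0 look less realistic.
    return 1.0 / (1.0 + max(abs(p) for p in predictions))


candidates = [[0.1, 0.1], [1.0, 1.0], [-0.2, 0.3]]
chosen = select_curious_actions(0.0, candidates, toy_predict, toy_discriminate)
print(chosen)
```

In this toy setup the second candidate drives the predicted observation furthest from the region the stand-in discriminator treats as realistic, so it is the one selected for exploration.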

Related papers

Towards Locally Explaining Prediction Behavior via Gradual Interventions and Measuring Property Gradients [9.961090778082285]
Deep learning models achieve high predictive performance but lack intrinsic interpretability. We introduce a novel framework for local interventional explanations by leveraging recent advances in image-to-image editing models. Our approach performs gradual interventions on semantic properties to quantify the corresponding impact on a model's predictions.
arXiv Detail & Related papers (2025-03-07T13:50:37Z)
Microfoundation Inference for Strategic Prediction [26.277259491014163]
We propose a methodology for learning the distribution map that encapsulates the long-term impacts of predictive models on the population. Specifically, we model agents' responses as a cost-utility problem and propose estimates for said cost. We provide a rate of convergence for this proposed estimate and assess its quality through empirical demonstrations on a credit-scoring dataset.
arXiv Detail & Related papers (2024-11-13T19:37:49Z)
Motion Forecasting via Model-Based Risk Minimization [8.766024024417316]
We propose a novel sampling method applicable to trajectory prediction based on the predictions of multiple models. We first show that conventional sampling based on predicted probabilities can degrade performance due to missing alignment between models. By using state-of-the-art models as base learners, our approach constructs diverse and effective ensembles for optimal trajectory sampling.
arXiv Detail & Related papers (2024-09-16T09:03:28Z)
Model-Free Active Exploration in Reinforcement Learning [53.786439742572995]
We study the problem of exploration in Reinforcement Learning and present a novel model-free solution. Our strategy is able to identify efficient policies faster than state-of-the-art exploration approaches.
arXiv Detail & Related papers (2024-06-30T19:00:49Z)
Automated Deception Detection from Videos: Using End-to-End Learning Based High-Level Features and Classification Approaches [0.0]
We propose a multimodal approach combining deep learning and discriminative models for deception detection. We employ convolutional end-to-end learning to analyze gaze, head pose, and facial expressions. Our approach is evaluated on five datasets, including a new Rolling-Dice Experiment motivated by economic factors.
arXiv Detail & Related papers (2023-07-13T08:45:15Z)
Bayesian Graph Contrastive Learning [55.36652660268726]
We propose a novel perspective of graph contrastive learning methods showing that random augmentations lead to stochastic encoders. Our proposed method represents each node by a distribution in the latent space, in contrast to existing techniques which embed each node to a deterministic vector. We show a considerable improvement in performance compared to existing state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2021-12-15T01:45:32Z)
HYPER: Learned Hybrid Trajectory Prediction via Factored Inference and Adaptive Sampling [27.194900145235007]
We introduce HYPER, a general and expressive hybrid prediction framework. By modeling traffic agents as a hybrid discrete-continuous system, our approach is capable of predicting discrete intent changes over time. We train and validate our model on the Argoverse dataset, and demonstrate its effectiveness through comprehensive ablation studies and comparisons with state-of-the-art models.
arXiv Detail & Related papers (2021-10-05T20:20:10Z)
Deceptive Decision-Making Under Uncertainty [25.197098169762356]
We study the design of autonomous agents that are capable of deceiving outside observers about their intentions while carrying out tasks. By modeling the agent's behavior as a Markov decision process, we consider a setting where the agent aims to reach one of multiple potential goals. We propose a novel approach to model observer predictions based on the principle of maximum entropy and to efficiently generate deceptive strategies.
arXiv Detail & Related papers (2021-09-14T14:56:23Z)
Learning Bias-Invariant Representation by Cross-Sample Mutual Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task. The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator. We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-11T21:17:02Z)
Probabilistic Human Motion Prediction via A Bayesian Neural Network [71.16277790708529]
We propose a probabilistic model for human motion prediction. Our model can generate several future motions when given an observed motion sequence. We extensively validate our approach on the large-scale benchmark dataset Human3.6m.
arXiv Detail & Related papers (2021-07-14T09:05:33Z)
Counterfactual Predictions under Runtime Confounding [74.90756694584839]
We study the counterfactual prediction task in the setting where all relevant factors are captured in the historical data. We propose a doubly-robust procedure for learning counterfactual prediction models in this setting.
arXiv Detail & Related papers (2020-06-30T15:49:05Z)

This list is automatically generated from the titles and abstracts of the papers on this site.