GalilAI: Out-of-Task Distribution Detection using Causal Active
Experimentation for Safe Transfer RL
- URL: http://arxiv.org/abs/2110.15489v1
- Date: Fri, 29 Oct 2021 01:45:56 GMT
- Title: GalilAI: Out-of-Task Distribution Detection using Causal Active
Experimentation for Safe Transfer RL
- Authors: Sumedh A Sontakke, Stephen Iota, Zizhao Hu, Arash Mehrjou, Laurent
Itti, Bernhard Schölkopf
- Abstract summary: Out-of-distribution (OOD) detection is a well-studied topic in supervised learning.
We propose a novel task: that of Out-of-Task Distribution (OOTD) detection.
We name our method GalilAI, in honor of Galileo Galilei, as it discovers, among other causal processes, that gravitational acceleration is independent of the mass of a body.
- Score: 11.058960131490903
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Out-of-distribution (OOD) detection is a well-studied topic in supervised
learning. Extending the successes in supervised learning methods to the
reinforcement learning (RL) setting, however, is difficult due to the data
generating process - RL agents actively query their environment for data, and
the data are a function of the policy followed by the agent. An agent could
thus neglect a shift in the environment if its policy did not lead it to
explore the aspect of the environment that shifted. Therefore, to achieve safe
and robust generalization in RL, there exists an unmet need for OOD detection
through active experimentation. Here, we attempt to bridge this lacuna by first
defining a causal framework for OOD scenarios or environments encountered by RL
agents in the wild. Then, we propose a novel task: that of Out-of-Task
Distribution (OOTD) detection. We introduce an RL agent that actively
experiments in a test environment and subsequently concludes whether it is OOTD
or not. We name our method GalilAI, in honor of Galileo Galilei, as it
discovers, among other causal processes, that gravitational acceleration is
independent of the mass of a body. Finally, we propose a simple probabilistic
neural network baseline for comparison, which extends extant Model-Based RL. We
find that GalilAI outperforms the baseline significantly. See visualizations of
our method at https://galil-ai.github.io/.
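To make the active-experimentation loop concrete, here is a minimal sketch of the idea the abstract describes: fit a transition model on in-distribution experience, probe a test environment with informative actions, and flag the task as out-of-task-distribution when the observed outcomes are unlikely under the model. The toy dynamics, the per-action Gaussian model, and the probe-selection rule are all illustrative stand-ins, not the paper's causal framework or its probabilistic neural network baseline.

```python
# A minimal sketch of OOTD detection via active experimentation on a toy
# 1-D task. Everything here (environment, model, threshold-free score) is
# a hypothetical stand-in for GalilAI's procedure.
import numpy as np

rng = np.random.default_rng(0)

def step(x, a, gravity=9.8):
    """Toy transition: the outcome depends on the action and a latent parameter."""
    return x + 0.1 * a - 0.01 * gravity + rng.normal(0.0, 0.05)

# 1. Fit a Gaussian transition model per action from in-distribution rollouts.
actions = [-1.0, 0.0, 1.0]
model = {}
for a in actions:
    outcomes = [step(0.0, a) for _ in range(500)]
    model[a] = (np.mean(outcomes), np.std(outcomes))

# 2. Actively probe the test environment with the action whose predicted
#    outcome is most uncertain, then score the observations under the model.
def ootd_score(test_gravity, n_probes=50):
    log_liks = []
    for _ in range(n_probes):
        a = max(actions, key=lambda a: model[a][1])  # most informative probe
        obs = step(0.0, a, gravity=test_gravity)
        mu, sd = model[a]
        log_liks.append(-0.5 * ((obs - mu) / sd) ** 2 - np.log(sd))
    return -np.mean(log_liks)  # high score => observations unlikely => OOTD

print("in-task score :", ootd_score(9.8))
print("shifted score :", ootd_score(3.7))  # e.g. Martian gravity
```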
Related papers
- Knowledge Graph Reasoning with Self-supervised Reinforcement Learning [30.359557545737747] (arXiv 2024-05-22)
We propose a self-supervised pre-training method to warm up the policy network before the RL training stage.
In our supervised learning stage, the agent selects actions based on the policy network and learns from generated labels.
We show that our SSRL model meets or exceeds current state-of-the-art results on all Hits@k and mean reciprocal rank (MRR) metrics.
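As a rough illustration of the warm-up idea in this entry, the sketch below pre-trains a tiny softmax policy on generated labels before any RL updates would begin. The network, the label-generation rule, and the learning rate are hypothetical, not taken from the paper.

```python
# Hypothetical sketch: supervised warm-up of a policy network before RL.
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(0, 0.1, size=(4, 3))  # 4-dim states -> 3 actions

def policy(s):
    z = s @ W
    e = np.exp(z - z.max())
    return e / e.sum()

# Supervised stage: labels come from some cheap generated source.
for _ in range(200):
    s = rng.normal(size=4)
    label = int(s[0] > 0)                      # stand-in for "generated labels"
    p = policy(s)
    grad = np.outer(s, p - np.eye(3)[label])   # cross-entropy gradient
    W -= 0.1 * grad

# The RL stage would now start from this warmed-up W instead of random weights.
print("warm-up done; policy(s) =", policy(np.ones(4)).round(3))
```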
- Leveraging Reward Consistency for Interpretable Feature Discovery in Reinforcement Learning [69.19840497497503] (arXiv 2023-09-04)
We argue that the commonly used action-matching principle yields explanations of the underlying deep neural networks (DNNs) rather than of the RL agent itself.
We propose instead to consider rewards, the essential objective of RL agents, as the objective of interpretation as well.
We verify and evaluate our method on the Atari 2600 games as well as Duckietown, a challenging self-driving car simulator environment.
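A loose reading of this idea, sketched under strong simplifications: score each input feature by how much masking it changes a learned reward estimate, so that explanations track the agent's objective rather than raw action agreement. The toy reward_model and the zero-masking scheme are illustrative stand-ins, not the paper's method.

```python
# Hypothetical reward-consistency feature scoring.
import numpy as np

def reward_model(x):
    return 2.0 * x[0] - 0.5 * x[2]   # toy stand-in for a learned reward/Q model

x = np.array([1.0, 3.0, -2.0, 0.5])
base = reward_model(x)
importance = []
for i in range(len(x)):
    masked = x.copy()
    masked[i] = 0.0                  # mask one feature at a time
    importance.append(abs(reward_model(masked) - base))
print("reward-consistent feature importance:", importance)
```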
- Testing of Deep Reinforcement Learning Agents with Surrogate Models [10.243488468625786] (arXiv 2023-05-22)
Deep Reinforcement Learning (DRL) has received a lot of attention from the research community in recent years.
In this paper, we propose a search-based approach to test such agents.
- Train Hard, Fight Easy: Robust Meta Reinforcement Learning [78.16589993684698] (arXiv 2023-01-26)
A major challenge of reinforcement learning (RL) in real-world applications is the variation between environments, tasks or clients.
Standard meta-RL (MRL) methods optimize the average return over tasks, but often suffer from poor results on tasks of high risk or difficulty.
In this work, we define a robust MRL objective with a controlled robustness level.
The resulting data inefficiency is addressed via the novel Robust Meta RL algorithm (RoML).
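One way to read the robust objective above, sketched here on the assumption that it behaves like a CVaR-style tail measure over task returns: optimize the mean of the worst alpha-fraction of tasks instead of the overall mean. The returns and alpha below are invented for illustration; RoML's actual mechanism (task over-sampling) is richer than this.

```python
# Hypothetical CVaR-style robust objective over task returns.
import numpy as np

def cvar(returns, alpha=0.2):
    k = max(1, int(alpha * len(returns)))
    worst = np.sort(returns)[:k]     # the alpha-tail of worst task returns
    return worst.mean()

task_returns = np.array([9.0, 8.5, 7.9, 2.1, 1.5])
print("average objective:", task_returns.mean())
print("robust objective :", cvar(task_returns, alpha=0.4))
```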
- CostNet: An End-to-End Framework for Goal-Directed Reinforcement Learning [9.432068833600884] (arXiv 2022-10-03)
Reinforcement Learning (RL) is a general framework concerned with an agent that seeks to maximize rewards in an environment.
Two approaches, model-based and model-free reinforcement learning, have shown concrete results in several disciplines.
This paper introduces a novel reinforcement learning algorithm for predicting the distance between two states in a Markov Decision Process.
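A minimal stand-in for the distance-prediction idea: estimate the number of steps separating pairs of states from trajectory data, here with a tabular running mean on a toy chain MDP rather than the paper's neural network.

```python
# Hypothetical tabular version of state-to-state distance estimation.
import numpy as np

rng = np.random.default_rng(2)
n_states = 10
dist = np.zeros((n_states, n_states))    # running mean of observed step gaps
count = np.zeros((n_states, n_states))

for _ in range(300):                      # random walks on a chain MDP
    s, traj = 5, [5]
    for _ in range(40):
        s = min(max(s + rng.choice([-1, 1]), 0), n_states - 1)
        traj.append(s)
    for i in range(len(traj)):
        for j in range(i + 1, len(traj)):
            a, b = traj[i], traj[j]
            count[a, b] += 1
            dist[a, b] += (j - i - dist[a, b]) / count[a, b]

print("estimated steps from state 4 to state 7:", round(dist[4, 7], 1))
```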
- Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494] (arXiv 2022-09-24)
Reinforcement learning algorithms can succeed at many tasks but require large amounts of interaction between the agent and the environment.
We propose a new method that uses unsupervised model-based RL to pre-train the agent.
We show robust performance on the Real-World RL benchmark, hinting at resiliency to environment perturbations during adaptation.
- Retrieval-Augmented Reinforcement Learning [63.32076191982944] (arXiv 2022-02-17)
We train a network to map a dataset of past experiences to optimal behavior.
The retrieval process is trained to retrieve information from the dataset that may be useful in the current context.
We show that retrieval-augmented R2D2 learns significantly faster than the baseline R2D2 agent and achieves higher scores.
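A bare-bones sketch of the retrieval idea: look up past experiences nearest to the current state and let them inform the action. The L2 lookup and the copy-the-neighbors rule below are illustrative stand-ins for the learned retrieval process in the paper.

```python
# Hypothetical nearest-neighbor retrieval over a replay dataset.
import numpy as np

rng = np.random.default_rng(3)
past_states = rng.normal(size=(1000, 4))        # replay dataset of states
past_actions = (past_states[:, 0] > 0).astype(int)

def retrieve_act(state, k=5):
    d = np.linalg.norm(past_states - state, axis=1)
    neighbors = np.argsort(d)[:k]               # k nearest past experiences
    return int(past_actions[neighbors].mean() > 0.5)

print("action for s:", retrieve_act(np.array([0.9, 0.0, 0.1, -0.2])))
```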
- A Validation Tool for Designing Reinforcement Learning Environments [0.0] (arXiv 2021-12-10)
This study proposes a Markov-based feature analysis method to validate whether an MDP is well formulated.
We believe an MDP suitable for applying RL should contain a set of state features that are both sensitive to actions and predictive of rewards.
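A hedged sketch of that feature check: a state feature is useful if actions move it (sensitivity) and if it carries reward signal (predictiveness). The correlation tests below are simple stand-ins for the paper's Markov-based analysis.

```python
# Hypothetical sensitivity/predictiveness test for state features.
import numpy as np

rng = np.random.default_rng(4)
n = 2000
actions = rng.choice([0, 1], size=n)
good_feat = np.cumsum(actions * 2 - 1) + rng.normal(0, 0.1, n)   # action-driven
noise_feat = rng.normal(size=n)                                  # irrelevant
reward = good_feat + rng.normal(0, 0.5, n)

for name, f in [("good", good_feat), ("noise", noise_feat)]:
    sens = abs(np.corrcoef(np.diff(f), actions[1:])[0, 1])   # action sensitivity
    pred = abs(np.corrcoef(f, reward)[0, 1])                 # reward predictiveness
    print(f"{name}: sensitivity={sens:.2f} predictiveness={pred:.2f}")
```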
- Explore and Control with Adversarial Surprise [78.41972292110967] (arXiv 2021-07-12)
Reinforcement learning (RL) provides a framework for learning goal-directed policies given user-specified rewards.
We propose a new unsupervised RL technique based on an adversarial game which pits two policies against each other to compete over the amount of surprise an RL agent experiences.
We show that our method leads to the emergence of complex skills by exhibiting clear phase transitions.
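The two-player objective can be caricatured as below: an Explorer acts to raise the surprise (negative log-likelihood under a running state model) that a Controller then tries to drive back down. The alternating argmax/argmin stands in for two learned policies and is purely illustrative.

```python
# Hypothetical toy version of the adversarial-surprise game.
import numpy as np

rng = np.random.default_rng(5)
counts = np.ones(6)                       # running visit counts = state density model

def surprise(s):
    p = counts / counts.sum()
    return -np.log(p[s])                  # negative log-likelihood of the state

for episode in range(50):
    # Explorer phase: seek the state the model finds most surprising
    s = int(np.argmax([surprise(i) for i in range(6)]))
    counts[s] += 1
    # Controller phase: retreat to the most familiar (least surprising) state
    s = int(np.argmin([surprise(i) for i in range(6)]))
    counts[s] += 1

print("visit counts after the two-player game:", counts)
```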
- Robust Deep Reinforcement Learning through Adversarial Loss [74.20501663956604] (arXiv 2020-08-05)
Recent studies have shown that deep reinforcement learning agents are vulnerable to small adversarial perturbations on the agent's inputs.
We propose RADIAL-RL, a principled framework to train reinforcement learning agents with improved robustness against adversarial attacks.
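A loose sketch of training against an adversarial loss: evaluate a worst-case input perturbation and minimize the loss there. The FGSM-style attack and the toy regression loss below are stand-ins; RADIAL-RL itself uses certified output bounds rather than sampled attacks.

```python
# Hypothetical adversarial-loss training loop on a toy objective.
import numpy as np

rng = np.random.default_rng(6)
w = rng.normal(size=3)

def loss(w, x):                      # toy regression loss standing in for an RL loss
    return (x @ w - 1.0) ** 2

eps, lr = 0.1, 0.05
for _ in range(200):
    x = rng.normal(size=3)
    grad_x = 2 * (x @ w - 1.0) * w           # gradient of the loss w.r.t. the input
    x_adv = x + eps * np.sign(grad_x)        # FGSM-like perturbation in an eps-ball
    grad_w = 2 * (x_adv @ w - 1.0) * x_adv   # descend on the worst-case loss
    w -= lr * grad_w

print("robust loss on clean input:", loss(w, np.ones(3)))
```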