Goal Recognition as Reinforcement Learning
- URL: http://arxiv.org/abs/2202.06356v1
- Date: Sun, 13 Feb 2022 16:16:43 GMT
- Title: Goal Recognition as Reinforcement Learning
- Authors: Leonardo Rosa Amado and Reuth Mirsky and Felipe Meneguzzi
- Abstract summary: We develop a framework that combines model-free reinforcement learning and goal recognition.
This framework consists of two main stages: Offline learning of policies or utility functions for each potential goal, and online inference.
The resulting instantiation achieves state-of-the-art performance against goal recognizers on standard evaluation domains and superior performance in noisy environments.
- Score: 20.651718821998106
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most approaches for goal recognition rely on specifications of the possible
dynamics of the actor in the environment when pursuing a goal. These
specifications suffer from two key issues. First, encoding these dynamics
requires careful design by a domain expert, which is often not robust to noise
at recognition time. Second, existing approaches often need costly real-time
computations to reason about the likelihood of each potential goal. In this
paper, we develop a framework that combines model-free reinforcement learning
and goal recognition to alleviate the need for careful, manual domain design,
and the need for costly online executions. This framework consists of two main
stages: Offline learning of policies or utility functions for each potential
goal, and online inference. We provide a first instance of this framework using
tabular Q-learning for the learning stage, as well as three measures that can
be used to perform the inference stage. The resulting instantiation achieves
state-of-the-art performance against goal recognizers on standard evaluation
domains and superior performance in noisy environments.
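To make the two-stage framework concrete, here is a minimal sketch in Python of the idea described in the abstract: offline tabular Q-learning of one Q-table per candidate goal, followed by online inference that ranks goals against the observed state-action pairs. The environment interface (`reset()`, `step(action, goal)`, `sample_action()`, `n_actions`) is an assumption for illustration, and the Boltzmann likelihood used for inference is one plausible measure, not necessarily the exact measures proposed in the paper.

```python
import numpy as np
from collections import defaultdict

def q_learning(env, goal, episodes=500, alpha=0.1, gamma=0.99, eps=0.1):
    """Offline stage: learn a tabular Q-function for a single candidate goal.
    The environment interface here is assumed, not taken from the paper."""
    Q = defaultdict(lambda: np.zeros(env.n_actions))
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # epsilon-greedy exploration
            a = (env.sample_action() if np.random.rand() < eps
                 else int(np.argmax(Q[s])))
            s_next, r, done = env.step(a, goal)  # reward defined w.r.t. this goal
            Q[s][a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s][a])
            s = s_next
    return Q

def infer_goal(observations, q_tables, beta=1.0):
    """Online stage: rank candidate goals by how well the observed
    (state, action) pairs match each goal's learned Q-function,
    using a Boltzmann likelihood (an illustrative choice of measure)."""
    scores = {}
    for goal, Q in q_tables.items():
        log_likelihood = 0.0
        for s, a in observations:
            prefs = beta * Q[s]
            prefs = prefs - np.max(prefs)  # numerical stability
            log_likelihood += prefs[a] - np.log(np.sum(np.exp(prefs)))
        scores[goal] = log_likelihood
    return max(scores, key=scores.get)

# Usage sketch: learn one Q-table per candidate goal offline, then infer online.
# q_tables = {g: q_learning(env, g) for g in candidate_goals}
# predicted_goal = infer_goal(observed_state_action_pairs, q_tables)
```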
Related papers
- Real-time goal recognition using approximations in Euclidean space [10.003540430416091]
We develop an efficient method for goal recognition that relies either on a single call to the planner for each possible goal in discrete domains or on a simplified motion model that reduces the computational burden in continuous ones.
The resulting approach performs the online component of recognition orders of magnitude faster than the current state of the art, making it the first online method effectively usable for robotics applications that require sub-second recognition.
arXiv Detail & Related papers (2023-07-15T19:27:38Z)
- Learning Goal-Conditioned Policies Offline with Self-Supervised Reward Shaping [94.89128390954572]
We propose a novel self-supervised learning phase on the pre-collected dataset to understand the structure and the dynamics of the model.
We evaluate our method on three continuous control tasks, and show that our model significantly outperforms existing approaches.
arXiv Detail & Related papers (2023-01-05T15:07:10Z)
- Discrete Factorial Representations as an Abstraction for Goal Conditioned Reinforcement Learning [99.38163119531745]
We show that applying a discretizing bottleneck can improve performance in goal-conditioned RL setups.
We experimentally show improved expected return on out-of-distribution goals, while still allowing for specifying goals with expressive structure.
arXiv Detail & Related papers (2022-11-01T03:31:43Z)
- Goal-Conditioned Q-Learning as Knowledge Distillation [136.79415677706612]
We explore a connection between off-policy reinforcement learning in goal-conditioned settings and knowledge distillation.
We empirically show that this can improve the performance of goal-conditioned off-policy reinforcement learning when the space of goals is high-dimensional.
We also show that this technique can be adapted to allow for efficient learning in the case of multiple simultaneous sparse goals.
arXiv Detail & Related papers (2022-08-28T22:01:10Z)
- Planning to Practice: Efficient Online Fine-Tuning by Composing Goals in Latent Space [76.46113138484947]
General-purpose robots require diverse repertoires of behaviors to complete challenging tasks in real-world unstructured environments.
To address this issue, goal-conditioned reinforcement learning aims to acquire policies that can reach goals for a wide range of tasks on command.
We propose Planning to Practice, a method that makes it practical to train goal-conditioned policies for long-horizon tasks.
arXiv Detail & Related papers (2022-05-17T06:58:17Z)
- C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks [133.40619754674066]
Goal-conditioned reinforcement learning can solve tasks in a wide range of domains, including navigation and manipulation.
We propose to solve distant goal-reaching tasks by using search at training time to automatically generate intermediate states.
The E-step corresponds to planning an optimal sequence of waypoints using graph search, while the M-step aims to learn a goal-conditioned policy to reach those waypoints.
arXiv Detail & Related papers (2021-10-22T22:05:31Z)
- Automatic Curriculum Learning through Value Disagreement [95.19299356298876]
Continually solving new, unsolved tasks is the key to learning diverse behaviors.
In the multi-task domain, where an agent needs to reach multiple goals, the choice of training goals can largely affect sample efficiency.
We propose setting up an automatic curriculum for goals that the agent needs to solve.
We evaluate our method across 13 multi-goal robotic tasks and 5 navigation tasks, and demonstrate performance gains over current state-of-the-art methods.
arXiv Detail & Related papers (2020-06-17T03:58:25Z)
- Universal Value Density Estimation for Imitation Learning and Goal-Conditioned Reinforcement Learning [5.406386303264086]
In both imitation learning and goal-conditioned reinforcement learning, effective solutions require the agent to reliably reach a specified state.
This work introduces an approach which utilizes recent advances in density estimation to effectively learn to reach a given state.
As our first contribution, we use this approach for goal-conditioned reinforcement learning and show that it is both efficient and does not suffer from hindsight bias.
As our second contribution, we extend the approach to imitation learning and show that it achieves state-of-the-art demonstration sample-efficiency on standard benchmark tasks.
arXiv Detail & Related papers (2020-02-15T23:46:29Z)