Goal Recognition as Reinforcement Learning
- URL: http://arxiv.org/abs/2202.06356v1
- Date: Sun, 13 Feb 2022 16:16:43 GMT
- Title: Goal Recognition as Reinforcement Learning
- Authors: Leonardo Rosa Amado and Reuth Mirsky and Felipe Meneguzzi
- Abstract summary: We develop a framework that combines model-free reinforcement learning and goal recognition.
This framework consists of two main stages: Offline learning of policies or utility functions for each potential goal, and online inference.
The resulting instantiation achieves state-of-the-art performance against goal recognizers on standard evaluation domains and superior performance in noisy environments.
- Score: 20.651718821998106
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most approaches for goal recognition rely on specifications of the possible
dynamics of the actor in the environment when pursuing a goal. These
specifications suffer from two key issues. First, encoding these dynamics
requires careful design by a domain expert, which is often not robust to noise
at recognition time. Second, existing approaches often need costly real-time
computations to reason about the likelihood of each potential goal. In this
paper, we develop a framework that combines model-free reinforcement learning
and goal recognition to alleviate the need for careful, manual domain design,
and the need for costly online executions. This framework consists of two main
stages: Offline learning of policies or utility functions for each potential
goal, and online inference. We provide a first instance of this framework using
tabular Q-learning for the learning stage, as well as three measures that can
be used to perform the inference stage. The resulting instantiation achieves
state-of-the-art performance against goal recognizers on standard evaluation
domains and superior performance in noisy environments.
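To make the two-stage framework concrete, here is a minimal sketch in Python of the idea described in the abstract: offline tabular Q-learning of one Q-table per candidate goal, followed by online inference that ranks goals against the observed state-action pairs. The environment interface (`reset()`, `step(action, goal)`, `sample_action()`, `n_actions`) is an assumption for illustration, and the Boltzmann likelihood used for inference is one plausible measure, not necessarily the exact measures proposed in the paper.

```python
import numpy as np
from collections import defaultdict

def q_learning(env, goal, episodes=500, alpha=0.1, gamma=0.99, eps=0.1):
    """Offline stage: learn a tabular Q-function for a single candidate goal.
    The environment interface here is assumed, not taken from the paper."""
    Q = defaultdict(lambda: np.zeros(env.n_actions))
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # epsilon-greedy exploration
            a = (env.sample_action() if np.random.rand() < eps
                 else int(np.argmax(Q[s])))
            s_next, r, done = env.step(a, goal)  # reward defined w.r.t. this goal
            Q[s][a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s][a])
            s = s_next
    return Q

def infer_goal(observations, q_tables, beta=1.0):
    """Online stage: rank candidate goals by how well the observed
    (state, action) pairs match each goal's learned Q-function,
    using a Boltzmann likelihood (an illustrative choice of measure)."""
    scores = {}
    for goal, Q in q_tables.items():
        log_likelihood = 0.0
        for s, a in observations:
            prefs = beta * Q[s]
            prefs = prefs - np.max(prefs)  # numerical stability
            log_likelihood += prefs[a] - np.log(np.sum(np.exp(prefs)))
        scores[goal] = log_likelihood
    return max(scores, key=scores.get)

# Usage sketch: learn one Q-table per candidate goal offline, then infer online.
# q_tables = {g: q_learning(env, g) for g in candidate_goals}
# predicted_goal = infer_goal(observed_state_action_pairs, q_tables)
```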
Related papers
- Real-time goal recognition using approximations in Euclidean space [10.003540430416091]
We develop an efficient method for goal recognition that relies either on a single call to the planner for each possible goal in discrete domains or on a simplified motion model that reduces the computational burden in continuous ones.
The resulting approach performs the online component of recognition orders of magnitude faster than the current state of the art, making it the first online method effectively usable for robotics applications that require sub-second recognition.
arXiv Detail & Related papers (2023-07-15T19:27:38Z)
- Learning Goal-Conditioned Policies Offline with Self-Supervised Reward Shaping [94.89128390954572]
We propose a novel self-supervised learning phase on the pre-collected dataset to understand the structure and the dynamics of the model.
We evaluate our method on three continuous control tasks, and show that our model significantly outperforms existing approaches.
arXiv Detail & Related papers (2023-01-05T15:07:10Z)
- Discrete Factorial Representations as an Abstraction for Goal Conditioned Reinforcement Learning [99.38163119531745]
We show that applying a discretizing bottleneck can improve performance in goal-conditioned RL setups.
We experimentally show improved expected return on out-of-distribution goals, while still allowing for specifying goals with expressive structure.
arXiv Detail & Related papers (2022-11-01T03:31:43Z)
- Goal-Conditioned Q-Learning as Knowledge Distillation [136.79415677706612]
We explore a connection between off-policy reinforcement learning in goal-conditioned settings and knowledge distillation.
We empirically show that this can improve the performance of goal-conditioned off-policy reinforcement learning when the space of goals is high-dimensional.
We also show that this technique can be adapted to allow for efficient learning in the case of multiple simultaneous sparse goals.
arXiv Detail & Related papers (2022-08-28T22:01:10Z)
- Planning to Practice: Efficient Online Fine-Tuning by Composing Goals in Latent Space [76.46113138484947]
General-purpose robots require diverse repertoires of behaviors to complete challenging tasks in real-world unstructured environments.
To address this issue, goal-conditioned reinforcement learning aims to acquire policies that can reach goals for a wide range of tasks on command.
We propose Planning to Practice, a method that makes it practical to train goal-conditioned policies for long-horizon tasks.
arXiv Detail & Related papers (2022-05-17T06:58:17Z)
- C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks [133.40619754674066]
Goal-conditioned reinforcement learning can solve tasks in a wide range of domains, including navigation and manipulation.
We propose to solve distant goal-reaching tasks by using search at training time to automatically generate intermediate states.
The E-step corresponds to planning an optimal sequence of waypoints using graph search, while the M-step aims to learn a goal-conditioned policy to reach those waypoints.
arXiv Detail & Related papers (2021-10-22T22:05:31Z)
- Automatic Curriculum Learning through Value Disagreement [95.19299356298876]
Continually solving new, unsolved tasks is the key to learning diverse behaviors.
In the multi-task domain, where an agent needs to reach multiple goals, the choice of training goals can largely affect sample efficiency.
We propose setting up an automatic curriculum for goals that the agent needs to solve.
We evaluate our method across 13 multi-goal robotic tasks and 5 navigation tasks, and demonstrate performance gains over current state-of-the-art methods.
arXiv Detail & Related papers (2020-06-17T03:58:25Z)
- Universal Value Density Estimation for Imitation Learning and Goal-Conditioned Reinforcement Learning [5.406386303264086]
In both imitation learning and goal-conditioned reinforcement learning, effective solutions require the agent to reliably reach a specified state.
This work introduces an approach which utilizes recent advances in density estimation to effectively learn to reach a given state.
As our first contribution, we use this approach for goal-conditioned reinforcement learning and show that it is both efficient and does not suffer from hindsight bias.
As our second contribution, we extend the approach to imitation learning and show that it achieves state-of-the-art demonstration sample-efficiency on standard benchmark tasks.
arXiv Detail & Related papers (2020-02-15T23:46:29Z)