GRAIL: Goal Recognition Alignment through Imitation Learning
- URL: http://arxiv.org/abs/2602.14252v1
- Date: Sun, 15 Feb 2026 17:45:03 GMT
- Title: GRAIL: Goal Recognition Alignment through Imitation Learning
- Authors: Osher Elhadad, Felipe Meneguzzi, Reuth Mirsky
- Abstract summary: This paper introduces Goal Recognition Alignment through Imitation Learning (GRAIL). GRAIL learns one goal-directed policy for each candidate goal directly from (potentially suboptimal) demonstration trajectories. It increases the F1-score by more than 0.5 under systematically biased optimal behavior, achieves gains of approximately 0.1-0.3 under suboptimal behavior, and yields improvements of up to 0.4 under noisy optimal trajectories.
- Score: 10.284830265068795
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Understanding an agent's goals from its behavior is fundamental to aligning AI systems with human intentions. Existing goal recognition methods typically rely on an optimal goal-oriented policy representation, which may differ from the actor's true behavior and hinder accurate recognition of its goal. To address this gap, this paper introduces Goal Recognition Alignment through Imitation Learning (GRAIL), which leverages imitation learning and inverse reinforcement learning to learn one goal-directed policy for each candidate goal directly from (potentially suboptimal) demonstration trajectories. By scoring an observed partial trajectory with each learned goal-directed policy in a single forward pass, GRAIL retains the one-shot inference capability of classical goal recognition while leveraging learned policies that can capture suboptimal and systematically biased behavior. Across the evaluated domains, GRAIL increases the F1-score by more than 0.5 under systematically biased optimal behavior, achieves gains of approximately 0.1-0.3 under suboptimal behavior, and yields improvements of up to 0.4 under noisy optimal trajectories, while remaining competitive in fully optimal settings. This work contributes toward scalable and robust models for interpreting agent goals in uncertain environments.
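The scoring step the abstract describes lends itself to a compact sketch: given one learned policy per candidate goal, sum the log-probabilities each policy assigns to the observed state-action pairs and pick the best-scoring goal. The snippet below is a minimal illustration assuming discrete actions; `score_trajectory` and `recognize_goal` are hypothetical names, and the per-goal policies are taken as given (the paper learns them via imitation learning and inverse RL).

```python
import numpy as np

def score_trajectory(policy, trajectory):
    """Log-likelihood of an observed (state, action) trajectory under one
    goal-directed policy; `policy(state)` returns a probability vector
    over discrete actions."""
    total = 0.0
    for state, action in trajectory:
        probs = policy(state)
        total += np.log(probs[action] + 1e-12)  # guard against log(0)
    return total

def recognize_goal(policies, trajectory):
    """Score the partial trajectory under every candidate goal's policy
    and return the goal whose policy explains it best."""
    scores = {g: score_trajectory(pi, trajectory) for g, pi in policies.items()}
    return max(scores, key=scores.get), scores

# Toy usage: two candidate goals in a 1-D corridor, actions {0: left, 1: right}.
policies = {
    "goal_left":  lambda s: np.array([0.9, 0.1]),
    "goal_right": lambda s: np.array([0.1, 0.9]),
}
observed = [(0, 1), (1, 1), (2, 1)]               # the agent keeps moving right
best, scores = recognize_goal(policies, observed)  # -> "goal_right"
```

Because each policy is queried once per observed step, inference stays a single forward pass per goal, which is the one-shot property the abstract emphasizes.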
Related papers
- TEACH: Temporal Variance-Driven Curriculum for Reinforcement Learning [8.366600075241847]
We propose a novel Student-Teacher learning paradigm with a Temporal Variance-Driven Curriculum to accelerate Goal-Conditioned RL. In this framework, the teacher module dynamically prioritizes goals with the highest temporal variance in the policy's confidence score. We demonstrate this through evaluation across 11 diverse robotic manipulation and maze navigation tasks.
arXiv Detail & Related papers (2025-12-28T07:29:29Z)
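The TEACH entry above centers on one computable rule: sample training goals where the student's confidence has fluctuated most over recent history. Below is a hedged sketch of that rule, assuming the student reports a scalar confidence per goal; the paper's exact score and sampling scheme may differ.

```python
import numpy as np
from collections import defaultdict, deque

class VarianceTeacher:
    """Teacher that tracks a rolling window of the student's confidence
    per goal and samples goals with the highest temporal variance."""
    def __init__(self, window=10):
        self.history = defaultdict(lambda: deque(maxlen=window))

    def report(self, goal, confidence):
        self.history[goal].append(confidence)

    def sample_goal(self, goals, rng=None):
        rng = rng or np.random.default_rng()
        # Unseen or barely-seen goals get a high default priority so the
        # curriculum keeps exploring the goal space.
        var = np.array([np.var(self.history[g]) if len(self.history[g]) > 1
                        else 1.0 for g in goals])
        total = var.sum()
        probs = var / total if total > 0 else np.full(len(goals), 1 / len(goals))
        return goals[rng.choice(len(goals), p=probs)]
```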
- Optimizing Latent Goal by Learning from Trajectory Preference [18.262362315783268]
We propose a framework named Preference Goal Tuning (PGT). PGT allows an instruction-following policy to interact with the environment to collect several trajectories. We then use preference learning to fine-tune the initial latent goal representation with the categorized trajectories.
arXiv Detail & Related papers (2024-12-03T03:27:48Z)
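The PGT entry above describes fine-tuning a latent goal from trajectory preferences. Here is a minimal sketch, assuming a differentiable `score(z, trajectory)` compatibility function and pairs of (preferred, dispreferred) trajectories; the Bradley-Terry style loss is a common choice for preference learning, not necessarily the paper's.

```python
import torch

def preference_loss(score, z, pairs):
    """Bradley-Terry style objective: the preferred trajectory should score
    higher under the current goal latent z than the dispreferred one."""
    loss = 0.0
    for better, worse in pairs:
        margin = score(z, better) - score(z, worse)
        loss = loss - torch.nn.functional.logsigmoid(margin)
    return loss / len(pairs)

# Usage sketch: fine-tune an initial goal latent on categorized trajectories.
# z = torch.nn.Parameter(z_init.clone())
# opt = torch.optim.Adam([z], lr=1e-3)
# opt.zero_grad(); preference_loss(score_fn, z, pairs).backward(); opt.step()
```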
- Zero-Shot Offline Imitation Learning via Optimal Transport [21.548195072895517]
Zero-shot imitation learning algorithms reproduce unseen behavior from as little as a single demonstration at test time. Existing practical approaches view the expert demonstration as a sequence of goals, enabling imitation with a high-level goal selector and a low-level goal-conditioned policy. We introduce a novel method that mitigates this issue by directly optimizing the occupancy-matching objective that is intrinsic to imitation learning.
arXiv Detail & Related papers (2024-10-11T12:10:51Z)
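The occupancy-matching objective named in the entry above can be approximated with entropic optimal transport between sampled policy states and expert states. The sketch below is a generic Sinkhorn distance between two point clouds, not the paper's exact formulation.

```python
import numpy as np

def sinkhorn_distance(X, Y, eps=0.1, iters=100):
    """Entropy-regularized OT cost between point clouds X (n, d) and Y (m, d),
    each treated as a uniform empirical distribution."""
    C = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)  # pairwise costs
    K = np.exp(-C / eps)                                        # Gibbs kernel
    u = np.ones(len(X)) / len(X)
    v = np.ones(len(Y)) / len(Y)
    a, b = u.copy(), v.copy()
    for _ in range(iters):          # alternating marginal projections
        a = u / (K @ b)
        b = v / (K.T @ a)
    P = a[:, None] * K * b[None, :]  # approximate transport plan
    return float((P * C).sum())
```

Minimizing this cost over policy-generated states pushes the policy's state occupancy toward the expert's, which is the matching objective the entry refers to.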
- Towards Measuring Goal-Directedness in AI Systems [0.0]
A key prerequisite for AI systems pursuing unintended goals is that they behave in a coherent, goal-directed manner.
We propose a new family of definitions of the goal-directedness of a policy that analyze whether it is well-modeled as near-optimal for many reward functions.
Our contribution is a definition of goal-directedness that is simpler and more easily computable in order to approach the question of whether AI systems could pursue dangerous goals.
arXiv Detail & Related papers (2024-10-07T01:34:42Z)
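One way to make "near-optimal for many reward functions" concrete, per the entry above, is to sample random rewards in a small tabular MDP and check how often the policy is near-optimal. The scoring rule below is illustrative only, not the paper's definition.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, iters=200):
    """P: (S, A, S) transition tensor, R: (S,) state reward. Returns V*."""
    V = np.zeros(P.shape[0])
    for _ in range(iters):
        V = R + gamma * (P @ V).max(axis=1)
    return V

def policy_value(P, R, pi, gamma=0.9, iters=200):
    """Evaluate a deterministic policy pi: (S,) action index per state."""
    V = np.zeros(P.shape[0])
    for _ in range(iters):
        V = R + gamma * np.take_along_axis(P @ V, pi[:, None], axis=1)[:, 0]
    return V

def goal_directedness(P, pi, n_rewards=100, tol=0.95, rng=np.random.default_rng(0)):
    """Fraction of randomly sampled reward functions for which the policy
    achieves at least `tol` of the optimal value from the start state."""
    hits = 0
    for _ in range(n_rewards):
        R = rng.random(P.shape[0])
        v_star = value_iteration(P, R)[0]
        v_pi = policy_value(P, R, pi)[0]
        hits += (v_pi >= tol * v_star)
    return hits / n_rewards
```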
- Manifold-Aware Self-Training for Unsupervised Domain Adaptation on Regressing 6D Object Pose [69.14556386954325]
This paper bridges the domain gap between synthetic and real data in visual regression via global feature alignment and local refinement.
Our method incorporates an explicit self-supervised manifold regularization, revealing consistent cumulative target dependency across domains.
Unified implicit neural functions are learned to estimate the relative direction and distance of targets to their nearest class bins, refining target classification predictions.
arXiv Detail & Related papers (2023-05-18T08:42:41Z)
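The self-training loop underlying the entry above can be sketched generically: pseudo-label confident target-domain samples and refit. The paper's manifold-aware regularization and 6D-pose specifics are not reproduced here; `fit` and `predict_proba` are assumed helpers.

```python
import numpy as np

def self_train(model, fit, predict_proba, X_target, threshold=0.9, rounds=3):
    """Iteratively pseudo-label confident target samples and refit.
    `fit(model, X, y)` returns an updated model; `predict_proba(model, X)`
    returns an (n, n_classes) array of class probabilities."""
    for _ in range(rounds):
        probs = predict_proba(model, X_target)
        conf, labels = probs.max(axis=1), probs.argmax(axis=1)
        keep = conf >= threshold          # only trust confident predictions
        if not keep.any():
            break
        model = fit(model, X_target[keep], labels[keep])
    return model
```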
- Discrete Factorial Representations as an Abstraction for Goal Conditioned Reinforcement Learning [99.38163119531745]
We show that applying a discretizing bottleneck can improve performance in goal-conditioned RL setups.
We prove a lower bound on the expected return for out-of-distribution goals, while still allowing goals to be specified with expressive structure.
arXiv Detail & Related papers (2022-11-01T03:31:43Z)
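A standard way to realize the "discretizing bottleneck" from the entry above is vector quantization with a straight-through gradient. The sketch below shows that mechanism only; a full VQ setup would also need codebook and commitment losses, which are omitted here.

```python
import torch

class VQBottleneck(torch.nn.Module):
    """Snap continuous goal embeddings onto a discrete codebook."""
    def __init__(self, n_codes=64, dim=32):
        super().__init__()
        self.codebook = torch.nn.Parameter(torch.randn(n_codes, dim))

    def forward(self, z):
        # Nearest codebook entry for each goal embedding in the batch.
        d = torch.cdist(z, self.codebook)   # (batch, n_codes) distances
        idx = d.argmin(dim=1)
        q = self.codebook[idx]
        # Straight-through estimator: forward pass uses the quantized code,
        # the backward pass copies gradients to the continuous embedding.
        return z + (q - z).detach(), idx
```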
- Goal-Conditioned Q-Learning as Knowledge Distillation [136.79415677706612]
We explore a connection between off-policy reinforcement learning in goal-conditioned settings and knowledge distillation.
We empirically show that this can improve the performance of goal-conditioned off-policy reinforcement learning when the space of goals is high-dimensional.
We also show that this technique can be adapted to allow for efficient learning in the case of multiple simultaneous sparse goals.
arXiv Detail & Related papers (2022-08-28T22:01:10Z)
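The distillation view in the entry above can be read as follows: the bootstrapped target network plays the teacher, and the goal-conditioned Q-network regresses onto its outputs across relabeled goals. A hedged sketch, with assumed network signatures `q_net(states, goals) -> (batch, n_actions)`:

```python
import torch

def distillation_td_loss(q_net, target_net, batch, gamma=0.99):
    """One TD step framed as distillation: the frozen target network
    provides the 'teacher' values the student Q-network matches."""
    s, a, r, s2, g = batch  # states, actions, rewards, next states, goals
    with torch.no_grad():
        # Teacher signal: bootstrapped value for each (next state, goal).
        teacher = r + gamma * target_net(s2, g).max(dim=1).values
    student = q_net(s, g).gather(1, a.unsqueeze(1)).squeeze(1)
    return torch.nn.functional.mse_loss(student, teacher)
```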
- Generative multitask learning mitigates target-causing confounding [61.21582323566118]
We propose a simple and scalable approach to causal representation learning for multitask learning.
The improvement comes from mitigating unobserved confounders that cause the targets, but not the input.
Our results on the Attributes of People and Taskonomy datasets reflect the conceptual improvement in robustness to prior probability shift.
arXiv Detail & Related papers (2022-02-08T20:42:14Z)
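A toy contrast clarifies the generative route in the entry above: predicting through class-conditional likelihoods p(x|y) with an explicit prior p(y), rather than a discriminative p(y|x), lets the prior be swapped at deployment time, which is the mechanism behind robustness to prior probability shift. Gaussian class-conditionals below are an illustrative assumption.

```python
import numpy as np
from scipy.stats import multivariate_normal

def generative_predict(X, class_means, class_covs, prior=None):
    """argmax_y p(x|y) p(y). With a uniform (or deployment-time) prior the
    decision no longer bakes in the training label distribution."""
    n_classes = len(class_means)
    prior = np.ones(n_classes) / n_classes if prior is None else prior
    log_post = np.stack([
        multivariate_normal(class_means[k], class_covs[k]).logpdf(X)
        + np.log(prior[k])
        for k in range(n_classes)
    ], axis=-1)
    return log_post.argmax(axis=-1)
```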
- Adversarial Intrinsic Motivation for Reinforcement Learning [60.322878138199364]
We investigate whether the Wasserstein-1 distance between a policy's state visitation distribution and a target distribution can be utilized effectively for reinforcement learning tasks.
Our approach, termed Adversarial Intrinsic Motivation (AIM), estimates this Wasserstein-1 distance through its dual objective and uses it to compute a supplemental reward function.
arXiv Detail & Related papers (2021-05-27T17:51:34Z)
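The dual objective mentioned in the AIM entry above estimates W1 by training a (soft-)Lipschitz potential to separate the target distribution from the policy's state visitation, then rewarding movement up that potential. The gradient-penalty form and reward shape below are common choices, assumed rather than taken from the paper.

```python
import torch

def dual_w1_loss(f, policy_states, target_states, penalty_coef=10.0):
    """Kantorovich dual: maximize E_target[f] - E_policy[f] subject to f
    being (approximately) 1-Lipschitz. Assumes equally sized batches."""
    gap = f(target_states).mean() - f(policy_states).mean()
    # Soft Lipschitz constraint via a gradient penalty on interpolants.
    alpha = torch.rand(len(policy_states), 1)
    x = (alpha * policy_states + (1 - alpha) * target_states).requires_grad_(True)
    grad = torch.autograd.grad(f(x).sum(), x, create_graph=True)[0]
    penalty = ((grad.norm(dim=1) - 1.0).clamp(min=0) ** 2).mean()
    return -gap + penalty_coef * penalty   # minimize the negated dual

def supplemental_reward(f, s, s_next):
    """Moving uphill on the learned potential earns intrinsic reward."""
    with torch.no_grad():
        return f(s_next) - f(s)
```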
- Automatic Curriculum Learning through Value Disagreement [95.19299356298876]
Continually solving new, unsolved tasks is the key to learning diverse behaviors.
In the multi-task domain, where an agent needs to reach multiple goals, the choice of training goals can largely affect sample efficiency.
We propose setting up an automatic curriculum for goals that the agent needs to solve.
We evaluate our method across 13 multi-goal robotic tasks and 5 navigation tasks, and demonstrate performance gains over current state-of-the-art methods.
arXiv Detail & Related papers (2020-06-17T03:58:25Z)
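The value-disagreement curriculum in the entry above has a simple core: train an ensemble of value functions and propose goals where their predictions disagree most, a proxy for epistemic uncertainty at the frontier of competence. A minimal sketch with assumed value-function callables:

```python
import numpy as np

def sample_goal(ensemble, state, candidate_goals, rng=None):
    """`ensemble` is a list of value functions v(state, goal) -> float.
    Goals are sampled in proportion to the ensemble's disagreement."""
    rng = rng or np.random.default_rng()
    values = np.array([[v(state, g) for v in ensemble] for g in candidate_goals])
    disagreement = values.std(axis=1)      # per-goal ensemble spread
    total = disagreement.sum()
    if total > 0:
        probs = disagreement / total
    else:                                  # no disagreement: fall back to uniform
        probs = np.full(len(candidate_goals), 1 / len(candidate_goals))
    return candidate_goals[rng.choice(len(candidate_goals), p=probs)]
```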