Related papers: Environment Agnostic Goal-Conditioning, A Study of Reward-Free Autonomous Learning

URL: http://arxiv.org/abs/2511.04598v1
Date: Thu, 06 Nov 2025 17:51:11 GMT
Title: Environment Agnostic Goal-Conditioning, A Study of Reward-Free Autonomous Learning
Authors: Hampus Åström, Elin Anna Topp, Jacek Malec
Abstract summary: We show that an agent can learn to solve tasks by selecting its own goals in an environment-agnostic way. Our method is independent of the underlying off-policy learning algorithm.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this paper we study how transforming regular reinforcement learning environments into goal-conditioned environments can let agents learn to solve tasks autonomously and reward-free. We show that an agent can learn to solve tasks by selecting its own goals in an environment-agnostic way, at training times comparable to externally guided reinforcement learning. Our method is independent of the underlying off-policy learning algorithm. Since our method is environment-agnostic, the agent does not value any goals higher than others, leading to instability in performance for individual goals. However, in our experiments, we show that the average goal success rate improves and stabilizes. An agent trained with this method can be instructed to seek any observations made in the environment, enabling generic training of agents prior to specific use cases.
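
To make the idea of turning a regular environment into a goal-conditioned one concrete, here is a minimal sketch. It is not the authors' implementation: `ChainEnv`, `GoalConditionedWrapper`, the uniform goal sampling, and the exact-match reward are all hypothetical choices made for illustration. The wrapper records every observation it sees, picks a goal from that set unless one is supplied, and rewards the agent only for reaching that goal, so no task reward from the underlying environment is used.

```python
import random


class ChainEnv:
    """Hypothetical toy environment: integer positions 0..size-1 on a line."""

    def __init__(self, size=5):
        self.size = size
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # action is -1 (left) or +1 (right); the position is clipped to the chain.
        self.pos = max(0, min(self.size - 1, self.pos + action))
        return self.pos


class GoalConditionedWrapper:
    """Illustrative wrapper: goals are observations the agent has already seen."""

    def __init__(self, env, seed=0):
        self.env = env
        self.rng = random.Random(seed)
        self.seen = []    # distinct observations encountered so far
        self.goal = None

    def _remember(self, obs):
        if obs not in self.seen:
            self.seen.append(obs)

    def reset(self, goal=None):
        obs = self.env.reset()
        self._remember(obs)
        # Assumption: self-selected goals are drawn uniformly from seen
        # observations, so no goal is valued above another.
        self.goal = goal if goal is not None else self.rng.choice(self.seen)
        return obs, self.goal

    def step(self, action):
        obs = self.env.step(action)
        self._remember(obs)
        reached = obs == self.goal
        # Goal-reaching signal only; the wrapped environment supplies no reward.
        return obs, self.goal, (1.0 if reached else 0.0), reached


wrapped = GoalConditionedWrapper(ChainEnv(size=5))
first_obs, first_goal = wrapped.reset()        # only the start state is known yet
obs, goal = wrapped.reset(goal=2)              # an externally instructed goal
step1 = wrapped.step(+1)
step2 = wrapped.step(+1)
```

In this sketch an off-policy learner would receive the `(observation, goal)` pair as its input; the same wrapper serves both self-selected goals during training and instructed goals afterwards.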

Related papers

Autonomous Continual Learning of Computer-Use Agents for Environment Adaptation [57.65688895630163]
We introduce ACuRL, an Autonomous Curriculum Reinforcement Learning framework that continually adapts agents to specific environments with zero human data. Our method effectively enables both intra-environment and cross-environment continual learning, yielding 4-22% performance gains without forgetting existing environments.
arXiv Detail & Related papers (2026-02-10T23:06:02Z)
Unsupervised Learning of Efficient Exploration: Pre-training Adaptive Policies via Self-Imposed Goals [0.0]
Unsupervised pre-training can equip reinforcement learning agents with prior knowledge and accelerate learning in downstream tasks. We present ULEE, an unsupervised meta-learning method that combines an in-context learner with an adversarial goal-generation strategy.
arXiv Detail & Related papers (2026-01-27T17:10:29Z)
Agentic Knowledgeable Self-awareness [79.25908923383776]
KnowSelf is a data-centric approach that applies agents with knowledgeable self-awareness like humans. Our experiments demonstrate that KnowSelf can outperform various strong baselines on different tasks and models with minimal use of external knowledge.
arXiv Detail & Related papers (2025-04-04T16:03:38Z)
No Regrets: Investigating and Improving Regret Approximations for Curriculum Discovery [53.08822154199948]
Unsupervised Environment Design (UED) methods have gained recent attention as their adaptive curricula promise to enable agents to be robust to in- and out-of-distribution tasks. This work investigates how existing UED methods select training environments, focusing on task prioritisation metrics. We develop a method that directly trains on scenarios with high learnability.
arXiv Detail & Related papers (2024-08-27T14:31:54Z)
Self-supervised network distillation: an effective approach to exploration in sparse reward environments [0.0]
Reinforcement learning can train an agent to behave in an environment according to a predesigned reward function. The solution to such a problem may be to equip the agent with an intrinsic motivation that will provide informed exploration. We present Self-supervised Network Distillation (SND), a class of intrinsic motivation algorithms based on the distillation error as a novelty indicator.
arXiv Detail & Related papers (2023-02-22T18:58:09Z)
Unsupervised Domain Adaptation with Dynamics-Aware Rewards in Reinforcement Learning [28.808933152885874]
Unconditioned reinforcement learning aims to acquire skills without prior goal representations. The intuitive approach of training in another interaction-rich environment disrupts the trained skills in the target environment. We propose an unsupervised domain adaptation method to identify and acquire skills across dynamics.
arXiv Detail & Related papers (2021-10-25T14:40:48Z)
Learning with AMIGo: Adversarially Motivated Intrinsic Goals [63.680207855344875]
AMIGo is a goal-generating teacher that proposes Adversarially Motivated Intrinsic Goals. We show that our method generates a natural curriculum of self-proposed goals which ultimately allows the agent to solve challenging procedurally-generated tasks.
arXiv Detail & Related papers (2020-06-22T10:22:08Z)
Automatic Curriculum Learning through Value Disagreement [95.19299356298876]
Continually solving new, unsolved tasks is the key to learning diverse behaviors. In the multi-task domain, where an agent needs to reach multiple goals, the choice of training goals can largely affect sample efficiency. We propose setting up an automatic curriculum for goals that the agent needs to solve. We evaluate our method across 13 multi-goal robotic tasks and 5 navigation tasks, and demonstrate performance gains over current state-of-the-art methods.
arXiv Detail & Related papers (2020-06-17T03:58:25Z)
Generating Automatic Curricula via Self-Supervised Active Domain Randomization [11.389072560141388]
We extend the self-play framework to jointly learn a goal and environment curriculum. Our method generates a coupled goal-task curriculum, where agents learn through progressively more difficult tasks and environment variations. Our results show that a curriculum of co-evolving the environment difficulty together with the difficulty of goals set in each environment provides practical benefits in the goal-directed tasks tested.
arXiv Detail & Related papers (2020-02-18T22:45:29Z)
Mutual Information-based State-Control for Intrinsically Motivated Reinforcement Learning [102.05692309417047]
In reinforcement learning, an agent learns to reach a set of goals by means of an external reward signal. In the natural world, intelligent organisms learn from internal drives, bypassing the need for external signals. We propose to formulate an intrinsic objective as the mutual information between the goal states and the controllable states.
arXiv Detail & Related papers (2020-02-05T19:21:20Z)
