Learning from Observation: A Survey of Recent Advances
- URL: http://arxiv.org/abs/2509.19379v1
- Date: Sat, 20 Sep 2025 05:44:02 GMT
- Title: Learning from Observation: A Survey of Recent Advances
- Authors: Returaj Burnwal, Hriday Mehta, Nirav Pravinbhai Bhatt, Balaraman Ravindran,
- Abstract summary: Imitation Learning (IL) algorithms mimic an expert's behavior without requiring a reward function.<n>The concept of learning from observation (LfO) or state-only imitation learning (SOIL) has recently gained attention.<n>We present a framework for LfO and use it to survey and classify existing LfO methods in terms of their trajectory construction, assumptions and algorithm's design choices.
- Score: 5.056582586833467
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Imitation Learning (IL) algorithms offer an efficient way to train an agent by mimicking an expert's behavior without requiring a reward function. IL algorithms often necessitate access to state and action information from expert demonstrations. Although expert actions can provide detailed guidance, requiring such action information may prove impractical for real-world applications where expert actions are difficult to obtain. To address this limitation, the concept of learning from observation (LfO) or state-only imitation learning (SOIL) has recently gained attention, wherein the imitator only has access to expert state visitation information. In this paper, we present a framework for LfO and use it to survey and classify existing LfO methods in terms of their trajectory construction, assumptions and algorithm's design choices. This survey also draws connections between several related fields like offline RL, model-based RL and hierarchical RL. Finally, we use our framework to identify open problems and suggest future research directions.
Related papers
- Executable Knowledge Graphs for Replicating AI Research [65.41207324831583]
Executable Knowledge Graphs (xKG) is a modular and pluggable knowledge base that automatically integrates technical insights, code snippets, and domain-specific knowledge extracted from scientific literature.<n>Code will released at https://github.com/zjunlp/xKG.
arXiv Detail & Related papers (2025-10-20T17:53:23Z) - A Comprehensive Survey on Reinforcement Learning-based Agentic Search: Foundations, Roles, Optimizations, Evaluations, and Applications [41.610769640632334]
Retrieval-Augmented Generation (RAG) mitigates issues by grounding model outputs in external evidence.<n>Recent advances in agentic search address these limitations by enabling LLMs to plan, retrieve, and reflect through multi-step interaction with search environments.<n>Within this paradigm, reinforcement learning (RL) offers a powerful mechanism for adaptive and self-improving search behavior.
arXiv Detail & Related papers (2025-10-19T06:04:53Z) - R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning [87.30285670315334]
textbfR1-Searcher is a novel two-stage outcome-based RL approach designed to enhance the search capabilities of Large Language Models.<n>Our framework relies exclusively on RL, without requiring process rewards or distillation for a cold start.<n>Our experiments demonstrate that our method significantly outperforms previous strong RAG methods, even when compared to the closed-source GPT-4o-mini.
arXiv Detail & Related papers (2025-03-07T17:14:44Z) - LLM Inference Unveiled: Survey and Roofline Model Insights [62.92811060490876]
Large Language Model (LLM) inference is rapidly evolving, presenting a unique blend of opportunities and challenges.
Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also by introducing a framework based on roofline model.
This framework identifies the bottlenecks when deploying LLMs on hardware devices and provides a clear understanding of practical problems.
arXiv Detail & Related papers (2024-02-26T07:33:05Z) - Imitation Learning from Observation through Optimal Transport [25.398983671932154]
Imitation Learning from Observation (ILfO) is a setting in which a learner tries to imitate the behavior of an expert.
We show that existing methods can be simplified to generate a reward function without requiring learned models or adversarial learning.
We demonstrate the effectiveness of this simple approach on a variety of continuous control tasks and find that it surpasses the state of the art in the IlfO setting.
arXiv Detail & Related papers (2023-10-02T20:53:20Z) - Inapplicable Actions Learning for Knowledge Transfer in Reinforcement
Learning [3.194414753332705]
We show that learning inapplicable actions greatly improves the sample efficiency of RL algorithms.
Thanks to the transferability of the knowledge acquired, it can be reused in other tasks and domains to make the learning process more efficient.
arXiv Detail & Related papers (2022-11-28T17:45:39Z) - A Survey on Explainable Reinforcement Learning: Concepts, Algorithms, Challenges [51.699348215510575]
Reinforcement Learning (RL) is a popular machine learning paradigm where intelligent agents interact with the environment to fulfill a long-term goal.<n>Despite the encouraging results achieved, the deep neural network-based backbone is widely deemed as a black box that impedes practitioners to trust and employ trained agents in realistic scenarios where high security and reliability are essential.<n>To alleviate this issue, a large volume of literature devoted to shedding light on the inner workings of the intelligent agents has been proposed, by constructing intrinsic interpretability or post-hoc explainability.
arXiv Detail & Related papers (2022-11-12T13:52:06Z) - Redefining Counterfactual Explanations for Reinforcement Learning:
Overview, Challenges and Opportunities [2.0341936392563063]
Most explanation methods for AI are focused on developers and expert users.
Counterfactual explanations offer users advice on what can be changed in the input for the output of the black-box model to change.
Counterfactuals are user-friendly and provide actionable advice for achieving the desired output from the AI system.
arXiv Detail & Related papers (2022-10-21T09:50:53Z) - Explainability in Deep Reinforcement Learning [68.8204255655161]
We review recent works in the direction to attain Explainable Reinforcement Learning (XRL)
In critical situations where it is essential to justify and explain the agent's behaviour, better explainability and interpretability of RL models could help gain scientific insight on the inner workings of what is still considered a black box.
arXiv Detail & Related papers (2020-08-15T10:11:42Z) - Explore, Discover and Learn: Unsupervised Discovery of State-Covering
Skills [155.11646755470582]
'Explore, Discover and Learn' (EDL) is an alternative approach to information-theoretic skill discovery.
We show that EDL offers significant advantages, such as overcoming the coverage problem, reducing the dependence of learned skills on the initial state, and allowing the user to define a prior over which behaviors should be learned.
arXiv Detail & Related papers (2020-02-10T10:49:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.