Sample-Efficient Behavior Cloning Using General Domain Knowledge
- URL: http://arxiv.org/abs/2501.16546v1
- Date: Mon, 27 Jan 2025 22:40:11 GMT
- Title: Sample-Efficient Behavior Cloning Using General Domain Knowledge
- Authors: Feiyu Zhu, Jean Oh, Reid Simmons,
- Abstract summary: We use a large language model's coding capability to instantiate a policy structure based on expert domain knowledge expressed in natural language.
In experiments with lunar lander and car racing tasks, our approach learns to solve the tasks with as few as 5 demonstrations and is robust to action noise.
- Score: 11.81924485548362
- License:
- Abstract: Behavior cloning has shown success in many sequential decision-making tasks by learning from expert demonstrations, yet they can be very sample inefficient and fail to generalize to unseen scenarios. One approach to these problems is to introduce general domain knowledge, such that the policy can focus on the essential features and may generalize to unseen states by applying that knowledge. Although this knowledge is easy to acquire from the experts, it is hard to be combined with learning from individual examples due to the lack of semantic structure in neural networks and the time-consuming nature of feature engineering. To enable learning from both general knowledge and specific demonstration trajectories, we use a large language model's coding capability to instantiate a policy structure based on expert domain knowledge expressed in natural language and tune the parameters in the policy with demonstrations. We name this approach the Knowledge Informed Model (KIM) as the structure reflects the semantics of expert knowledge. In our experiments with lunar lander and car racing tasks, our approach learns to solve the tasks with as few as 5 demonstrations and is robust to action noise, outperforming the baseline model without domain knowledge. This indicates that with the help of large language models, we can incorporate domain knowledge into the structure of the policy, increasing sample efficiency for behavior cloning.
Related papers
- Knowledge Tagging with Large Language Model based Multi-Agent System [17.53518487546791]
This paper investigates the use of a multi-agent system to address the limitations of previous algorithms.
We highlight the significant potential of an LLM-based multi-agent system in overcoming the challenges that previous methods have encountered.
arXiv Detail & Related papers (2024-09-12T21:39:01Z) - Knowledge Tagging System on Math Questions via LLMs with Flexible Demonstration Retriever [48.5585921817745]
Large Language Models (LLMs) are used to automate the knowledge tagging task.
We show the strong performance of zero- and few-shot results over math questions knowledge tagging tasks.
By proposing a reinforcement learning-based demonstration retriever, we successfully exploit the great potential of different-sized LLMs.
arXiv Detail & Related papers (2024-06-19T23:30:01Z) - Large Language Models are Limited in Out-of-Context Knowledge Reasoning [65.72847298578071]
Large Language Models (LLMs) possess extensive knowledge and strong capabilities in performing in-context reasoning.
This paper focuses on a significant aspect of out-of-context reasoning: Out-of-Context Knowledge Reasoning (OCKR), which is to combine multiple knowledge to infer new knowledge.
arXiv Detail & Related papers (2024-06-11T15:58:59Z) - Knowledge Plugins: Enhancing Large Language Models for Domain-Specific
Recommendations [50.81844184210381]
We propose a general paradigm that augments large language models with DOmain-specific KnowledgE to enhance their performance on practical applications, namely DOKE.
This paradigm relies on a domain knowledge extractor, working in three steps: 1) preparing effective knowledge for the task; 2) selecting the knowledge for each specific sample; and 3) expressing the knowledge in an LLM-understandable way.
arXiv Detail & Related papers (2023-11-16T07:09:38Z) - Does Your Model Think Like an Engineer? Explainable AI for Bearing Fault
Detection with Deep Learning [20.488966890562004]
In this work, we focus on the specific task of detecting faults in rolling element bearings from vibration signals.
We propose a novel and domain-specific feature attribution framework that allows us to evaluate how well the underlying logic of a model corresponds with expert reasoning.
arXiv Detail & Related papers (2023-10-19T17:58:11Z) - Recognizing Unseen Objects via Multimodal Intensive Knowledge Graph
Propagation [68.13453771001522]
We propose a multimodal intensive ZSL framework that matches regions of images with corresponding semantic embeddings.
We conduct extensive experiments and evaluate our model on large-scale real-world data.
arXiv Detail & Related papers (2023-06-14T13:07:48Z) - Injecting Domain Knowledge in Language Models for Task-Oriented Dialogue
Systems [9.983102639594899]
Pre-trained language models (PLM) have advanced the state-of-the-art across NLP applications.
They lack domain-specific knowledge that does not naturally occur in pre-training data.
Previous studies augmented PLMs with symbolic knowledge for different downstream NLP tasks.
arXiv Detail & Related papers (2022-12-15T20:15:05Z) - Schema-aware Reference as Prompt Improves Data-Efficient Knowledge Graph
Construction [57.854498238624366]
We propose a retrieval-augmented approach, which retrieves schema-aware Reference As Prompt (RAP) for data-efficient knowledge graph construction.
RAP can dynamically leverage schema and knowledge inherited from human-annotated and weak-supervised data as a prompt for each sample.
arXiv Detail & Related papers (2022-10-19T16:40:28Z) - Knowledge-driven Data Construction for Zero-shot Evaluation in
Commonsense Question Answering [80.60605604261416]
We propose a novel neuro-symbolic framework for zero-shot question answering across commonsense tasks.
We vary the set of language models, training regimes, knowledge sources, and data generation strategies, and measure their impact across tasks.
We show that, while an individual knowledge graph is better suited for specific tasks, a global knowledge graph brings consistent gains across different tasks.
arXiv Detail & Related papers (2020-11-07T22:52:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.