Robust Imitation Learning for Automated Game Testing
- URL: http://arxiv.org/abs/2401.04572v1
- Date: Tue, 9 Jan 2024 14:18:25 GMT
- Title: Robust Imitation Learning for Automated Game Testing
- Authors: Pierluigi Vito Amadori, Timothy Bradley, Ryan Spick, Guy Moss
- Abstract summary: We propose EVOLUTE, a novel imitation learning-based architecture that combines behavioural cloning (BC) with energy-based models (EBMs).
EVOLUTE is a two-stream ensemble model that splits the action space of autonomous agents into continuous and discrete tasks.
We evaluate the performance of EVOLUTE in a shooting-and-driving game, where the agent is required to navigate and continuously identify targets to attack.
- Score: 1.6385815610837167
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Game development is a long process that involves many stages before a product
is ready for the market. Human play testing is among the most time-consuming
stages, as testers are required to repeatedly perform tasks while searching for
errors in the code. Automated testing is therefore seen as a key technology for
the gaming industry, as it would dramatically reduce development costs and
improve efficiency. Toward this end, we propose EVOLUTE, a novel imitation
learning-based architecture that combines behavioural cloning (BC) with
energy-based models (EBMs). EVOLUTE is a two-stream ensemble model that splits
the action space of autonomous agents into continuous and discrete tasks. The
EBM stream handles the continuous tasks, providing more refined and adaptive
control, while the BC stream handles the discrete actions, which eases
training. We evaluate the performance of EVOLUTE in a shooting-and-driving
game, where the agent is required to navigate and continuously identify targets
to attack. The proposed model has higher generalisation capabilities than
standard BC approaches, exhibiting a wider range of behaviours and higher
performance. EVOLUTE is also easier to train than a pure end-to-end EBM model,
since discrete actions can be quite sparse in the dataset and would force a
pure EBM to explore a much wider set of possible actions during training.
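
The split between the two streams can be sketched in a few lines. The snippet below is a minimal, hypothetical PyTorch sketch, not the authors' implementation: the class name, layer sizes, and action dimensions (TwoStreamPolicy, obs_dim, n_discrete, cont_dim) are assumptions made for illustration. It shows a BC head that classifies discrete actions directly and an energy head that scores candidate continuous actions, with inference keeping the lowest-energy candidate.

```python
# Minimal sketch (assumed names and sizes, not the paper's code): a two-stream policy
# with a behavioural-cloning classifier for discrete actions and an energy-based model
# scoring (observation, continuous action) pairs.
import torch
import torch.nn as nn

class TwoStreamPolicy(nn.Module):
    def __init__(self, obs_dim=64, n_discrete=4, cont_dim=2, hidden=128):
        super().__init__()
        self.cont_dim = cont_dim
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        # BC stream: plain classifier over discrete actions (e.g. shoot / no-op).
        self.bc_head = nn.Linear(hidden, n_discrete)
        # EBM stream: energy E(obs, a_cont) over continuous controls (e.g. steering, throttle).
        self.energy_head = nn.Sequential(
            nn.Linear(hidden + cont_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def energy(self, obs, a_cont):
        h = self.encoder(obs)
        return self.energy_head(torch.cat([h, a_cont], dim=-1)).squeeze(-1)

    @torch.no_grad()
    def act(self, obs, n_samples=256):
        # Discrete action: argmax over the BC logits.
        discrete = self.bc_head(self.encoder(obs)).argmax(dim=-1)
        # Continuous action: sample candidates in [-1, 1] and keep the lowest-energy one
        # (simple derivative-free inference, as commonly used with energy-based policies).
        batch = obs.shape[0]
        cand = torch.rand(batch, n_samples, self.cont_dim) * 2 - 1
        obs_rep = obs.unsqueeze(1).expand(-1, n_samples, -1).reshape(-1, obs.shape[-1])
        e = self.energy(obs_rep, cand.reshape(-1, self.cont_dim)).view(batch, n_samples)
        cont = cand[torch.arange(batch), e.argmin(dim=-1)]
        return discrete, cont
```

For training, a common choice (again an assumption, not stated in the abstract) would be a cross-entropy loss on the demonstrated discrete actions for the BC stream and an InfoNCE-style contrastive loss against sampled negative actions for the EBM stream.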
Related papers
- Self-evolved Imitation Learning in Simulated World [16.459715139048367]
Self-Evolved Imitation Learning (SEIL) is a framework that progressively improves a few-shot model through simulator interactions.
SEIL achieves a new state-of-the-art performance in few-shot imitation learning scenarios.
arXiv Detail & Related papers (2025-09-23T18:15:32Z)
- Self-Improving Embodied Foundation Models [21.81624145902717]
We propose a two-stage post-training approach for robotics.
The first stage, Supervised Fine-Tuning (SFT), fine-tunes pretrained foundation models using both: a) behavioral cloning, and b) steps-to-go prediction objectives.
In the second stage, Self-Improvement, steps-to-go prediction enables the extraction of a well-shaped reward function and a robust success detector.
arXiv Detail & Related papers (2025-09-18T17:00:08Z)
- VLA-RL: Towards Masterful and General Robotic Manipulation with Scalable Reinforcement Learning [14.099306230721245]
We present VLA-RL, an exploration-based framework that improves on online collected data at test time.
We fine-tune a pretrained vision-language model as a robotic process reward model, which is trained on pseudo reward labels annotated on automatically extracted task segments.
VLA-RL enables OpenVLA-7B to surpass the strongest finetuned baseline by 4.5% on 40 challenging robotic manipulation tasks in LIBERO.
arXiv Detail & Related papers (2025-05-24T14:42:51Z)
- Action Flow Matching for Continual Robot Learning [57.698553219660376]
Continual learning in robotics seeks systems that can constantly adapt to changing environments and tasks.
We introduce a generative framework leveraging flow matching for online robot dynamics model alignment.
We find that by transforming the actions themselves rather than exploring with a misaligned model, the robot collects informative data more efficiently.
arXiv Detail & Related papers (2025-04-25T16:26:15Z)
- Continual Diffuser (CoD): Mastering Continual Offline Reinforcement Learning with Experience Rehearsal [54.93261535899478]
In real-world applications such as reinforcement learning for robotic control, tasks change over time and new tasks arise in sequential order.
This situation poses a new challenge, the plasticity-stability trade-off: training an agent that can adapt to task changes while retaining acquired knowledge.
We propose a rehearsal-based continual diffusion model, called Continual Diffuser (CoD), to endow the diffuser with the capabilities of quick adaptation (plasticity) and lasting retention (stability).
arXiv Detail & Related papers (2024-09-04T08:21:47Z)
- SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation [62.58480650443393]
Segment Anything (SAM) is a vision-foundation model for generalizable scene understanding and sequence imitation.
We develop a novel multi-channel heatmap that enables the prediction of the action sequence in a single pass.
arXiv Detail & Related papers (2024-05-30T00:32:51Z)
- GeRM: A Generalist Robotic Model with Mixture-of-experts for Quadruped Robot [27.410618312830497]
In this paper, we propose GeRM (Generalist Robotic Model).
We utilize offline reinforcement learning to optimize data utilization strategies.
We employ a transformer-based VLA network to process multi-modal inputs and output actions.
arXiv Detail & Related papers (2024-03-20T07:36:43Z)
- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our insight is to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)
- PILOT: A Pre-Trained Model-Based Continual Learning Toolbox [71.63186089279218]
This paper introduces a pre-trained model-based continual learning toolbox known as PILOT.
On the one hand, PILOT implements some state-of-the-art class-incremental learning algorithms based on pre-trained models, such as L2P, DualPrompt, and CODA-Prompt.
On the other hand, PILOT fits typical class-incremental learning algorithms within the context of pre-trained models to evaluate their effectiveness.
arXiv Detail & Related papers (2023-09-13T17:55:11Z)
- PASTA: Pretrained Action-State Transformer Agents [10.654719072766495]
Self-supervised learning has brought about a revolutionary paradigm shift in various computing domains.
Recent approaches involve pre-training transformer models on vast amounts of unlabeled data.
In reinforcement learning, researchers have recently adapted these approaches, developing models pre-trained on expert trajectories.
arXiv Detail & Related papers (2023-07-20T15:09:06Z)
- EUCLID: Towards Efficient Unsupervised Reinforcement Learning with Multi-choice Dynamics Model [46.99510778097286]
Unsupervised reinforcement learning (URL) poses a promising paradigm to learn useful behaviors in a task-agnostic environment.
We introduce a novel model-fused paradigm to jointly pre-train the dynamics model and unsupervised exploration policy in the pre-training phase.
We show that EUCLID achieves state-of-the-art performance with high sample efficiency.
arXiv Detail & Related papers (2022-10-02T12:11:44Z)
- Effective Adaptation in Multi-Task Co-Training for Unified Autonomous Driving [103.745551954983]
In this paper, we investigate the transfer performance of various types of self-supervised methods, including MoCo and SimCLR, on three downstream tasks.
We find that their performance is sub-optimal or even lags far behind the single-task baseline.
We propose a simple yet effective pretrain-adapt-finetune paradigm for general multi-task training.
arXiv Detail & Related papers (2022-09-19T12:15:31Z)
- Multitask Adaptation by Retrospective Exploration with Learned World Models [77.34726150561087]
We propose a meta-learned addressing model called RAMa that provides the MBRL agent with training samples taken from task-agnostic storage.
The model is trained to maximize the expected agent's performance by selecting promising trajectories solving prior tasks from the storage.
arXiv Detail & Related papers (2021-10-25T20:02:57Z)
- Continual Model-Based Reinforcement Learning with Hypernetworks [24.86684067407964]
We propose a method that continually learns encountered dynamics in a sequence of tasks using task-conditional hypernetworks.
Our method has three main attributes: first, it includes dynamics learning sessions that do not revisit training data from previous tasks, so it only needs to store the most recent fixed-size portion of the state transition experience.
We show that HyperCRL is effective in continual model-based reinforcement learning in robot locomotion and manipulation scenarios.
arXiv Detail & Related papers (2020-09-25T01:46:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.