Exploration-Driven Generative Interactive Environments
- URL: http://arxiv.org/abs/2504.02515v1
- Date: Thu, 03 Apr 2025 12:01:41 GMT
- Title: Exploration-Driven Generative Interactive Environments
- Authors: Nedko Savov, Naser Kazemi, Mohammad Mahdi, Danda Pani Paudel, Xi Wang, Luc Van Gool
- Abstract summary: We focus on using many virtual environments for inexpensive, automatically collected interaction data. We propose a training framework that uses only a random agent in virtual environments. Our agent is fully independent of environment-specific rewards and thus adapts easily to new environments.
- Score: 53.05314852577144
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern world models require costly and time-consuming collection of large video datasets with action demonstrations by people or by environment-specific agents. To simplify training, we focus on using many virtual environments for inexpensive, automatically collected interaction data. Genie, a recent multi-environment world model, demonstrates the ability to simulate many environments with shared behavior. Unfortunately, training its model requires expensive demonstrations. Therefore, we propose a training framework that uses only a random agent in virtual environments. While the model trained in this manner exhibits good controls, it is limited by the possibilities of random exploration. To address this limitation, we propose AutoExplore Agent - an exploration agent that relies entirely on the uncertainty of the world model, delivering diverse data from which the model can learn best. Our agent is fully independent of environment-specific rewards and thus adapts easily to new environments. With this approach, the pretrained multi-environment model can quickly adapt to new environments, improving video fidelity and controllability. To automatically obtain large-scale interaction datasets for pretraining, we group environments with similar behavior and controls. To this end, we annotate the behavior and controls of 974 virtual environments - a dataset that we name RetroAct. To build our model, we first create an open implementation of Genie, GenieRedux, and apply enhancements and adaptations in our version, GenieRedux-G. Our code and data are available at https://github.com/insait-institute/GenieRedux.
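The abstract describes an exploration agent driven purely by world-model uncertainty, with no environment-specific rewards. A common way to realize this idea is ensemble disagreement: score each candidate action by how much a set of predictive models disagree about the next observation, and pick the most uncertain one. The sketch below is a minimal, hypothetical illustration of that principle - the linear "models", dimensions, and function names are assumptions for demonstration, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def ensemble_predict(models, obs, action):
    # Each "model" here is a hypothetical stand-in: a weight matrix mapping
    # concatenated (obs, action) features to a predicted next-observation vector.
    x = np.concatenate([obs, action])
    return np.stack([w @ x for w in models])  # shape: (n_models, obs_dim)

def select_action(models, obs, candidate_actions):
    # Score each candidate by ensemble disagreement (variance across model
    # predictions) and return the most uncertain action -- no reward needed.
    scores = []
    for a in candidate_actions:
        preds = ensemble_predict(models, obs, a)
        scores.append(preds.var(axis=0).mean())
    return candidate_actions[int(np.argmax(scores))]

# Toy setup: 4 stand-in "world models", an 8-dim observation, 3 one-hot actions.
obs_dim, act_dim, n_models = 8, 3, 4
models = [rng.normal(size=(obs_dim, obs_dim + act_dim)) for _ in range(n_models)]
obs = rng.normal(size=obs_dim)
actions = [np.eye(act_dim)[i] for i in range(act_dim)]

best = select_action(models, obs, actions)
```

Collecting trajectories with such an agent steers data toward states the world model predicts poorly, which is the diversity property the abstract attributes to AutoExplore Agent.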
Related papers
- Gen-C: Populating Virtual Worlds with Generative Crowds [1.5293427903448022]
We introduce Gen-C, a generative model to automate the task of authoring high-level crowd behaviors. Gen-C bypasses the labor-intensive and challenging task of collecting and annotating real crowd video data. We demonstrate the effectiveness of our approach in two scenarios, a University Campus and a Train Station.
arXiv Detail & Related papers (2025-04-02T17:33:53Z) - Inter-environmental world modeling for continuous and compositional dynamics [7.01176359680407]
We introduce Lie Action, an unsupervised framework that learns continuous latent action representations to simulate across environments.
We demonstrate that WLA can be trained using only video frames and, with minimal or no action labels, can quickly adapt to new environments with novel action sets.
arXiv Detail & Related papers (2025-03-13T00:02:54Z) - Learning Generative Interactive Environments By Trained Agent Exploration [41.94295877935867]
We propose to improve the model by employing reinforcement learning based agents for data generation.
This approach produces diverse datasets that enhance the model's ability to adapt and perform well.
Our evaluation, including a replication of the Coinrun case study, shows that GenieRedux-G achieves superior visual fidelity and controllability.
arXiv Detail & Related papers (2024-09-10T12:00:40Z) - Robot Utility Models: General Policies for Zero-Shot Deployment in New Environments [26.66666135624716]
We present Robot Utility Models (RUMs), a framework for training and deploying zero-shot robot policies.
RUMs can generalize to new environments without any finetuning.
We train five utility models for opening cabinet doors, opening drawers, picking up napkins, picking up paper bags, and reorienting fallen objects.
arXiv Detail & Related papers (2024-09-09T17:59:50Z) - Learning Interactive Real-World Simulators [96.5991333400566]
We explore the possibility of learning a universal simulator of real-world interaction through generative modeling.
We use the simulator to train both high-level vision-language policies and low-level reinforcement learning policies.
Video captioning models can benefit from training with simulated experience, opening up even wider applications.
arXiv Detail & Related papers (2023-10-09T19:42:22Z) - Reward-Free Curricula for Training Robust World Models [37.13175950264479]
Learning world models from reward-free exploration is a promising approach, and enables policies to be trained using imagined experience for new tasks.
We address the novel problem of generating curricula in the reward-free setting to train robust world models.
We show that minimax regret can be connected to minimising the maximum error in the world model across environment instances.
This result informs our algorithm, WAKER: Weighted Acquisition of Knowledge across Environments for Robustness.
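The connection stated above can be sketched in standard minimax-regret notation. The symbols below (policy \(\pi\), environment instance \(\theta\), model error \(\epsilon_\theta\)) are assumptions chosen for illustration, not notation taken from the paper:

```latex
% Regret of policy \pi on environment instance \theta:
% the gap between the optimal return and the achieved return.
\mathrm{Regret}(\pi,\theta) = V^{*}_{\theta} - V^{\pi}_{\theta}

% Minimax-regret curriculum objective over environment instances:
\min_{\pi} \max_{\theta} \, \mathrm{Regret}(\pi,\theta)

% The summarised result links this to world-model error: the maximum
% regret is controlled by the maximum model error \epsilon_{\theta}
% across instances (f an unspecified monotone function), motivating
% sampling the environments where \epsilon_{\theta} is largest.
\max_{\theta} \mathrm{Regret}(\pi,\theta) \lesssim f\!\big(\max_{\theta} \epsilon_{\theta}\big)
```

Under this reading, WAKER's weighted acquisition amounts to preferentially gathering experience from the environments with the largest world-model error.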
arXiv Detail & Related papers (2023-06-15T15:40:04Z) - Real-World Humanoid Locomotion with Reinforcement Learning [92.85934954371099]
We present a fully learning-based approach for real-world humanoid locomotion.
Our controller can walk over various outdoor terrains, is robust to external disturbances, and can adapt in context.
arXiv Detail & Related papers (2023-03-06T18:59:09Z) - Towards Optimal Strategies for Training Self-Driving Perception Models in Simulation [98.51313127382937]
We focus on the use of labels in the synthetic domain alone.
Our approach introduces both a way to learn neural-invariant representations and a theoretically inspired view on how to sample the data from the simulator.
We showcase our approach on the bird's-eye-view vehicle segmentation task with multi-sensor data.
arXiv Detail & Related papers (2021-11-15T18:37:43Z) - DriveGAN: Towards a Controllable High-Quality Neural Simulation [147.6822288981004]
We introduce a novel high-quality neural simulator referred to as DriveGAN.
DriveGAN achieves controllability by disentangling different components without supervision.
We train DriveGAN on multiple datasets, including 160 hours of real-world driving data.
arXiv Detail & Related papers (2021-04-30T15:30:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.