Meta-World Conditional Neural Processes
- URL: http://arxiv.org/abs/2302.10320v1
- Date: Mon, 20 Feb 2023 21:18:02 GMT
- Title: Meta-World Conditional Neural Processes
- Authors: Suzan Ece Ada, Emre Ugur
- Abstract summary: We propose Meta-World Conditional Neural Processes (MW-CNP) to enable an agent to sample from its own "hallucination"
MW-CNP is trained on offline interaction data logged during meta-training.
- Score: 2.627046865670577
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose Meta-World Conditional Neural Processes (MW-CNP), a conditional
world model generator that leverages sample efficiency and scalability of
Conditional Neural Processes to enable an agent to sample from its own
"hallucination". We intend to reduce the agent's interaction with the target
environment at test time as much as possible. To reduce the number of samples
required at test time, we first obtain a latent representation of the
transition dynamics from a single rollout from the test environment with hidden
parameters. Then, we obtain rollouts for few-shot learning by interacting with
the "hallucination" generated by the meta-world model. Using the world model
representation from MW-CNP, the meta-RL agent can adapt to an unseen target
environment with significantly fewer samples collected from the target
environment compared to the baselines. We emphasize that the agent does not
have access to the task parameters throughout training and testing, and MW-CNP
is trained on offline interaction data logged during meta-training.
Related papers
- Unsupervised Meta-Testing with Conditional Neural Processes for Hybrid Meta-Reinforcement Learning [1.9336815376402723]
Unsupervised Meta-Testing with Conditional Neural Processes (UMCNP) is a novel hybrid few-shot meta-reinforcement learning (meta-RL) method.<n>We demonstrate our method can adapt to an unseen test task using significantly fewer samples during meta-testing than the baselines in 2D-Point Agent and continuous control meta-RL benchmarks.
arXiv Detail & Related papers (2025-06-04T19:27:47Z) - Multi-Agent Sampling: Scaling Inference Compute for Data Synthesis with Tree Search-Based Agentic Collaboration [81.45763823762682]
This work aims to bridge the gap by investigating the problem of data synthesis through multi-agent sampling.
We introduce Tree Search-based Orchestrated Agents(TOA), where the workflow evolves iteratively during the sequential sampling process.
Our experiments on alignment, machine translation, and mathematical reasoning demonstrate that multi-agent sampling significantly outperforms single-agent sampling as inference compute scales.
arXiv Detail & Related papers (2024-12-22T15:16:44Z) - EAS-SNN: End-to-End Adaptive Sampling and Representation for Event-based Detection with Recurrent Spiking Neural Networks [14.046487518350792]
Spiking Neural Networks (SNNs) operate on an event-driven through sparse spike communication.
We introduce Residual Potential Dropout (RPD) and Spike-Aware Training (SAT) to regulate potential distribution.
Our method yields a 4.4% mAP improvement on the Gen1 dataset, while requiring 38% fewer parameters and only three time steps.
arXiv Detail & Related papers (2024-03-19T09:34:11Z) - Synthetic location trajectory generation using categorical diffusion
models [50.809683239937584]
Diffusion models (DPMs) have rapidly evolved to be one of the predominant generative models for the simulation of synthetic data.
We propose using DPMs for the generation of synthetic individual location trajectories (ILTs) which are sequences of variables representing physical locations visited by individuals.
arXiv Detail & Related papers (2024-02-19T15:57:39Z) - Self-Supervised Dataset Distillation for Transfer Learning [77.4714995131992]
We propose a novel problem of distilling an unlabeled dataset into a set of small synthetic samples for efficient self-supervised learning (SSL)
We first prove that a gradient of synthetic samples with respect to a SSL objective in naive bilevel optimization is textitbiased due to randomness originating from data augmentations or masking.
We empirically validate the effectiveness of our method on various applications involving transfer learning.
arXiv Detail & Related papers (2023-10-10T10:48:52Z) - Continual Test-time Domain Adaptation via Dynamic Sample Selection [38.82346845855512]
This paper proposes a Dynamic Sample Selection (DSS) method for Continual Test-time Domain Adaptation (CTDA)
We apply joint positive and negative learning on both high- and low-quality samples to reduce the risk of using wrong information.
Our approach is also evaluated in the 3D point cloud domain, showcasing its versatility and potential for broader applicability.
arXiv Detail & Related papers (2023-10-05T06:35:21Z) - Leveraging World Model Disentanglement in Value-Based Multi-Agent
Reinforcement Learning [18.651307543537655]
We propose a novel model-based multi-agent reinforcement learning approach named Value Decomposition Framework with Disentangled World Model.
We present experimental results in Easy, Hard, and Super-Hard StarCraft II micro-management challenges to demonstrate that our method achieves high sample efficiency and exhibits superior performance in defeating the enemy armies compared to other baselines.
arXiv Detail & Related papers (2023-09-08T22:12:43Z) - Neural-Sim: Learning to Generate Training Data with NeRF [31.81496344354997]
We present the first fully differentiable synthetic data pipeline that uses Neural Radiance Fields (NeRFs) in a closed-loop with a target application's loss function.
Our approach generates data on-demand, with no human labor, to maximize accuracy for a target task.
arXiv Detail & Related papers (2022-07-22T22:48:33Z) - Multitask Adaptation by Retrospective Exploration with Learned World
Models [77.34726150561087]
We propose a meta-learned addressing model called RAMa that provides training samples for the MBRL agent taken from task-agnostic storage.
The model is trained to maximize the expected agent's performance by selecting promising trajectories solving prior tasks from the storage.
arXiv Detail & Related papers (2021-10-25T20:02:57Z) - Energy-Efficient and Federated Meta-Learning via Projected Stochastic
Gradient Ascent [79.58680275615752]
We propose an energy-efficient federated meta-learning framework.
We assume each task is owned by a separate agent, so a limited number of tasks is used to train a meta-model.
arXiv Detail & Related papers (2021-05-31T08:15:44Z) - Robust Finite Mixture Regression for Heterogeneous Targets [70.19798470463378]
We propose an FMR model that finds sample clusters and jointly models multiple incomplete mixed-type targets simultaneously.
We provide non-asymptotic oracle performance bounds for our model under a high-dimensional learning framework.
The results show that our model can achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-10-12T03:27:07Z) - A Provably Efficient Sample Collection Strategy for Reinforcement
Learning [123.69175280309226]
One of the challenges in online reinforcement learning (RL) is that the agent needs to trade off the exploration of the environment and the exploitation of the samples to optimize its behavior.
We propose to tackle the exploration-exploitation problem following a decoupled approach composed of: 1) An "objective-specific" algorithm that prescribes how many samples to collect at which states, as if it has access to a generative model (i.e., sparse simulator of the environment); 2) An "objective-agnostic" sample collection responsible for generating the prescribed samples as fast as possible.
arXiv Detail & Related papers (2020-07-13T15:17:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.