Continuous Transition: Improving Sample Efficiency for Continuous
Control Problems via MixUp
- URL: http://arxiv.org/abs/2011.14487v2
- Date: Sun, 7 Mar 2021 04:59:11 GMT
- Title: Continuous Transition: Improving Sample Efficiency for Continuous
Control Problems via MixUp
- Authors: Junfan Lin, Zhongzhan Huang, Keze Wang, Xiaodan Liang, Weiwei Chen,
and Liang Lin
- Abstract summary: This paper introduces a concise yet powerful method to construct Continuous Transition.
Specifically, we propose to synthesize new transitions for training by linearly interpolating the consecutive transitions.
To keep the constructed transitions authentic, we also develop a discriminator to guide the construction process automatically.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although deep reinforcement learning (RL) has been successfully applied to a
variety of robotic control tasks, it remains challenging to apply to
real-world tasks due to poor sample efficiency. Attempting to overcome
this shortcoming, several works focus on reusing the collected trajectory data
during training by decomposing it into a set of policy-irrelevant
discrete transitions. However, their improvements are somewhat marginal since
i) the number of such transitions is usually small, and ii) the value assignment
only happens at the joint states. To address these issues, this paper
introduces a concise yet powerful method to construct Continuous Transition,
which makes better use of trajectory information by exploiting the potential
transitions along the trajectory. Specifically, we propose to synthesize new
transitions for training by linearly interpolating consecutive transitions.
To keep the constructed transitions authentic, we also develop a discriminator
that guides the construction process automatically. Extensive experiments
demonstrate that our proposed method achieves a significant improvement in
sample efficiency on various complex continuous robotic control problems in
MuJoCo and outperforms advanced model-based and model-free RL methods. The
source code is available.
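The interpolation at the heart of the method can be sketched in a few lines. This is a minimal illustration, not the authors' released code: the toy transitions, the helper names, and the Beta-distributed mixing weight (a convention borrowed from standard MixUp) are all assumptions, and the discriminator that keeps mixed transitions authentic is omitted.

```python
import random

def mix(x, y, lam):
    """Convex combination of two equal-length vectors."""
    return [lam * xi + (1.0 - lam) * yi for xi, yi in zip(x, y)]

def continuous_transition(t0, t1, lam):
    """Synthesize a new transition by linearly interpolating two
    consecutive transitions, each of the form (state, action, reward, next_state)."""
    s0, a0, r0, ns0 = t0
    s1, a1, r1, ns1 = t1
    return (mix(s0, s1, lam),
            mix(a0, a1, lam),
            lam * r0 + (1.0 - lam) * r1,
            mix(ns0, ns1, lam))

# Two consecutive transitions from one trajectory (toy 2-D state, 1-D action).
t0 = ([0.0, 0.0], [1.0], 1.0, [1.0, 0.0])
t1 = ([1.0, 0.0], [0.5], 2.0, [2.0, 0.0])

# Sample a mixing weight and build a synthetic transition for the replay buffer.
lam = random.betavariate(0.5, 0.5)
synthetic = continuous_transition(t0, t1, lam)
```

Because the two inputs are consecutive, every convex combination lies "between" them on the trajectory, which is what lets the synthetic sample densify the replay buffer without leaving the neighborhood of observed behavior.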
Related papers
- Offline Reinforcement Learning from Datasets with Structured Non-Stationarity [50.35634234137108]
Current Reinforcement Learning (RL) is often limited by the large amount of data needed to learn a successful policy.
We address a novel Offline RL problem setting in which, while collecting the dataset, the transition and reward functions gradually change between episodes but stay constant within each episode.
We propose a method based on Contrastive Predictive Coding that identifies this non-stationarity in the offline dataset, accounts for it when training a policy, and predicts it during evaluation.
arXiv Detail & Related papers (2024-05-23T02:41:36Z)
- Beyond Anti-Forgetting: Multimodal Continual Instruction Tuning with Positive Forward Transfer [21.57847333976567]
Multimodal Continual Instruction Tuning (MCIT) enables Multimodal Large Language Models (MLLMs) to meet continuously emerging requirements without expensive retraining.
MCIT faces two major obstacles: catastrophic forgetting, where previously learned knowledge is lost, and negative forward transfer.
We propose Prompt Tuning with Positive Forward Transfer (Fwd-Prompt) to address these issues.
arXiv Detail & Related papers (2024-01-17T12:44:17Z)
- Solving Continual Offline Reinforcement Learning with Decision Transformer [78.59473797783673]
Continual offline reinforcement learning (CORL) combines continual learning with offline reinforcement learning.
Existing methods, employing Actor-Critic structures and experience replay (ER), suffer from distribution shifts, low efficiency, and weak knowledge-sharing.
We introduce multi-head DT (MH-DT) and low-rank adaptation DT (LoRA-DT) to mitigate DT's forgetting problem.
arXiv Detail & Related papers (2024-01-16T16:28:32Z)
- Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory optimization [87.21285093582446]
Diffusion Generative Flow Samplers (DGFS) is a sampling-based framework where the learning process can be tractably broken down into short partial trajectory segments.
Our method takes inspiration from the theory developed for generative flow networks (GFlowNets).
arXiv Detail & Related papers (2023-10-04T09:39:05Z)
- Contrastive Example-Based Control [163.6482792040079]
We propose a method for offline, example-based control that learns an implicit model of multi-step transitions, rather than a reward function.
Across a range of state-based and image-based offline control tasks, our method outperforms baselines that use learned reward functions.
arXiv Detail & Related papers (2023-07-24T19:43:22Z)
- Real-time Controllable Motion Transition for Characters [14.88407656218885]
Real-time in-between motion generation is universally required in games and highly desirable in existing animation pipelines.
Our approach consists of two key components: motion manifold and conditional transitioning.
We show that our method is able to generate high-quality motions measured under multiple metrics.
arXiv Detail & Related papers (2022-05-05T10:02:54Z)
- Transition Motion Tensor: A Data-Driven Approach for Versatile and Controllable Agents in Physically Simulated Environments [6.8438089867929905]
This paper proposes a data-driven framework that creates novel and physically accurate transitions outside of the motion dataset.
It enables simulated characters to adopt new motion skills efficiently and robustly without modifying existing ones.
arXiv Detail & Related papers (2021-11-30T02:17:25Z)
- Adversarial Imitation Learning with Trajectorial Augmentation and Correction [61.924411952657756]
We introduce a novel augmentation method which preserves the success of the augmented trajectories.
We develop an adversarial data augmented imitation architecture to train an imitation agent using synthetic experts.
Experiments show that our data augmentation strategy can improve accuracy and convergence time of adversarial imitation.
arXiv Detail & Related papers (2021-03-25T14:49:32Z)
- Data-efficient Weakly-supervised Learning for On-line Object Detection under Domain Shift in Robotics [24.878465999976594]
Several object detection methods have been proposed in the literature, the vast majority based on Deep Convolutional Neural Networks (DCNNs).
These methods have important limitations for robotics: learning solely from off-line data may introduce biases and prevent adaptation to novel tasks.
In this work, we investigate how weakly-supervised learning can cope with these problems.
arXiv Detail & Related papers (2020-12-28T16:36:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the generated content (including all information) and is not responsible for any consequences.