Keep Rehearsing and Refining: Lifelong Learning Vehicle Routing under Continually Drifting Tasks
- URL: http://arxiv.org/abs/2601.22509v1
- Date: Fri, 30 Jan 2026 03:37:39 GMT
- Title: Keep Rehearsing and Refining: Lifelong Learning Vehicle Routing under Continually Drifting Tasks
- Authors: Jiyuan Pei, Yi Mei, Jialin Liu, Mengjie Zhang, Xin Yao
- Abstract summary: We study a novel lifelong learning paradigm for neural VRP solvers under continually drifting tasks over learning time steps. We propose Dual Replay with Experience Enhancement (DREE), a general framework to improve learning efficiency and mitigate catastrophic forgetting under such drift.
- Score: 8.939294630058729
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing neural solvers for vehicle routing problems (VRPs) are typically trained either in a one-off manner on a fixed set of pre-defined tasks or in a lifelong manner on several tasks arriving sequentially, assuming sufficient training on each task. Both settings overlook a common real-world property: problem patterns may drift continually over time, yielding massive tasks sequentially arising while offering only limited training resources per task. In this paper, we study a novel lifelong learning paradigm for neural VRP solvers under continually drifting tasks over learning time steps, where sufficient training for any given task at any time is not available. We propose Dual Replay with Experience Enhancement (DREE), a general framework to improve learning efficiency and mitigate catastrophic forgetting under such drift. Extensive experiments show that, under such continual drift, DREE effectively learns new tasks, preserves prior knowledge, improves generalization to unseen tasks, and can be applied to diverse existing neural solvers.
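The abstract describes training under continual drift by combining learning on each newly arrived task with replay of past experience. The sketch below is a minimal illustration of that general rehearsal pattern only: the class name, the two-buffer split, and the sampling scheme are assumptions for illustration, not the paper's actual DREE design.

```python
import random

class DualReplayTrainer:
    """Illustrative rehearsal loop for continually drifting tasks.

    Keeps a bounded buffer of past tasks and mixes a few of them into
    every training batch so the solver keeps rehearsing old patterns
    while refining on the newly arrived one. All names and structures
    here are hypothetical placeholders."""

    def __init__(self, capacity=100):
        self.task_buffer = []   # representative past tasks (bounded)
        self.capacity = capacity

    def observe_task(self, task):
        # evict a random past task once the buffer is full (reservoir-style)
        if len(self.task_buffer) >= self.capacity:
            self.task_buffer.pop(random.randrange(len(self.task_buffer)))
        self.task_buffer.append(task)

    def training_batch(self, current_task, k=4):
        # the current task always trains; up to k past tasks are rehearsed
        rehearsed = random.sample(self.task_buffer,
                                  min(k, len(self.task_buffer)))
        return [current_task] + rehearsed
```

Because each task gets only limited training before the next arrives, mixing rehearsed tasks into every batch (rather than revisiting them in dedicated phases) is one simple way to keep prior knowledge refreshed.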
Related papers
- Lifelong Learning with Behavior Consolidation for Vehicle Routing [8.939294630058729]
This paper explores a novel lifelong learning paradigm for neural VRP solvers. LLR-BC consolidates prior knowledge effectively by aligning the behaviors of the solver trained on a new task with the buffered ones. Experiments on capacitated vehicle routing problems and traveling salesman problems demonstrate LLR-BC's effectiveness in training high-performance neural solvers.
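Aligning a solver's current behavior with buffered past behavior is typically expressed as a divergence between action distributions. The function below is one plausible form of such an alignment term, a KL divergence from the buffered distribution to the new one; it is an illustrative sketch, not LLR-BC's exact consolidation loss.

```python
import math

def behavior_consolidation_loss(buffered_probs, new_probs):
    """KL(buffered || new) over action probabilities on a buffered state.

    Penalizes the updated solver for drifting away from the decisions it
    made before training on the new task. Illustrative only."""
    return sum(q * math.log(q / p)
               for q, p in zip(buffered_probs, new_probs)
               if q > 0)
```

When the new policy matches the buffered one exactly the loss is zero, so minimizing it alongside the new-task objective trades plasticity against forgetting.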
arXiv Detail & Related papers (2025-09-26T02:03:48Z) - Dense Dynamics-Aware Reward Synthesis: Integrating Prior Experience with Demonstrations [24.041217922654738]
Continuous control problems can be formulated as sparse-reward reinforcement learning (RL) tasks. Online RL methods can automatically explore the state space to solve each new task. However, discovering sequences of actions that lead to a non-zero reward becomes exponentially more difficult as the task horizon increases. We introduce a systematic reward-shaping framework that distills the information contained in 1) a task-agnostic prior data set and 2) a small number of task-specific expert demonstrations.
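A common way to densify a sparse reward with distilled prior knowledge is potential-based shaping, which is known to preserve optimal policies (Ng et al., 1999). The sketch below assumes a `potential` function standing in for a value estimate distilled from prior data and demonstrations; the signature is illustrative, not the paper's API.

```python
def shaped_reward(reward, state, next_state, potential, gamma=0.99):
    """Potential-based reward shaping: add the discounted change in a
    potential function to the environment reward. If `potential` encodes
    progress learned from prior data/demonstrations, sparse-reward tasks
    receive dense guidance without altering the optimal policy."""
    return reward + gamma * potential(next_state) - potential(state)
```

With a good potential, every transition that moves toward the goal yields a positive shaped signal, so the agent no longer depends on stumbling into the single terminal reward.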
arXiv Detail & Related papers (2024-12-02T04:37:12Z) - Continual Deep Reinforcement Learning with Task-Agnostic Policy Distillation [0.0]
The Task-Agnostic Policy Distillation (TAPD) framework is introduced.
This paper addresses the problem of continual learning.
By utilizing task-agnostic distilled knowledge, the agent can solve downstream tasks more efficiently.
arXiv Detail & Related papers (2024-11-25T16:18:39Z) - Continual Diffuser (CoD): Mastering Continual Offline Reinforcement Learning with Experience Rehearsal [54.93261535899478]
In real-world applications, such as robotic control with reinforcement learning, tasks change and new tasks arise in sequential order. This situation poses the new challenge of a plasticity-stability trade-off for training an agent that can adapt to task changes while retaining acquired knowledge. We propose a rehearsal-based continual diffusion model, called Continual Diffuser (CoD), to endow the diffuser with the capabilities of quick adaptation (plasticity) and lasting retention (stability).
arXiv Detail & Related papers (2024-09-04T08:21:47Z) - Replay-enhanced Continual Reinforcement Learning [37.34722105058351]
We introduce RECALL, a replay-enhanced method that greatly improves the plasticity of existing replay-based methods on new tasks.
Experiments on the Continual World benchmark show that RECALL performs significantly better than purely perfect memory replay.
arXiv Detail & Related papers (2023-11-20T06:21:52Z) - CLUTR: Curriculum Learning via Unsupervised Task Representation Learning [130.79246770546413]
CLUTR is a novel curriculum learning algorithm that decouples task representation and curriculum learning into a two-stage optimization.
We show CLUTR outperforms PAIRED, a principled and popular UED method, in terms of generalization and sample efficiency in the challenging CarRacing and navigation environments.
arXiv Detail & Related papers (2022-10-19T01:45:29Z) - Generalizing to New Tasks via One-Shot Compositional Subgoals [23.15624959305799]
The ability to generalize to previously unseen tasks with little to no supervision is a key challenge in modern machine learning research.
We introduce CASE which attempts to address these issues by training an Imitation Learning agent using adaptive "near future" subgoals.
Our experiments show that the proposed approach consistently outperforms the previous state-of-the-art compositional Imitation Learning approach by 30%.
arXiv Detail & Related papers (2022-05-16T14:30:11Z) - Relational Experience Replay: Continual Learning by Adaptively Tuning Task-wise Relationship [54.73817402934303]
We propose Relational Experience Replay (RER), a bi-level learning framework that adaptively tunes task-wise relationships to achieve a better stability-plasticity trade-off.
ERR can consistently improve the performance of all baselines and surpass current state-of-the-art methods.
arXiv Detail & Related papers (2021-12-31T12:05:22Z) - Reset-Free Reinforcement Learning via Multi-Task Learning: Learning Dexterous Manipulation Behaviors without Human Intervention [67.1936055742498]
We show that multi-task learning can effectively scale reset-free learning schemes to much more complex problems.
This work shows the ability to learn dexterous manipulation behaviors in the real world with RL without any human intervention.
arXiv Detail & Related papers (2021-04-22T17:38:27Z) - Continual Learning of Control Primitives: Skill Discovery via Reset-Games [128.36174682118488]
We show how a single method can allow an agent to acquire skills with minimal supervision.
We do this by exploiting the insight that the need to "reset" an agent to a broad set of initial states for a learning task provides a natural setting to learn a diverse set of "reset-skills".
arXiv Detail & Related papers (2020-11-10T18:07:44Z) - Planning to Explore via Self-Supervised World Models [120.31359262226758]
Plan2Explore is a self-supervised reinforcement learning agent.
We present a new approach to self-supervised exploration and fast adaptation to new tasks.
Without any training supervision or task-specific interaction, Plan2Explore outperforms prior self-supervised exploration methods.
arXiv Detail & Related papers (2020-05-12T17:59:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.