Related papers: Lifelong Learning with Behavior Consolidation for Vehicle Routing

Lifelong Learning with Behavior Consolidation for Vehicle Routing

URL: http://arxiv.org/abs/2509.21765v2
Date: Mon, 29 Sep 2025 03:24:05 GMT
Title: Lifelong Learning with Behavior Consolidation for Vehicle Routing
Authors: Jiyuan Pei, Yi Mei, Jialin Liu, Mengjie Zhang, Xin Yao,
Abstract summary: This paper explores a novel lifelong learning paradigm for neural VRP solvers.<n>LLR-BC consolidates prior knowledge effectively by aligning behaviors of the solver trained on a new task with the buffered ones.<n>Experiments on capacitated vehicle routing problems and traveling salesman problems demonstrate LLR-BC's effectiveness in training high-performance neural solvers.
Score: 8.939294630058729
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent neural solvers have demonstrated promising performance in learning to solve routing problems. However, existing studies are primarily based on one-off training on one or a set of predefined problem distributions and scales, i.e., tasks. When a new task arises, they typically rely on either zero-shot generalization, which may be poor due to the discrepancies between the new task and the training task(s), or fine-tuning the pretrained solver on the new task, which possibly leads to catastrophic forgetting of knowledge acquired from previous tasks. This paper explores a novel lifelong learning paradigm for neural VRP solvers, where multiple tasks with diverse distributions and scales arise sequentially over time. Solvers are required to effectively and efficiently learn to solve new tasks while maintaining their performance on previously learned tasks. Consequently, a novel framework called Lifelong Learning Router with Behavior Consolidation (LLR-BC) is proposed. LLR-BC consolidates prior knowledge effectively by aligning behaviors of the solver trained on a new task with the buffered ones in a decision-seeking way. To encourage more focus on crucial experiences, LLR-BC assigns greater consolidated weights to decisions with lower confidence. Extensive experiments on capacitated vehicle routing problems and traveling salesman problems demonstrate LLR-BC's effectiveness in training high-performance neural solvers in a lifelong learning setting, addressing the catastrophic forgetting issue, maintaining their plasticity, and improving zero-shot generalization ability.

Related papers

Keep Rehearsing and Refining: Lifelong Learning Vehicle Routing under Continually Drifting Tasks [8.939294630058729]
We study a novel lifelong learning paradigm for neural VRP solvers under continually drifting tasks over learning time steps.<n>We propose Dual Replay with Experience Enhancement (DREE), a general framework to improve learning efficiency and mitigate catastrophic forgetting under such drift.
arXiv Detail & Related papers (2026-01-30T03:37:39Z)
Mitigating Interference in the Knowledge Continuum through Attention-Guided Incremental Learning [17.236861687708096]
Attention-Guided Incremental Learning' (AGILE) is a rehearsal-based CL approach that incorporates compact task attention to effectively reduce interference between tasks. AGILE significantly improves generalization performance by mitigating task interference and outperforming rehearsal-based approaches in several CL scenarios.
arXiv Detail & Related papers (2024-05-22T20:29:15Z)
Efficient Rehearsal Free Zero Forgetting Continual Learning using Adaptive Weight Modulation [3.6683171094134805]
Continual learning involves acquiring knowledge of multiple tasks over an extended period. Most approaches to this problem seek a balance between maximizing performance on the new tasks and minimizing the forgetting of previous tasks. Our approach attempts to maximize the performance of the new task, while ensuring zero forgetting.
arXiv Detail & Related papers (2023-11-26T12:36:05Z)
Replay-enhanced Continual Reinforcement Learning [37.34722105058351]
We introduce RECALL, a replay-enhanced method that greatly improves the plasticity of existing replay-based methods on new tasks. Experiments on the Continual World benchmark show that RECALL performs significantly better than purely perfect memory replay.
arXiv Detail & Related papers (2023-11-20T06:21:52Z)
Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information. We introduce a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting. Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments.
arXiv Detail & Related papers (2023-09-15T17:10:51Z)
Online Continual Learning via the Knowledge Invariant and Spread-out Properties [4.109784267309124]
Key challenge in continual learning is catastrophic forgetting. We propose a new method, named Online Continual Learning via the Knowledge Invariant and Spread-out Properties (OCLKISP) We empirically evaluate our proposed method on four popular benchmarks for continual learning: Split CIFAR 100, Split SVHN, Split CUB200 and Split Tiny-Image-Net.
arXiv Detail & Related papers (2023-02-02T04:03:38Z)
ForkMerge: Mitigating Negative Transfer in Auxiliary-Task Learning [59.08197876733052]
Auxiliary-Task Learning (ATL) aims to improve the performance of the target task by leveraging the knowledge obtained from related tasks. Sometimes, learning multiple tasks simultaneously results in lower accuracy than learning only the target task, known as negative transfer. ForkMerge is a novel approach that periodically forks the model into multiple branches, automatically searches the varying task weights.
arXiv Detail & Related papers (2023-01-30T02:27:02Z)
Beyond Not-Forgetting: Continual Learning with Backward Knowledge Transfer [39.99577526417276]
In continual learning (CL) an agent can improve the learning performance of both a new task and old' tasks. Most existing CL methods focus on addressing catastrophic forgetting in neural networks by minimizing the modification of the learnt model for old tasks. We propose a new CL method with Backward knowlEdge tRansfer (CUBER) for a fixed capacity neural network without data replay.
arXiv Detail & Related papers (2022-11-01T23:55:51Z)
Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered. Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal. We propose to only activate and select sparse neurons for learning current and past tasks at any stage.
arXiv Detail & Related papers (2022-02-21T13:25:03Z)
Relational Experience Replay: Continual Learning by Adaptively Tuning Task-wise Relationship [54.73817402934303]
We propose Experience Continual Replay (ERR), a bi-level learning framework to adaptively tune task-wise to achieve a better stability plasticity' tradeoff. ERR can consistently improve the performance of all baselines and surpass current state-of-the-art methods.
arXiv Detail & Related papers (2021-12-31T12:05:22Z)
Parrot: Data-Driven Behavioral Priors for Reinforcement Learning [79.32403825036792]
We propose a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials. We show how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
arXiv Detail & Related papers (2020-11-19T18:47:40Z)
Importance Weighted Policy Learning and Adaptation [89.46467771037054]
We study a complementary approach which is conceptually simple, general, modular and built on top of recent improvements in off-policy learning. The framework is inspired by ideas from the probabilistic inference literature and combines robust off-policy learning with a behavior prior. Our approach achieves competitive adaptation performance on hold-out tasks compared to meta reinforcement learning baselines and can scale to complex sparse-reward scenarios.
arXiv Detail & Related papers (2020-09-10T14:16:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.