Fugu-MT Paper Translation (Abstract): TSN-Affinity: Similarity-Driven Parameter Reuse for Continual Offline Reinforcement Learning

Paper abstract: TSN-Affinity: Similarity-Driven Parameter Reuse for Continual Offline Reinforcement Learning

arxiv url: http://arxiv.org/abs/2604.25898v1
Date: Tue, 28 Apr 2026 17:41:04 GMT
Status: Translation complete
Last updated in system: 2026-04-29 16:49:17.980855
Title: TSN-Affinity: Similarity-Driven Parameter Reuse for Continual Offline Reinforcement Learning
Title (reference translation): TSN-Affinity: Similarity-Driven Parameter Reuse in Continual Offline Reinforcement Learning
Authors: Dominik Żurek, Kamil Faber, Marcin Pietron, Paweł Gajewski, Roberto Corizzo
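Abstract summary: Continual offline reinforcement learning (CORL) aims to learn a sequence of tasks from datasets collected over time while preserving performance on previously learned tasks. This paper proposes TSN-Affinity, a novel CORL method based on TinySubNetworks and Decision Transformer. The approach is evaluated on benchmarks based on Atari games and on simulations of manipulation tasks with the Franka Emika Panda robotic arm.
Reference score (independently computed attention score): 5.680044533158534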
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Continual offline reinforcement learning (CORL) aims to learn a sequence of tasks from datasets collected over time while preserving performance on previously learned tasks. This setting corresponds to domains where new tasks arise over time, but adapting the model in live environment interactions is expensive, risky, or impossible. However, CORL inherits the dual difficulty of offline reinforcement learning and adapting while preventing catastrophic forgetting. Replay-based continual learning approaches remain a strong baseline but incur memory overhead and suffer from a distribution mismatch between replayed samples and newly learned policies. At the same time, architectural continual learning methods have shown strong potential in supervised learning but remain underexplored in CORL. In this work, we propose TSN-Affinity, a novel CORL method based on TinySubNetworks and Decision Transformer. The method enables task-specific parameterization and controlled knowledge sharing through an RL-aware reuse strategy that routes tasks according to action compatibility and latent similarity. We evaluate the approach on benchmarks based on Atari games and simulations of manipulation tasks with the Franka Emika Panda robotic arm, covering both discrete and continuous control. Results show strong retention from sparse SubNetworks, with routing further improving multi-task performance. Our findings suggest that similarity-guided architectural reuse is a strong and viable alternative to replay-based strategies in a CORL setting. Our code is available at: https://github.com/anonymized-for-submission123/tsn-affinity.
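Abstract (reference translation): Continual offline reinforcement learning (CORL) aims to learn a sequence of tasks from datasets collected over time while maintaining performance on previously learned tasks. This setting corresponds to domains where new tasks arise over time, but where adapting the model through live environment interaction is expensive, risky, or impossible. However, CORL inherits the dual difficulty of offline reinforcement learning and of adapting while preventing catastrophic forgetting. Replay-based continual learning approaches remain a strong baseline, but they incur memory overhead and suffer from a distribution mismatch between replayed samples and newly learned policies. At the same time, architectural continual learning methods have shown strong potential in supervised learning but remain underexplored in CORL. This paper proposes TSN-Affinity, a novel CORL method based on TinySubNetworks and Decision Transformer. The method achieves task-specific parameterization and controlled knowledge sharing through an RL-aware reuse strategy that routes tasks according to action compatibility and latent similarity. The approach is evaluated on benchmarks based on Atari games and on simulations of manipulation tasks with the Franka Emika Panda robotic arm, covering both discrete and continuous control. The results show strong retention from sparse SubNetworks, with routing further improving multi-task performance. These findings suggest that similarity-guided architectural reuse is a strong and viable alternative to replay-based strategies in a CORL setting. The code is available at https://github.com/anonymized-for-submission123/tsn-affinity.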
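
The abstract says that tasks are routed according to action compatibility and latent similarity, but it gives no implementation details. The Python sketch below is only one plausible reading of that idea and is not the authors' method. Every name in it (`TaskProfile`, `route_task`, the cosine similarity measure, the `threshold` value) is an assumption made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class TaskProfile:
    """Hypothetical task summary; the fields are assumptions, not from the paper."""
    name: str
    action_kind: str         # "discrete" or "continuous"
    action_dim: int          # number of actions, or length of the action vector
    latent: Sequence[float]  # e.g. a mean embedding of the task's offline dataset


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def actions_compatible(a: TaskProfile, b: TaskProfile) -> bool:
    """Assumed compatibility rule: same kind and same size of action space."""
    return a.action_kind == b.action_kind and a.action_dim == b.action_dim


def route_task(
    new_task: TaskProfile,
    learned: Sequence[TaskProfile],
    threshold: float = 0.8,  # assumed cut-off, chosen arbitrarily for the sketch
) -> Optional[TaskProfile]:
    """Return the earlier task whose parameters the new task would reuse.

    Returns None when no action-compatible task is similar enough, in which
    case a fresh sub-network would be allocated instead.
    """
    best: Optional[TaskProfile] = None
    best_sim = threshold
    for prev in learned:
        if not actions_compatible(new_task, prev):
            continue  # never share parameters across incompatible action spaces
        sim = cosine_similarity(new_task.latent, prev.latent)
        if sim >= best_sim:
            best, best_sim = prev, sim
    return best


learned_tasks = [
    TaskProfile("task_a", "discrete", 6, [1.0, 0.0]),
    TaskProfile("task_b", "continuous", 4, [0.9, 0.1]),
]
similar_task = TaskProfile("task_c", "discrete", 6, [0.9, 0.2])
unrelated_task = TaskProfile("task_d", "discrete", 6, [0.0, 1.0])

reuse_source = route_task(similar_task, learned_tasks)
no_source = route_task(unrelated_task, learned_tasks)
```

Here `task_c` shares an action space with `task_a` and has a nearby latent vector, so it is routed to reuse `task_a`. `task_d` has a compatible action space but a dissimilar latent vector, so the sketch returns `None`, which stands for allocating a new sub-network.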

Related papers