Fugu-MT Paper Translation (Abstract): SPREAD: Subspace Representation Distillation for Lifelong Imitation Learning

Paper abstract: SPREAD: Subspace Representation Distillation for Lifelong Imitation Learning

arxiv url: http://arxiv.org/abs/2603.08763v1
Date: Mon, 09 Mar 2026 03:38:42 GMT
Status: Translation complete
Last updated in system: 2026-03-11 15:25:23.734277
Title: SPREAD: Subspace Representation Distillation for Lifelong Imitation Learning
Title (reference translation): SPREAD: Subspace Representation Distillation for Lifelong Imitation Learning
Authors: Kaushik Roy, Giovanni D'urso, Nicholas Lawrance, Brendan Tidd, Peyman Moghadam
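Abstract summary: A key challenge in lifelong imitation learning is enabling agents to acquire new skills from expert demonstrations while retaining prior knowledge. Existing distillation methods, which rely on L2-norm feature matching in raw feature space, are sensitive to noise and high-dimensional variability. We introduce SPREAD, a geometry-preserving framework that uses singular value decomposition to align policy representations across tasks within low-rank subspaces.
Reference score (independently computed attention measure): 11.023696977257883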
License: http://creativecommons.org/licenses/by/4.0/
Abstract: A key challenge in lifelong imitation learning (LIL) is enabling agents to acquire new skills from expert demonstrations while retaining prior knowledge. This requires preserving the low-dimensional manifolds and geometric structures that underlie task representations across sequential learning. Existing distillation methods, which rely on L2-norm feature matching in raw feature space, are sensitive to noise and high-dimensional variability, often failing to preserve intrinsic task manifolds. To address this, we introduce SPREAD, a geometry-preserving framework that employs singular value decomposition (SVD) to align policy representations across tasks within low-rank subspaces. This alignment maintains the underlying geometry of multimodal features, facilitating stable transfer, robustness, and generalization. Additionally, we propose a confidence-guided distillation strategy that applies a Kullback-Leibler divergence loss restricted to the top-M most confident action samples, emphasizing reliable modes and improving optimization stability. Experiments on LIBERO, a lifelong imitation learning benchmark, show that SPREAD substantially improves knowledge transfer, mitigates catastrophic forgetting, and achieves state-of-the-art performance.
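Abstract (reference translation): A key challenge in lifelong imitation learning (LIL) is enabling agents to acquire new skills from expert demonstrations while retaining prior knowledge. This requires preserving the low-dimensional manifolds and geometric structures underlying task representations throughout sequential learning. Existing distillation methods, which rely on L2-norm feature matching in raw feature space, are sensitive to noise and high-dimensional variability and often fail to preserve the intrinsic task manifolds. To address this problem, SPREAD is introduced as a geometry-preserving framework that uses singular value decomposition (SVD) to align policy representations across tasks within low-rank subspaces. This alignment maintains the underlying geometry of multimodal features and promotes stable transfer, robustness, and generalization. In addition, a confidence-guided distillation strategy is proposed that restricts the Kullback-Leibler divergence loss to the top-M most confident action samples, emphasizing reliable modes and improving optimization stability. Experiments on LIBERO, a lifelong imitation learning benchmark, show that SPREAD substantially improves knowledge transfer, mitigates catastrophic forgetting, and achieves state-of-the-art performance.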
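
Illustrative sketch (not from the paper): The abstract names two ingredients, SVD-based alignment within low-rank subspaces and a KL loss restricted to the top-M most confident action samples, but it does not give their exact form. The NumPy code below is one possible reading under stated assumptions. The function names (`subspace_basis`, `subspace_distill_loss`, `top_m_kl`), the projector-distance loss, and the use of the old policy's log-probabilities as the confidence measure are all hypothetical choices made for illustration and may differ from what the authors actually do.

```python
import numpy as np


def subspace_basis(features: np.ndarray, rank: int) -> np.ndarray:
    """Return a (dim, rank) orthonormal basis of the top-`rank` subspace.

    `features` has shape (n_samples, dim). The basis is formed from the
    leading right singular vectors of the mean-centered feature matrix.
    """
    centered = features - features.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:rank].T


def subspace_distill_loss(old_feats: np.ndarray, new_feats: np.ndarray, rank: int) -> float:
    """Squared Frobenius distance between the two subspace projectors.

    Assumption: this projector distance stands in for the paper's
    subspace alignment loss, which the abstract does not spell out.
    """
    u_old = subspace_basis(old_feats, rank)
    u_new = subspace_basis(new_feats, rank)
    p_old = u_old @ u_old.T  # projector onto the old policy's subspace
    p_new = u_new @ u_new.T  # projector onto the new policy's subspace
    return float(np.sum((p_old - p_new) ** 2))


def _log_softmax(x: np.ndarray) -> np.ndarray:
    shifted = x - x.max()
    return shifted - np.log(np.exp(shifted).sum())


def top_m_kl(teacher_logp: np.ndarray, student_logp: np.ndarray, m: int) -> float:
    """KL(teacher || student) over the M samples the teacher rates highest.

    Both inputs hold log-probabilities of the same N sampled actions.
    Assumption: "confidence" means the teacher's log-probability, and both
    distributions are renormalized over the selected M samples.
    """
    top = np.argsort(teacher_logp)[-m:]  # indices of the M most confident samples
    log_p = _log_softmax(teacher_logp[top])
    log_q = _log_softmax(student_logp[top])
    return float(np.sum(np.exp(log_p) * (log_p - log_q)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    old = rng.normal(size=(64, 16)) * np.linspace(3.0, 0.1, 16)
    rescaled = 2.0 * old  # same geometry, different scale
    print("raw L2 feature loss:", np.mean((old - rescaled) ** 2))
    print("subspace loss:", subspace_distill_loss(old, rescaled, rank=4))
```

In this sketch, rescaling the features leaves the subspace loss at essentially zero while the raw L2 feature-matching loss is large, which is one way to picture the abstract's claim that raw-feature matching is sensitive to variation that does not change the underlying geometry. Likewise, `top_m_kl` ignores whatever the student does on samples outside the teacher's top M.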

Related papers