Fugu-MT Paper Translation (Abstract): Aligning Inductive Bias for Data-Efficient Generalization in State Space Models

Paper overview: Aligning Inductive Bias for Data-Efficient Generalization in State Space Models

arxiv url: http://arxiv.org/abs/2509.20789v2
Date: Fri, 26 Sep 2025 05:57:47 GMT
Status: Translation complete
Last updated in system: 2025-09-29 12:12:20.348979
Title: Aligning Inductive Bias for Data-Efficient Generalization in State Space Models
Title (reference translation): Aligning Inductive Bias for Data-Efficient Generalization in State Space Models
Authors: Qiyu Chen, Guozhang Chen
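Abstract summary: One of the next frontiers in modeling is data efficiency. The paper introduces a principled framework for the sample inefficiency caused by the fixed inductive bias of State Space Models, and proposes Task-Dependent Initialization (TDI).
Reference score (independently computed attention score): 3.9891133943171546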
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The remarkable success of large-scale models is fundamentally tied to scaling laws, yet the finite nature of high-quality data presents a looming challenge. One of the next frontiers in modeling is data efficiency: the ability to learn more from less. A model's inductive bias is a critical lever for this, but foundational sequence models like State Space Models (SSMs) rely on a fixed bias. This fixed prior is sample-inefficient when a task's underlying structure does not match. In this work, we introduce a principled framework to solve this problem. We first formalize the inductive bias of linear time-invariant SSMs through an SSM-induced kernel, mathematically and empirically proving its spectrum is directly governed by the model's frequency response. Further, we propose a method of Task-Dependent Initialization (TDI): power spectrum matching, a fast and efficient method that aligns the model's inductive bias with the task's spectral characteristics before large-scale training. Our experiments on a diverse set of real-world benchmarks show that TDI significantly improves generalization and sample efficiency, particularly in low-data regimes. This work provides a theoretical and practical tool to create more data-efficient models, a crucial step towards sustainable scaling.
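Abstract (reference translation): The remarkable success of large-scale models is fundamentally tied to scaling laws, but the finite supply of high-quality data is a looming challenge. One of the next frontiers in modeling is data efficiency. A model's inductive bias is an important lever for this, yet foundational sequence models such as State Space Models (SSMs) rely on a fixed bias. This fixed prior is sample-inefficient when it does not match the underlying structure of a task. The paper introduces a principled framework to solve this problem. It first formalizes the inductive bias of linear time-invariant SSMs through an SSM-induced kernel and shows, mathematically and empirically, that the kernel's spectrum is directly governed by the model's frequency response. It then proposes Task-Dependent Initialization (TDI), a fast and efficient method that aligns the model's inductive bias with the task's spectral characteristics before large-scale training. Experiments on a diverse set of real-world benchmarks show that TDI significantly improves generalization and sample efficiency, particularly in low-data regimes. The work provides a theoretical and practical tool for building more data-efficient models.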
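The abstract ties an SSM's inductive bias to its frequency response and describes TDI as matching that response to the task's power spectrum. The toy sketch below illustrates the idea only; it is not the authors' algorithm. It assumes a diagonal discrete-time LTI SSM, x[k+1] = diag(lam) x[k] + B u[k], y[k] = C x[k], whose frequency response is the standard H(e^{iw}) = sum_n C_n B_n / (e^{iw} - lam_n). The function names and the squared-error mismatch score are hypothetical choices made for illustration, since the abstract does not specify the matching objective.

```python
import numpy as np

def ssm_frequency_response(lam, B, C, omegas):
    """Frequency response of a diagonal discrete-time LTI SSM at angles omegas (rad/sample)."""
    z = np.exp(1j * omegas)[:, None]                      # shape (F, 1)
    return np.sum((C * B)[None, :] / (z - lam[None, :]), axis=1)

def task_power_spectrum(signals):
    """Averaged periodogram of real signals with shape (batch, length)."""
    return (np.abs(np.fft.rfft(signals, axis=1)) ** 2).mean(axis=0)

def spectral_mismatch(lam, B, C, signals):
    """Hypothetical score: squared error between the normalised model and task spectra."""
    length = signals.shape[1]
    omegas = 2 * np.pi * np.fft.rfftfreq(length)          # 0 .. pi rad/sample
    model = np.abs(ssm_frequency_response(lam, B, C, omegas)) ** 2
    task = task_power_spectrum(signals)
    model = model / model.sum()                           # compare spectral shape only
    task = task / task.sum()
    return float(np.sum((model - task) ** 2))

# Toy task: noisy sinusoids at 0.25 cycles/sample with random phases.
rng = np.random.default_rng(0)
t = np.arange(256)
phases = rng.uniform(0.0, 2 * np.pi, size=(32, 1))
signals = np.sin(2 * np.pi * 0.25 * t[None, :] + phases)
signals = signals + 0.1 * rng.standard_normal((32, 256))

def conjugate_pole_pair(radius, cycles_per_sample):
    """Two complex-conjugate poles that resonate at the given frequency."""
    angle = 2 * np.pi * cycles_per_sample
    return radius * np.exp(1j * np.array([angle, -angle]))

ones = np.ones(2)
aligned = spectral_mismatch(conjugate_pole_pair(0.95, 0.25), ones, ones, signals)
misaligned = spectral_mismatch(conjugate_pole_pair(0.95, 0.05), ones, ones, signals)
print(f"aligned init mismatch:    {aligned:.3f}")
print(f"misaligned init mismatch: {misaligned:.3f}")
```

In this toy setting the initialization whose poles resonate at the task's dominant frequency scores a lower mismatch than the one tuned to a different frequency. That is the intuition behind choosing an initialization from the task's spectrum before training.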

Related papers