Fugu-MT Paper Translation (Abstract): OSDTW: Optimal Shared Depth and Task Weighting for Long-Tailed Recognition

Paper abstract: OSDTW: Optimal Shared Depth and Task Weighting for Long-Tailed Recognition

arxiv url: http://arxiv.org/abs/2605.24969v1
Date: Sun, 24 May 2026 09:48:02 GMT
Status: Translation complete
Last updated in system: 2026-05-26 19:50:18.552774
Title: OSDTW: Optimal Shared Depth and Task Weighting for Long-Tailed Recognition
Title (reference translation): OSDTW: Optimal Shared Depth and Task Weighting for Long-Tailed Recognition
Authors: Chang Chu, Qingyue Zhang, Shao-Lun Huang, Junxiong Zheng,
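Abstract summary: Long-tailed recognition suffers from a persistent head-tail trade-off. This paper proposes a principled task-decomposition framework that partitions the original single-label recognition problem into a head task and a tail task. It shows that the resulting Kullback-Leibler divergence-based generalization error can be written as a sum of task-wise terms up to an additive constant.
Reference score (independently computed attention score): 12.869661291857843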
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Long-tailed recognition suffers from a persistent head-tail trade-off: improving tail performance often degrades head accuracy and can increase training instability. Despite strong empirical results from re-weighting, decoupled training, and multi-expert methods, key design choices about representation sharing between head and tail classes and supervision weighting across class groups remain largely heuristic. In this work, we propose OSDTW, a principled task-decomposition framework that partitions the original single-label recognition problem into a head task and a tail task, implemented with a shared encoder and task-specific decoders. To handle the mutual exclusivity and statistical dependence between the two label groups, we introduce a factorized model and show that the resulting Kullback-Leibler divergence-based generalization error can be written as the sum of task-wise terms up to an additive constant, yielding a well-defined task-wise objective. We further develop a three-stage training pipeline: independent task training to estimate task-wise optima and the Fisher information matrix, weighted joint training to learn a shared encoder, and branch assembly to construct the final decoupled model. Under a block-diagonal Fisher approximation, we derive a computable second-order expansion of the expected generalization error, decomposing it into encoder variance, encoder bias, and decoder variance. This bias-variance decomposition provides a computable proxy to select the shared depth and task weights, enabling efficient hyper-parameter search. Experiments on standard long-tailed benchmarks demonstrate the effectiveness of the proposed approach over strong baselines.
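The abstract states that a KL-based error can be written as a sum of task-wise terms. The paper's exact factorization is not given here, so the sketch below only illustrates a standard identity that produces a split of this kind: the chain rule for KL divergence over a head/tail partition of the label set. The function names, the six-class example, and the choice of partition are all assumptions made for illustration, not the authors' construction.

```python
import numpy as np


def kl(p, q):
    """KL(p || q) for discrete distributions with strictly positive entries."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))


def grouped_kl(p, q, head_idx, tail_idx):
    """Split KL(p || q) into a group-level term plus per-group conditional terms.

    Illustrative only: this is the generic KL chain rule, not necessarily
    the factorized model used in the paper.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Probability mass that each distribution assigns to the head and tail groups.
    p_group = np.array([p[head_idx].sum(), p[tail_idx].sum()])
    q_group = np.array([q[head_idx].sum(), q[tail_idx].sum()])
    terms = {"group": kl(p_group, q_group)}
    # Within-group conditional KL, weighted by the reference group mass.
    for name, idx, mass_p, mass_q in (
        ("head", head_idx, p_group[0], q_group[0]),
        ("tail", tail_idx, p_group[1], q_group[1]),
    ):
        terms[name] = mass_p * kl(p[idx] / mass_p, q[idx] / mass_q)
    return terms


# Hypothetical six-class problem: classes 0-2 are "head", 3-5 are "tail".
rng = np.random.default_rng(0)
p = rng.dirichlet(np.ones(6))
q = rng.dirichlet(np.ones(6))
head_idx = [0, 1, 2]
tail_idx = [3, 4, 5]

total = kl(p, q)
terms = grouped_kl(p, q, head_idx, tail_idx)
# The full KL equals the sum of the group term and the two per-group terms.
assert np.isclose(total, sum(terms.values()))
```

In this toy identity the group-level term plays the role of a coupling between the two label groups; how the paper absorbs such a term into an additive constant or into the task-wise objectives is not specified in the abstract.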
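Abstract (reference translation): Long-tailed recognition suffers from a persistent head-tail trade-off: improving tail performance often degrades head accuracy and can make training less stable. Despite strong empirical results from re-weighting, decoupled training, and multi-expert methods, key design choices about representation sharing between head and tail classes and about supervision weighting across class groups remain largely heuristic. This work proposes OSDTW, a principled task-decomposition framework that partitions the original single-label recognition problem into a head task and a tail task, implemented with a shared encoder and task-specific decoders. To handle the mutual exclusivity and statistical dependence between the two label groups, a factorized model is introduced, and the resulting Kullback-Leibler divergence-based generalization error is shown to be expressible as a sum of task-wise terms up to an additive constant, which gives a well-defined task-wise objective. A three-stage training pipeline is then developed: independent task training to estimate the task-wise optima and the Fisher information matrix, weighted joint training to learn a shared encoder, and branch assembly to build the final decoupled model. Under a block-diagonal Fisher approximation, a computable second-order expansion of the expected generalization error is derived and decomposed into encoder variance, encoder bias, and decoder variance. This bias-variance decomposition provides a computable proxy for selecting the shared depth and the task weights, enabling efficient hyper-parameter search. Experiments on standard long-tailed benchmarks show the effectiveness of the proposed approach over strong baselines.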
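The shared-encoder, task-specific-decoder layout and the weighted joint objective described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the class `SharedEncoderTwoHeads`, the single-layer encoder, the extra "other" slot each decoder uses for out-of-group labels, and the weights `w_head` and `w_tail` are all hypothetical choices made to keep the example self-contained. It computes losses only and does not model the three-stage pipeline or the shared-depth selection.

```python
import numpy as np


def softmax_cross_entropy(logits, targets):
    """Mean cross-entropy of integer targets under softmax(logits)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(targets)), targets].mean())


def task_targets(labels, group_classes):
    """Map global labels to task-local targets.

    Assumption: labels outside the group are sent to one extra 'other' slot
    (the last index). The paper may define its tasks differently.
    """
    local = np.full(labels.shape, len(group_classes), dtype=int)
    for slot, cls in enumerate(group_classes):
        local[labels == cls] = slot
    return local


class SharedEncoderTwoHeads:
    """Hypothetical model: one shared encoder layer and two linear decoders."""

    def __init__(self, in_dim, hidden_dim, head_classes, tail_classes, seed=0):
        rng = np.random.default_rng(seed)
        self.head_classes = list(head_classes)
        self.tail_classes = list(tail_classes)
        self.W_enc = rng.normal(scale=0.1, size=(in_dim, hidden_dim))
        self.W_head = rng.normal(scale=0.1, size=(hidden_dim, len(self.head_classes) + 1))
        self.W_tail = rng.normal(scale=0.1, size=(hidden_dim, len(self.tail_classes) + 1))

    def task_losses(self, x, labels):
        z = np.tanh(x @ self.W_enc)  # representation shared by both tasks
        head_loss = softmax_cross_entropy(
            z @ self.W_head, task_targets(labels, self.head_classes))
        tail_loss = softmax_cross_entropy(
            z @ self.W_tail, task_targets(labels, self.tail_classes))
        return head_loss, tail_loss

    def joint_loss(self, x, labels, w_head, w_tail):
        """Weighted sum of the two task losses (the joint-training objective here)."""
        head_loss, tail_loss = self.task_losses(x, labels)
        return w_head * head_loss + w_tail * tail_loss


# Toy data: classes 0 and 1 are frequent ("head"), classes 2 and 3 are rare ("tail").
rng = np.random.default_rng(1)
x = rng.normal(size=(8, 4))
labels = np.array([0, 0, 1, 1, 0, 2, 3, 1])
model = SharedEncoderTwoHeads(in_dim=4, hidden_dim=5,
                              head_classes=[0, 1], tail_classes=[2, 3])
head_loss, tail_loss = model.task_losses(x, labels)
loss = model.joint_loss(x, labels, w_head=0.3, w_tail=0.7)
```

In a fuller version, the task weights and the number of shared layers would be the quantities chosen by the bias-variance proxy mentioned in the abstract; here they are fixed by hand.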
