Fugu-MT Paper Translation (Abstract): Learning Task-Agnostic Representations through Multi-Teacher Distillation

Paper overview: Learning Task-Agnostic Representations through Multi-Teacher Distillation

arxiv url: http://arxiv.org/abs/2510.18680v1
Date: Tue, 21 Oct 2025 14:36:33 GMT
Status: Translation complete
Last updated in system: 2025-10-25 03:08:13.73537
Title: Learning Task-Agnostic Representations through Multi-Teacher Distillation
Title (reference translation): 多教師蒸留によるタスク非依存表現の学習
Authors: Philippe Formont, Maxime Darrin, Banafsheh Karimian, Jackie CK Cheung, Eric Granger, Ismail Ben Ayed, Mohammadhadi Shateri, Pablo Piantanida
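Abstract summary: This paper proposes a task-agnostic framework based on a "majority vote" objective function. It shows that this function is bounded by the mutual information between the student's and the teachers' embeddings. The proposed method effectively leverages teacher diversity and improves performance on diverse downstream tasks.
Reference score (independently computed attention score): 59.488314181423284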
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Casting complex inputs into tractable representations is a critical step across various fields. Diverse embedding models emerge from differences in architectures, loss functions, input modalities and datasets, each capturing unique aspects of the input. Multi-teacher distillation leverages this diversity to enrich representations but often remains tailored to specific tasks. In this paper, we introduce a task-agnostic framework based on a "majority vote" objective function. We demonstrate that this function is bounded by the mutual information between student and teachers' embeddings, leading to a task-agnostic distillation loss that eliminates dependence on task-specific labels or prior knowledge. Our evaluations across text, vision models, and molecular modeling show that our method effectively leverages teacher diversity, resulting in representations enabling better performance for a wide range of downstream tasks such as classification, clustering, or regression. Additionally, we train and release state-of-the-art embedding models, enhancing downstream performance in various modalities.
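Abstract (reference translation): Casting complex inputs into tractable representations is an important step in many fields. Diverse embedding models arise from differences in architectures, loss functions, input modalities, and datasets, and each captures unique aspects of the input. Multi-teacher distillation exploits this diversity to enrich representations, but it often remains tailored to specific tasks. This paper proposes a task-agnostic framework based on a "majority vote" objective function. It shows that this function is bounded by the mutual information between the student's and the teachers' embeddings, which leads to a task-agnostic distillation loss that removes the dependence on task-specific labels or prior knowledge. Evaluations on text, vision models, and molecular modeling indicate that the method effectively leverages teacher diversity and achieves better performance on a wide range of downstream tasks such as classification, clustering, and regression. In addition, the authors train and release state-of-the-art embedding models that improve downstream performance across various modalities.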
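
The abstract does not spell out the loss, so the following is only an illustrative sketch, not the authors' implementation. It assumes one standard way to turn a mutual-information bound into a label-free objective: the variational lower bound I(S; T) >= H(T) + E[log q(T | S)], under which minimizing the negative log-likelihood of each teacher embedding given the student embedding raises a lower bound on their mutual information. The diagonal-Gaussian heads, the function names, and the toy dimensions are all assumptions made for illustration.

```python
import math


def gaussian_nll(target, mean, log_var):
    """Negative log-likelihood of `target` under a diagonal Gaussian."""
    return 0.5 * sum(
        lv + (t - m) ** 2 / math.exp(lv) + math.log(2 * math.pi)
        for t, m, lv in zip(target, mean, log_var)
    )


def linear(weights, bias, x):
    """Compute W x + b, with W stored as a list of rows."""
    return [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(weights, bias)
    ]


def multi_teacher_loss(student_emb, teacher_embs, heads):
    """Sum over teachers k of -log q_k(t_k | s).

    Each (hypothetical) head maps the student embedding to the mean of a
    Gaussian over one teacher's embedding; no task labels are involved.
    """
    total = 0.0
    for t_k, head in zip(teacher_embs, heads):
        mean = linear(head["W"], head["b"], student_emb)
        total += gaussian_nll(t_k, mean, head["log_var"])
    return total


if __name__ == "__main__":
    # Toy setup: a 2-d student embedding and two teachers (2-d and 3-d).
    student = [1.0, -1.0]
    teachers = [[1.0, -1.0], [0.5, 0.0, -0.5]]
    heads = [
        {"W": [[1.0, 0.0], [0.0, 1.0]], "b": [0.0, 0.0],
         "log_var": [0.0, 0.0]},
        {"W": [[0.5, 0.0], [0.0, 0.0], [0.0, 0.5]], "b": [0.0, 0.0, 0.0],
         "log_var": [0.0, 0.0, 0.0]},
    ]
    print(multi_teacher_loss(student, teachers, heads))
```

In a real training loop the student encoder and the heads would be optimized jointly by gradient descent on this sum; the pure-Python version above only evaluates the loss for fixed parameters.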

Related papers