Fugu-MT Paper Translation (Abstract): Enhancing Pre-trained Representation Classifiability can Boost its Interpretability

Paper abstract: Enhancing Pre-trained Representation Classifiability can Boost its Interpretability

arxiv url: http://arxiv.org/abs/2510.24105v1
Date: Tue, 28 Oct 2025 06:21:06 GMT
Status: Translation complete
Last updated in system: 2025-10-29 15:35:36.812326
Title: Enhancing Pre-trained Representation Classifiability can Boost its Interpretability
Authors: Shufan Shen, Zhaobo Qi, Junshu Sun, Qingming Huang, Qi Tian, Shuhui Wang
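Abstract summary: The paper quantifies representation interpretability by leveraging its correlation with the ratio of interpretable semantics within the representations. It proposes the Inherent Interpretability Score (IIS), which evaluates the information loss, measures the ratio of interpretable semantics, and quantifies representation interpretability.
Reference score (attention score computed independently by this site): 112.296393156262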
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: The visual representation of a pre-trained model prioritizes the classifiability on downstream tasks, while the widespread applications for pre-trained visual models have posed new requirements for representation interpretability. However, it remains unclear whether the pre-trained representations can achieve high interpretability and classifiability simultaneously. To answer this question, we quantify the representation interpretability by leveraging its correlation with the ratio of interpretable semantics within the representations. Given the pre-trained representations, only the interpretable semantics can be captured by interpretations, whereas the uninterpretable part leads to information loss. Based on this fact, we propose the Inherent Interpretability Score (IIS) that evaluates the information loss, measures the ratio of interpretable semantics, and quantifies the representation interpretability. In the evaluation of the representation interpretability with different classifiability, we surprisingly discover that the interpretability and classifiability are positively correlated, i.e., representations with higher classifiability provide more interpretable semantics that can be captured in the interpretations. This observation further supports two benefits to the pre-trained representations. First, the classifiability of representations can be further improved by fine-tuning with interpretability maximization. Second, with the classifiability improvement for the representations, we obtain predictions based on their interpretations with less accuracy degradation. The discovered positive correlation and corresponding applications show that practitioners can unify the improvements in interpretability and classifiability for pre-trained vision models. Codes are available at https://github.com/ssfgunner/IIS.

Related papers