Fugu-MT Paper Translation (Summary): The Standard Interpretable Model: A general theory of interpretable machine learning to deductively design interpretable methods using Lagrangian mechanics

Paper summary: The Standard Interpretable Model: A general theory of interpretable machine learning to deductively design interpretable methods using Lagrangian mechanics

arxiv url: http://arxiv.org/abs/2606.12289v1
Date: Wed, 10 Jun 2026 16:26:22 GMT
Status: Translation complete
Last updated in system: 2026-06-11 16:42:38.562741
Title: The Standard Interpretable Model: A general theory of interpretable machine learning to deductively design interpretable methods using Lagrangian mechanics
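Title (reference translation): The Standard Interpretable Model: A general theory of interpretable machine learning for designing interpretable methods using Lagrangian mechanics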
Authors: Pietro Barbiero, Giovanni De Felice, Mateo Espinosa Zarlenga, Francesco Giannini, Filippo Bonchi, Mateja Jamnik, Giuseppe Marra, Ruggero Noris
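Abstract summary: We introduce the Standard Interpretable Model (SIM), grounded in Lagrangian mechanics, which enables the deductive design of interpretable methods. We empirically show that the SIM identifies and solves limitations of existing methods. The SIM offers pedagogical grounding for interpretability curricula and may shift the scientific community's perspective on a discipline that has long been fragmented.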
Reference score (independently computed attention score): 37.30653132093503
License: http://creativecommons.org/licenses/by/4.0/
Abstract: As Artificial Intelligence models grow in complexity, interpretability has become an indispensable tool for understanding, debugging, and controlling their computations. However, interpretability lacks general theories to deductively design interpretable methods. This gap between theories and methods results in a fragmented literature and inconsistent evaluation protocols. To fill this gap, we introduce the Standard Interpretable Model (SIM), a general theory grounded in Lagrangian mechanics that enables the deductive design of interpretable methods. Specifically, the SIM summarises, in a set of premises, what interpretability is for a target user. From these premises, the SIM systematically derives interpretability symmetries and corresponding constraints, which shape the landscape of a Lagrangian whose minima correspond to optimal interpretable models. To reach the minima, one can either update the parameter values of an opaque model to make it more interpretable or compile constraints into an interpretable architecture. We empirically show that the SIM identifies and solves limitations of existing methods (including traditional, concept-based, and mechanistic interpretability), highlights underexplored research directions, and informs the design of core programming interfaces. Beyond being a research method, the deductive nature of the SIM offers pedagogical grounding for interpretability curricula and may shift the scientific community's perspective of a discipline that has long been fragmented.
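Abstract (reference translation): As artificial intelligence models grow in complexity, interpretability has become an indispensable tool for understanding, debugging, and controlling their computations. Yet interpretability lacks general theories for deductively designing interpretable methods. This gap between theory and method leads to a fragmented literature and inconsistent evaluation protocols. To close it, we introduce the Standard Interpretable Model (SIM), a general theory grounded in Lagrangian mechanics that enables the deductive design of interpretable methods. Specifically, the SIM summarises, in a set of premises, what interpretability is for a target user. From these premises, the SIM systematically derives interpretability symmetries and the corresponding constraints, which shape the landscape of a Lagrangian whose minima correspond to optimal interpretable models. To reach the minima, one can either update the parameter values of an opaque model to make it more interpretable, or compile the constraints into an interpretable architecture. We empirically show that the SIM identifies and solves limitations of existing methods (including traditional, concept-based, and mechanistic interpretability), highlights underexplored research directions, and informs the design of core programming interfaces. Beyond being a research method, the deductive nature of the SIM offers pedagogical grounding for interpretability curricula and may shift the scientific community's perspective on a discipline that has long been fragmented.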
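
The abstract describes constraints that shape a Lagrangian whose minima correspond to optimal interpretable models, reached by updating an opaque model's parameters. The following is a minimal sketch of that general idea using a textbook Lagrange-multiplier formulation, not the paper's actual construction, which is not given in the abstract and may differ substantially. The quadratic task loss, the single constraint "the second weight must be zero", and all function names are hypothetical choices made only for illustration.

```python
# Illustrative only: a generic constrained-optimisation Lagrangian
#   L(w, lam) = loss(w) + lam * c(w)
# minimised in w (gradient descent) and maximised in lam (gradient ascent).
# The loss, the constraint, and every name here are assumptions, not taken
# from the paper.

def task_loss_grad(w):
    # Gradient of a hypothetical task loss (w[0] - 3)^2 + (w[1] - 2)^2.
    return [2.0 * (w[0] - 3.0), 2.0 * (w[1] - 2.0)]


def constraint(w):
    # Hypothetical interpretability constraint c(w) = w[1], required to be 0
    # (for example, "the model must not rely on the second feature").
    return w[1]


def constraint_grad(w):
    # Gradient of c(w) = w[1] with respect to w.
    return [0.0, 1.0]


def minimise_lagrangian(steps=2000, lr=0.1):
    w = [0.0, 0.0]   # parameters of the stand-in "opaque" model
    lam = 0.0        # Lagrange multiplier for the constraint
    for _ in range(steps):
        g_loss = task_loss_grad(w)
        g_con = constraint_grad(w)
        c = constraint(w)  # evaluated before w is updated
        # Descent step on the parameters: grad_w L = grad loss + lam * grad c.
        w = [wi - lr * (gl + lam * gc) for wi, gl, gc in zip(w, g_loss, g_con)]
        # Ascent step on the multiplier: dL/dlam = c(w).
        lam = lam + lr * c
    return w, lam


if __name__ == "__main__":
    w, lam = minimise_lagrangian()
    print("parameters:", w)
    print("multiplier:", lam)
```

In this toy setting the iteration settles where the constraint holds (second weight at zero) while the unconstrained weight still minimises the task loss. This corresponds to the "update the parameter values" route in the abstract; the alternative route, compiling the constraint into the architecture, would amount to removing the second weight from the model altogether.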

Related papers