Fugu-MT Paper Translation (Abstract): Mechanistically Interpretable Neural Encoding Reveals Fine-Grained Functional Selectivity in Human Visual Cortex

Paper abstract: Mechanistically Interpretable Neural Encoding Reveals Fine-Grained Functional Selectivity in Human Visual Cortex

arxiv url: http://arxiv.org/abs/2605.16468v1
Date: Fri, 15 May 2026 11:28:10 GMT
Status: Translation complete
Last updated in system: 2026-05-19 17:57:46.498376
Title: Mechanistically Interpretable Neural Encoding Reveals Fine-Grained Functional Selectivity in Human Visual Cortex
Authors: Idan Daniel Grosbard, Mor Geva, Galit Yovel
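Abstract summary: A central goal in understanding human vision is to uncover the visual features that drive neuronal activity. The authors introduce Mechanistically Interpretable Neural Encoding (MINE), which localizes the features that drive millimeter-scale (voxel-level) activity. MINE predicts each voxel's response using language-aligned image representations and produces semantically interpretable descriptions of the features critical for the voxel's activation.
Reference score (attention score computed independently by the site): 23.760723597912776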
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: A central goal in understanding human vision is to uncover the visual features that drive neuronal activity. A growing body of work has used artificial neural networks as encoding models to predict cortical responses to natural images, revealing the visual content that activates category-selective regions. However, existing approaches are largely correlational and treat the encoder as a black box, leaving open which image features drive each voxel's response. We introduce Mechanistically Interpretable Neural Encoding (MINE), a framework that opens this black box by applying mechanistic-interpretability tools to localize the features within natural images that drive millimeter-scale (voxel-level) activity. MINE predicts each voxel's response using language-aligned image representations, and produces semantically interpretable descriptions of the features critical for the voxel's activation. We further generalize these per-image features into per-voxel functional profiles. To validate the per-image descriptions, we show they are sufficient to generate images that elicit voxel responses matching the responses to the original images, more accurately than images generated from random or low-attribution controls. Moreover, counterfactually inserting or removing the predicted features from images shifts activation in the expected direction, providing causal evidence. Counterfactual editing guided by the per-voxel activation profiles produces even stronger activation shifts, indicating that the profiles faithfully capture each voxel's selectivity. Finally, we apply MINE to well-studied category-selective brain regions, showing it recovers their known categorical preferences while revealing fine-grained unique voxel structure within each region. Overall, our results establish mechanistic interpretability as a path to discover and causally validate fine-grained hypotheses about neural function.
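The abstract does not specify MINE's encoder, attribution method, or image-editing procedure, so the following is only a toy sketch of the removal-style check it describes, under loud assumptions: the encoder is replaced by a hypothetical linear model over a three-dimensional feature vector, attribution is plain weight-times-input, and "removing" a feature means setting it to zero. All names and numbers are illustrative and do not come from the paper.

```python
def predict_response(weights, features):
    """Predicted voxel response of a toy linear encoder (assumed stand-in)."""
    return sum(w * x for w, x in zip(weights, features))


def attributions(weights, features):
    """Weight-times-input attribution, which is exact for a linear model."""
    return [w * x for w, x in zip(weights, features)]


def ablate(features, index):
    """Return a copy of the features with one dimension zeroed out."""
    edited = list(features)
    edited[index] = 0.0
    return edited


def removal_shifts(weights, features):
    """Compare removing the top-attributed feature with a low-attribution control.

    Returns (shift_top, shift_low): the change in predicted response after
    zeroing the highest-attribution feature and the feature whose attribution
    has the smallest magnitude, respectively.
    """
    scores = attributions(weights, features)
    baseline = predict_response(weights, features)
    top = max(range(len(scores)), key=lambda i: scores[i])
    low = min(range(len(scores)), key=lambda i: abs(scores[i]))
    shift_top = predict_response(weights, ablate(features, top)) - baseline
    shift_low = predict_response(weights, ablate(features, low)) - baseline
    return shift_top, shift_low


# Hypothetical voxel weights and image features, chosen only for illustration.
example_weights = [2.0, -1.0, 0.1]
example_features = [1.0, 0.5, 1.0]
example_shift_top, example_shift_low = removal_shifts(example_weights, example_features)
```

In this toy setting, zeroing the top-attributed feature lowers the predicted response far more than zeroing the low-attribution control, which mirrors the direction of the comparison reported in the abstract. It says nothing about effect sizes in real voxel data.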

