Fugu-MT Paper Translation (Abstract): DreamReader: An Interpretability Toolkit for Text-to-Image Models

Paper overview: DreamReader: An Interpretability Toolkit for Text-to-Image Models

arxiv url: http://arxiv.org/abs/2603.13299v1
Date: Mon, 02 Mar 2026 05:18:21 GMT
Status: Translation complete
Last updated in system: 2026-03-23 08:17:42.280623
Title: DreamReader: An Interpretability Toolkit for Text-to-Image Models
Authors: Nirmalendu Prakash, Narmeen Oozeer, Michael Lan, Luka Samkharadze, Phillip Howard, Roy Ka-Wei Lee, Dhruv Nathawani, Shivam Raval, Amirali Abdullah
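Abstract summary: We introduce DreamReader, a framework that formalizes diffusion interpretability as composable representation operators. DreamReader provides a model-agnostic abstraction layer enabling systematic analysis and intervention across diffusion architectures. We demonstrate DreamReader through controlled experiments that (i) perform activation stitching between two models and (ii) apply LoReFT to steer multiple activation units.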
Reference score (independently computed attention measure): 11.153644326972511
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Despite the rapid adoption of text-to-image (T2I) diffusion models, causal and representation-level analysis remains fragmented and largely limited to isolated probing techniques. To address this gap, we introduce DreamReader: a unified framework that formalizes diffusion interpretability as composable representation operators spanning activation extraction, causal patching, structured ablations, and activation steering across modules and timesteps. DreamReader provides a model-agnostic abstraction layer enabling systematic analysis and intervention across diffusion architectures. Beyond consolidating existing methods, DreamReader introduces three novel intervention primitives for diffusion models: (1) representation fine-tuning (LoReFT) for subspace-constrained internal adaptation; (2) classifier-guided gradient steering using MLP probes trained on activations; and (3) component-level cross-model mapping for systematic study of the transferability of representations across modalities. These mechanisms allow us to perform lightweight white-box interventions on T2I models by drawing inspiration from interpretability techniques for LLMs. We demonstrate DreamReader through controlled experiments that (i) perform activation stitching between two models, and (ii) apply LoReFT to steer multiple activation units, reliably injecting a target concept into the generated images. Experiments are specified declaratively and executed in controlled batched pipelines to enable reproducible large-scale analysis. Across multiple case studies, we show that techniques adapted from language model interpretability yield promising and controllable interventions in diffusion models. DreamReader is released as an open-source toolkit for advancing research on T2I interpretability.
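
The abstract names LoReFT as a "subspace-constrained" intervention but does not spell out the update. As a rough sketch, assuming the formulation used in the original ReFT work on language models, a hidden activation h is edited only inside a low-rank subspace: Phi(h) = h + R^T (W h + b - R h), where R has orthonormal rows. The function name `loreft`, the helper `matvec`, and the toy values below are illustrative assumptions, not DreamReader's API.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def loreft(h: Vector, R: Matrix, W: Matrix, b: Vector) -> Vector:
    """Apply Phi(h) = h + R^T (W h + b - R h).

    R (rank x dim) is assumed to have orthonormal rows; W (rank x dim)
    and b (rank) define the target value inside the subspace.
    """
    projected = matvec(R, h)                           # R h: current subspace coordinates
    target = [wh + bi for wh, bi in zip(matvec(W, h), b)]  # W h + b: desired coordinates
    delta = [t - p for t, p in zip(target, projected)]
    # Map the low-rank correction back to the full space: R^T delta.
    correction = [
        sum(R[k][j] * delta[k] for k in range(len(R)))
        for j in range(len(h))
    ]
    return [hi + ci for hi, ci in zip(h, correction)]


# Toy example: a rank-1 subspace aligned with the first coordinate.
h = [1.0, 2.0, 3.0]
R = [[1.0, 0.0, 0.0]]
W = [[0.0, 0.0, 0.0]]
b = [5.0]
edited = loreft(h, R, W, b)
```

In this toy case only the first coordinate changes, which is the sense in which the edit is confined to the subspace spanned by the rows of R; in practice R, W, and b would be learned, and the activation would come from a chosen module and timestep of the diffusion model.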
