Fugu-MT paper translation (abstract): EasyLens: A Training-Free Plug-and-Play Subtle-Lesion Representation Amplifier for Medical Vision-Language Models

Paper overview: EasyLens: A Training-Free Plug-and-Play Subtle-Lesion Representation Amplifier for Medical Vision-Language Models

arxiv url: http://arxiv.org/abs/2606.06379v1
Date: Thu, 04 Jun 2026 16:47:33 GMT
Status: Translation complete
Last updated in system: 2026-06-05 22:39:44.971264
Title: EasyLens: A Training-Free Plug-and-Play Subtle-Lesion Representation Amplifier for Medical Vision-Language Models
Title (reference translation): EasyLens: A Training-Free Plug-and-Play Subtle-Lesion Representation Amplifier for Medical Vision-Language Models
Authors: Qiwei Zeng, Hao Wang, Jinghao Lin, Shuchang Ye, Yuezhe Yang, Yige Peng, Haoyuan Che, Jinman Kim, Lei Bi
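Abstract summary: We propose EasyLens, a training-free plug-and-play subtle-lesion representation amplifier for medical vision-language models (VLMs). EasyLens first constructs EasyBank, a pathology-anatomy prototype space that provides lesion-related prototypes and anatomy-aware normal references. To avoid blindly amplifying normal tissues, EasyTag selects lesion-relevant patches through counterfactual prototype reasoning. Experiments on multiple medical image datasets and frozen medical VLM backbones show that EasyLens improves subtle-lesion detection and outperforms existing encoder-enhancement baselines.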
Reference score (independently computed attention score): 10.799852886898927
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Medical vision-language models (VLMs) have shown increasing potential for clinical image interpretation, including lesion detection and report generation. However, their practical utility remains limited by insufficient sensitivity to subtle lesions, whose visual evidence is often sparse, low-contrast, and embedded within complex anatomical context. As local visual tokens are aggregated, these weak lesion cues can become underrepresented in global image representations, making them difficult for medical VLMs to recognize. Existing efforts to improve lesion sensitivity mainly rely on medical-domain vision-encoder pre-training, clinical-term-guided alignment, or trainable pathological representation enhancement. Although effective, these approaches usually require additional training or model-specific adaptation and may overfit to particular disease morphologies, limiting their applicability to frozen medical VLMs. To address these limitations, we propose EasyLens, a training-free plug-and-play subtle-lesion representation amplifier for medical VLMs. EasyLens first constructs EasyBank, a pathology-anatomy prototype space that provides lesion-related prototypes and anatomy-aware normal references for comparing suspicious patches against both pathological and normal anatomical patterns. To avoid blindly amplifying normal tissues, EasyTag selects lesion-relevant patches through counterfactual prototype reasoning. To counteract the dilution of subtle lesion cues in global image representations, EasyAmplifier strengthens the selected lesion-relevant patch representations through morphology-guided residual enhancement, thereby increasing their contribution to the global image embedding. Experiments on multiple medical image datasets and frozen medical VLM backbones show that EasyLens improves subtle-lesion detection and outperforms existing encoder-enhancement baselines.
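Abstract (reference translation): Medical vision-language models (VLMs) show growing potential for clinical image interpretation, including lesion detection and report generation. Their practical utility, however, is limited by insufficient sensitivity to subtle lesions, whose visual evidence is often sparse, low-contrast, and embedded in complex anatomical context. As local visual tokens are aggregated, these weak lesion cues become underrepresented in the global image representation, which makes them difficult for medical VLMs to recognize. Existing efforts to improve lesion sensitivity rely mainly on medical-domain vision-encoder pre-training, clinical-term-guided alignment, or trainable pathological representation enhancement. Although effective, these approaches usually require additional training or model-specific adaptation and may overfit to particular disease morphologies, which limits their applicability to frozen medical VLMs. To address these limitations, we propose EasyLens, a training-free plug-and-play subtle-lesion representation amplifier for medical VLMs. EasyLens first constructs EasyBank, a pathology-anatomy prototype space that provides lesion-related prototypes and anatomy-aware normal references, so that suspicious patches can be compared against both pathological and normal anatomical patterns. To avoid blindly amplifying normal tissues, EasyTag selects lesion-relevant patches through counterfactual prototype reasoning. To counteract the dilution of subtle lesion cues in the global image representation, EasyAmplifier strengthens the selected lesion-relevant patch representations through morphology-guided residual enhancement, increasing their contribution to the global image embedding. Experiments on multiple medical image datasets and frozen medical VLM backbones show that EasyLens improves subtle-lesion detection and outperforms existing encoder-enhancement baselines.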
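
The abstract describes a three-stage idea: compare patches against lesion and normal prototypes, select the lesion-relevant ones, and strengthen them so they weigh more in the pooled image embedding. The toy sketch below illustrates that general idea only; it is not the authors' implementation. Everything in it is an assumption made for illustration: the function names (`lesion_margin`, `select_patches`, `amplify`, `mean_pool`), the margin score (best lesion similarity minus best normal similarity, standing in for counterfactual prototype reasoning), the residual rule (adding a scaled copy of the nearest lesion prototype, standing in for morphology-guided enhancement), the `threshold` and `alpha` parameters, and the 2-D vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def lesion_margin(patch, lesion_protos, normal_protos):
    """Hypothetical score: best lesion similarity minus best normal similarity."""
    best_lesion = max(cosine(patch, p) for p in lesion_protos)
    best_normal = max(cosine(patch, p) for p in normal_protos)
    return best_lesion - best_normal


def select_patches(patches, lesion_protos, normal_protos, threshold=0.0):
    """Indices of patches that look more lesion-like than normal-like."""
    return [
        i for i, patch in enumerate(patches)
        if lesion_margin(patch, lesion_protos, normal_protos) > threshold
    ]


def amplify(patches, selected, lesion_protos, alpha=2.0):
    """Assumed residual rule: add alpha * nearest lesion prototype to selected patches."""
    chosen = set(selected)
    out = []
    for i, patch in enumerate(patches):
        if i in chosen:
            nearest = max(lesion_protos, key=lambda p: cosine(patch, p))
            out.append([x + alpha * n for x, n in zip(patch, nearest)])
        else:
            out.append(list(patch))  # unselected patches are left unchanged
    return out


def mean_pool(patches):
    """Global embedding as the plain average of patch embeddings."""
    return [sum(column) / len(patches) for column in zip(*patches)]


# Toy data (made up): one lesion prototype, one normal prototype, four patches.
lesion_protos = [[1.0, 0.0]]
normal_protos = [[0.0, 1.0]]
patches = [[0.1, 1.0], [0.0, 0.9], [0.8, 0.3], [0.05, 1.1]]

selected = select_patches(patches, lesion_protos, normal_protos)
amplified = amplify(patches, selected, lesion_protos)

before = cosine(mean_pool(patches), lesion_protos[0])
after = cosine(mean_pool(amplified), lesion_protos[0])
print("selected patch indices:", selected)
print("lesion similarity of global embedding before/after:", before, after)
```

In this toy setting a single lesion-like patch is diluted by three normal-looking patches under mean pooling; amplifying only the selected patch pulls the pooled embedding toward the lesion prototype while the normal patches stay untouched. No model is trained or updated, which is the sense in which such a step could be applied to a frozen encoder.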

