Fugu-MT Paper Translation (Abstract): Robust Incomplete-Modality Alignment for Ophthalmic Disease Grading and Diagnosis via Labeled Optimal Transport

Paper abstract: Robust Incomplete-Modality Alignment for Ophthalmic Disease Grading and Diagnosis via Labeled Optimal Transport

arxiv url: http://arxiv.org/abs/2507.04999v1
Date: Mon, 07 Jul 2025 13:36:39 GMT
Status: Translation complete
Updated in system: 2025-07-08 15:46:35.439184
Title: Robust Incomplete-Modality Alignment for Ophthalmic Disease Grading and Diagnosis via Labeled Optimal Transport
Authors: Qinkai Yu, Jianyang Xie, Yitian Zhao, Cheng Chen, Lijun Zhang, Liming Chen, Jun Cheng, Lu Liu, Yalin Zheng, Yanda Meng
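Abstract summary: Multimodal ophthalmic imaging-based diagnosis combines color fundus images with optical coherence tomography (OCT). Existing commonly used pipelines, such as modality imputation and distillation methods, face notable limitations. This paper proposes a novel multimodal alignment and fusion framework that robustly handles missing modalities in ophthalmic diagnosis.
Reference score (attention metric computed independently by this site): 28.96009174108652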
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Multimodal ophthalmic imaging-based diagnosis integrates color fundus images with optical coherence tomography (OCT) to provide a comprehensive view of ocular pathologies. However, the uneven global distribution of healthcare resources often leaves real-world clinical scenarios with incomplete multimodal data, which significantly compromises diagnostic accuracy. Existing commonly used pipelines, such as modality imputation and distillation methods, face notable limitations: 1) Imputation methods struggle to accurately reconstruct key lesion features, since OCT lesions are localized, while fundus images vary in style. 2) Distillation methods rely heavily on fully paired multimodal training data. To address these challenges, we propose a novel multimodal alignment and fusion framework capable of robustly handling missing modalities in the task of ophthalmic diagnostics. By considering the distinctive feature characteristics of OCT and fundus images, we emphasize the alignment of semantic features within the same category and explicitly learn soft matching between modalities, allowing the missing modality to utilize existing modality information and achieving robust cross-modal feature alignment when a modality is missing. Specifically, we leverage Optimal Transport for multi-scale modality feature alignment: class-wise alignment through predicted class prototypes and feature-wise alignment via cross-modal shared feature transport. Furthermore, we propose an asymmetric fusion strategy that effectively exploits the distinct characteristics of OCT and fundus modalities. Extensive evaluations on three large ophthalmic multimodal datasets demonstrate our model's superior performance under various modality-incomplete scenarios, achieving state-of-the-art (SOTA) performance in both complete modality and inter-modality incompleteness conditions. Code is available at https://github.com/Qinkaiyu/RIMA
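The abstract's idea of learning a soft matching between modalities with Optimal Transport can be illustrated with a minimal sketch. This is not the authors' implementation (see their repository for that); it is a generic entropy-regularized (Sinkhorn) transport between two feature sets, written with NumPy only. The function names `sinkhorn_plan` and `cosine_cost`, the uniform marginals, the cosine cost, the regularization value, and the random stand-in features are all assumptions made for illustration.

```python
import numpy as np

def cosine_cost(x, y):
    """Pairwise cosine distance between rows of x and rows of y (assumed cost)."""
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    y = y / np.linalg.norm(y, axis=1, keepdims=True)
    return 1.0 - x @ y.T

def sinkhorn_plan(cost, reg=0.5, n_iters=200):
    """Entropy-regularized OT plan between two uniform discrete distributions."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)      # uniform mass on source features
    b = np.full(m, 1.0 / m)      # uniform mass on target features
    kernel = np.exp(-cost / reg)  # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (kernel @ v)      # rescale rows toward marginal a
        v = b / (kernel.T @ u)    # rescale columns toward marginal b
    return u[:, None] * kernel * v[None, :]

# Hypothetical stand-ins for encoder outputs: 6 fundus and 4 OCT feature vectors.
rng = np.random.default_rng(0)
fundus_feats = rng.normal(size=(6, 16))
oct_feats = rng.normal(size=(4, 16))

# Soft matching: plan[i, j] is the mass moved from fundus feature i to OCT feature j.
plan = sinkhorn_plan(cosine_cost(fundus_feats, oct_feats))

# Barycentric projection: each fundus feature receives a weighted mix of OCT features,
# one simple way an available modality could stand in for a missing one.
row_weights = plan / plan.sum(axis=1, keepdims=True)
oct_estimate = row_weights @ oct_feats
```

Here each row of `plan` spreads one fundus feature's mass over the OCT features, so the matching is soft rather than one-to-one. How the paper builds its cost, uses labels and class prototypes, and fuses the result is not specified in the abstract and is not reproduced here.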

Related papers

Unsupervised Multimodal 3D Medical Image Registration with Multilevel Correlation Balanced Optimization [22.633633605566214]
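We propose an unsupervised multimodal medical image registration method based on multilevel correlation balanced optimization. For preoperative medical images of different modalities, alignment and stacking of valid information are achieved through maximum fusion between deformation fields.
Paper reference translation (metadata) (2024-09-08T09:38:59Z)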
Robust Semi-supervised Multimodal Medical Image Segmentation via Cross Modality Collaboration [21.97457095780378]
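This paper proposes a novel semi-supervised multimodal segmentation framework that is robust to scarce labeled data and misaligned modalities. The framework employs a novel modality collaboration strategy that distills the modality-independent knowledge inherent to each modality. It also integrates contrastive consistent learning to regulate anatomical structures, facilitating anatomical prediction alignment on unlabeled data.
Paper reference translation (metadata) (2024-08-14T07:34:12Z)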
ETSCL: An Evidence Theory-Based Supervised Contrastive Learning Framework for Multi-modal Glaucoma Grading [7.188153974946432]
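Glaucoma is one of the leading causes of visual impairment. Extracting reliable features remains challenging because of the high similarity of medical images and the imbalanced multi-modal data distribution. We propose ETSCL, a novel framework consisting of a contrastive feature extraction stage and a decision-level fusion stage.
Paper reference translation (metadata) (2024-07-19T11:57:56Z)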
GTP-4o: Modality-prompted Heterogeneous Graph Learning for Omni-modal Biomedical Representation [68.63955715643974]
We propose an innovative modality-prompted heterogeneous graph for omnimodal learning (GTP-4o).
Paper reference translation (metadata) (2024-07-08T01:06:13Z)
Confidence-aware multi-modality learning for eye disease screening [58.861421804458395]
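We propose a novel multi-modality evidential fusion pipeline for eye disease screening. It measures confidence for each modality and elegantly integrates multi-modality information. Experimental results on both public and internal datasets show that our model excels in robustness.
Paper reference translation (metadata) (2024-05-28T13:27:30Z)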
Simultaneous Tri-Modal Medical Image Fusion and Super-Resolution using Conditional Diffusion Model [2.507050016527729]
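Tri-modal medical image fusion provides a more comprehensive view of the shape, location, and biological activity of a disease. The quality of medical images is often limited by constraints of imaging equipment and considerations of patient safety. Techniques that can improve image resolution and integrate multi-modal information are urgently needed.
Paper reference translation (metadata) (2024-04-26T12:13:41Z)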
Eye-gaze Guided Multi-modal Alignment for Medical Representation Learning [65.54680361074882]
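The Eye-gaze Guided Multi-modal Alignment (EGMA) framework uses eye-gaze data to improve the alignment of medical visual and textual features. We conduct the downstream tasks of image classification and image-text retrieval on four medical datasets.
Paper reference translation (metadata) (2024-03-19T03:59:14Z)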
Multi-task Paired Masking with Alignment Modeling for Medical Vision-Language Pre-training [55.56609500764344]
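This paper proposes a unified framework based on Multi-task Paired Masking with Alignment (MPMA). It also introduces a Memory-Augmented Cross-Modal Fusion (MA-CMF) module to fully integrate visual information and assist report reconstruction.
Paper reference translation (metadata) (2023-05-13T13:53:48Z)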
Robust Multimodal Brain Tumor Segmentation via Feature Disentanglement and Gated Fusion [71.87627318863612]
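We propose a novel multimodal segmentation framework that is robust to missing imaging modalities. Our network uses feature disentanglement to decompose the input modalities into modality-specific appearance codes. We validate the effectiveness of our method on the important multimodal brain tumor segmentation task using the BRATS challenge dataset.
Paper reference translation (metadata) (2020-02-22T14:32:04Z)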
Hi-Net: Hybrid-fusion Network for Multi-modal MR Image Synthesis [143.55901940771568]
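We propose a Hybrid-fusion Network (Hi-Net) for multi-modal MR image synthesis. Our Hi-Net uses modality-specific networks to learn representations for each modality. A multi-modal synthesis network is designed to densely combine the latent representation with hierarchical features from each modality.
Paper reference translation (metadata) (2020-02-11T08:26:42Z)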

The related papers list is generated automatically from the titles and abstracts of papers on this site.

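This page shows information for the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any results arising from use of this site (including all information and translations).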