Related papers: MMLN: Leveraging Domain Knowledge for Multimodal Diagnosis

MMLN: Leveraging Domain Knowledge for Multimodal Diagnosis

URL: http://arxiv.org/abs/2202.04266v1
Date: Wed, 9 Feb 2022 04:12:30 GMT
Title: MMLN: Leveraging Domain Knowledge for Multimodal Diagnosis
Authors: Haodi Zhang, Chenyu Xu, Peirou Liang, Ke Duan, Hao Ren, Weibin Cheng, Kaishun Wu
Abstract summary: We propose a knowledge-driven and data-driven framework for lung disease diagnosis. We formulate diagnosis rules according to authoritative clinical medicine guidelines and learn the weights of rules from text data. A multimodal fusion consisting of text and image data is designed to infer the marginal probability of lung disease.
Score: 10.133715767542386
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Recent studies show that deep learning models achieve good performance on medical imaging tasks such as diagnosis prediction. Among the models, multimodality has been an emerging trend, integrating different forms of data such as chest X-ray (CXR) images and electronic medical records (EMRs). However, most existing methods incorporate them in a model-free manner, which lacks theoretical support and ignores the intrinsic relations between different data sources. To address this problem, we propose a knowledge-driven and data-driven framework for lung disease diagnosis. By incorporating domain knowledge, machine learning models can reduce the dependence on labeled data and improve interpretability. We formulate diagnosis rules according to authoritative clinical medicine guidelines and learn the weights of rules from text data. Finally, a multimodal fusion consisting of text and image data is designed to infer the marginal probability of lung disease. We conduct experiments on a real-world dataset collected from a hospital. The results show that the proposed method outperforms the state-of-the-art multimodal baselines in terms of accuracy and interpretability.

Related papers

Multimodal Foundation Models for Early Disease Detection [0.0]
We present a foundation model that consolidates diverse patient data through an attention-based transformer framework.<n>The architecture is made for pretraining on many tasks, which makes it easy to adapt to new diseases and datasets with little extra work.<n>We provide an experimental strategy that uses benchmark datasets in oncology, cardiology, and neurology, with the goal of testing early detection tasks.
arXiv Detail & Related papers (2025-10-02T11:12:57Z)
ProbMed: A Probabilistic Framework for Medical Multimodal Binding [21.27709522688514]
We present Probabilistic Modality-Enhanced Diagnosis (ProbMED)<n>ProbMED aligns four distinct modalities--chest X-rays, electrocardiograms, echocardiograms--into a unified probabilistic embedding space.<n>Our model outperforms current medical vision-language pretraining models in cross-modality retrieval, zero-shot, and few-shot classification.
arXiv Detail & Related papers (2025-09-30T03:16:01Z)
RAD: Towards Trustworthy Retrieval-Augmented Multi-modal Clinical Diagnosis [56.373297358647655]
Retrieval-Augmented Diagnosis (RAD) is a novel framework that injects external knowledge into multimodal models directly on downstream tasks.<n>RAD operates through three key mechanisms: retrieval and refinement of disease-centered knowledge from multiple medical sources, a guideline-enhanced contrastive loss transformer, and a dual decoder.
arXiv Detail & Related papers (2025-09-24T10:36:14Z)
Continually Evolved Multimodal Foundation Models for Cancer Prognosis [50.43145292874533]
Cancer prognosis is a critical task that involves predicting patient outcomes and survival rates. Previous studies have integrated diverse data modalities, such as clinical notes, medical images, and genomic data, leveraging their complementary information. Existing approaches face two major limitations. First, they struggle to incorporate newly arrived data with varying distributions into training, such as patient records from different hospitals. Second, most multimodal integration methods rely on simplistic concatenation or task-specific pipelines, which fail to capture the complex interdependencies across modalities.
arXiv Detail & Related papers (2025-01-30T06:49:57Z)
Multimodal Medical Disease Classification with LLaMA II [0.14999444543328289]
We use the text-image pair dataset from OpenI consisting of 2D chest X-rays associated with clinical reports. Our focus is on fusion methods for merging text and vision information extracted from medical datasets. The newly introduced multimodal architecture can be applied to other multimodal datasets with little effort and can be easily adapted for further research.
arXiv Detail & Related papers (2024-12-02T09:18:07Z)
MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models [49.765466293296186]
Recent progress in Medical Large Vision-Language Models (Med-LVLMs) has opened up new possibilities for interactive diagnostic tools. Med-LVLMs often suffer from factual hallucination, which can lead to incorrect diagnoses. We propose a versatile multimodal RAG system, MMed-RAG, designed to enhance the factuality of Med-LVLMs.
arXiv Detail & Related papers (2024-10-16T23:03:27Z)
MMFusion: Multi-modality Diffusion Model for Lymph Node Metastasis Diagnosis in Esophageal Cancer [13.74067035373274]
We introduce a multi-modal heterogeneous graph-based conditional feature-guided diffusion model for lymph node metastasis diagnosis based on CT images. We propose a masked relational representation learning strategy, aiming to uncover the latent prognostic correlations and priorities of primary tumor and lymph node image representations.
arXiv Detail & Related papers (2024-05-15T17:52:00Z)
HyperFusion: A Hypernetwork Approach to Multimodal Integration of Tabular and Medical Imaging Data for Predictive Modeling [4.44283662576491]
We present a novel framework based on hypernetworks to fuse clinical imaging and tabular data by conditioning the image processing on the EHR's values and measurements. We show that our framework outperforms both single-modality models and state-of-the-art MRI-tabular data fusion methods.
arXiv Detail & Related papers (2024-03-20T05:50:04Z)
HEALNet: Multimodal Fusion for Heterogeneous Biomedical Data [10.774128925670183]
This paper presents the Hybrid Early-fusion Attention Learning Network (HEALNet), a flexible multimodal fusion architecture. We conduct multimodal survival analysis on Whole Slide Images and Multi-omic data on four cancer datasets from The Cancer Genome Atlas (TCGA) HEALNet achieves state-of-the-art performance compared to other end-to-end trained fusion models.
arXiv Detail & Related papers (2023-11-15T17:06:26Z)
A Transformer-based representation-learning model with unified processing of multimodal input for clinical diagnostics [63.106382317917344]
We report a Transformer-based representation-learning model as a clinical diagnostic aid that processes multimodal input in a unified manner. The unified model outperformed an image-only model and non-unified multimodal diagnosis models in the identification of pulmonary diseases.
arXiv Detail & Related papers (2023-06-01T16:23:47Z)
Medical Diagnosis with Large Scale Multimodal Transformers: Leveraging Diverse Data for More Accurate Diagnosis [0.15776842283814416]
We present a new technical approach of "learnable synergies" Our approach is easily scalable and naturally adapts to multimodal data inputs from clinical routine. It outperforms state-of-the-art models in clinically relevant diagnosis tasks.
arXiv Detail & Related papers (2022-12-18T20:43:37Z)
Multi-modal Graph Learning for Disease Prediction [35.4310911850558]
We propose an end-to-end Multimodal Graph Learning framework (MMGL) for disease prediction. Instead of defining the adjacency matrix manually as existing methods, the latent graph structure can be captured through a novel way of adaptive graph learning.
arXiv Detail & Related papers (2021-07-01T03:59:22Z)
Variational Knowledge Distillation for Disease Classification in Chest X-Rays [102.04931207504173]
We propose itvariational knowledge distillation (VKD), which is a new probabilistic inference framework for disease classification based on X-rays. We demonstrate the effectiveness of our method on three public benchmark datasets with paired X-ray images and EHRs.
arXiv Detail & Related papers (2021-03-19T14:13:56Z)
Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance. For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming. In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z)
Cross-Modal Information Maximization for Medical Imaging: CMIM [62.28852442561818]
In hospitals, data are siloed to specific information systems that make the same information available under different modalities. This offers unique opportunities to obtain and use at train-time those multiple views of the same information that might not always be available at test-time. We propose an innovative framework that makes the most of available data by learning good representations of a multi-modal input that are resilient to modality dropping at test-time.
arXiv Detail & Related papers (2020-10-20T20:05:35Z)
Semi-supervised Medical Image Classification with Relation-driven Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification. It exploits the unlabeled data by encouraging the prediction consistency of given input under perturbations. Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
arXiv Detail & Related papers (2020-05-15T06:57:54Z)

This list is automatically generated from the titles and abstracts of the papers in this site.