A Systematic Review of Intermediate Fusion in Multimodal Deep Learning for Biomedical Applications
- URL: http://arxiv.org/abs/2408.02686v1
- Date: Fri, 2 Aug 2024 11:48:04 GMT
- Title: A Systematic Review of Intermediate Fusion in Multimodal Deep Learning for Biomedical Applications
- Authors: Valerio Guarrasi, Fatih Aksu, Camillo Maria Caruso, Francesco Di Feola, Aurora Rofena, Filippo Ruffini, Paolo Soda
- Abstract summary: This systematic review aims to analyze and formalize current intermediate fusion methods in biomedical applications.
We introduce a structured notation to enhance the understanding and application of these methods beyond the biomedical domain.
Our findings are intended to support researchers, healthcare professionals, and the broader deep learning community in developing more sophisticated and insightful multimodal models.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Deep learning has revolutionized biomedical research by providing sophisticated methods to handle complex, high-dimensional data. Multimodal deep learning (MDL) further enhances this capability by integrating diverse data types such as imaging, textual data, and genetic information, leading to more robust and accurate predictive models. In MDL, unlike early and late fusion methods, intermediate fusion stands out for its ability to effectively combine modality-specific features during the learning process. This systematic review aims to comprehensively analyze and formalize current intermediate fusion methods in biomedical applications. We investigate the techniques employed, the challenges faced, and potential future directions for advancing intermediate fusion methods. Additionally, we introduce a structured notation to enhance the understanding and application of these methods beyond the biomedical domain. Our findings are intended to support researchers, healthcare professionals, and the broader deep learning community in developing more sophisticated and insightful multimodal models. Through this review, we aim to provide a foundational framework for future research and practical applications in the dynamic field of MDL.
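The distinction the abstract draws between early, intermediate, and late fusion can be sketched in a few lines. The following is a minimal illustrative example, not code from the paper: the linear-plus-ReLU "encoders", the two synthetic modalities, and all shapes are assumptions chosen only to show where in the pipeline each fusion strategy merges information.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, w):
    """Modality-specific encoder: a linear map plus ReLU, standing in for a deep branch."""
    return np.maximum(x @ w, 0.0)

# Two hypothetical modalities, e.g. an imaging feature vector and a genomic one.
img = rng.normal(size=(4, 16))   # batch of 4 samples, 16 imaging features
gen = rng.normal(size=(4, 8))    # batch of 4 samples, 8 genomic features

w_img = rng.normal(size=(16, 6))  # illustrative weights for the imaging branch
w_gen = rng.normal(size=(8, 6))   # illustrative weights for the genomic branch

# Early fusion: concatenate raw inputs, then learn one joint model on the result.
early = np.concatenate([img, gen], axis=1)            # shape (4, 24)

# Intermediate fusion: encode each modality separately, then merge the learned
# latent features inside the network (here, by concatenating the latent codes).
intermediate = np.concatenate([encode(img, w_img),
                               encode(gen, w_gen)], axis=1)  # shape (4, 12)

# Late fusion: run each branch to its own prediction, then combine the outputs
# (here, a simple average of per-branch scores).
late = 0.5 * (encode(img, w_img).mean(axis=1)
              + encode(gen, w_gen).mean(axis=1))      # shape (4,)
```

The point of intermediate fusion, as the review formalizes it, is that the merge happens after modality-specific representation learning but before the final prediction, so the joint layers can still adapt each branch's features end to end.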
Related papers
- Explainable AI Methods for Multi-Omics Analysis: A Survey [3.885941688264509]
Multi-omics refers to the integrative analysis of data derived from multiple 'omes', such as the genome, proteome, and transcriptome.
Deep learning methods are increasingly utilized to integrate multi-omics data, offering insights into molecular interactions and enhancing research into complex diseases.
These models, with their numerous interconnected layers and nonlinear relationships, often function as black boxes, lacking transparency in decision-making processes.
This review explores how xAI can improve the interpretability of deep learning models in multi-omics research, highlighting its potential to provide clinicians with clear insights.
arXiv Detail & Related papers (2024-10-15T05:01:17Z) - Automated Ensemble Multimodal Machine Learning for Healthcare [52.500923923797835]
We introduce a multimodal framework, AutoPrognosis-M, that enables the integration of structured clinical (tabular) data and medical imaging using automated machine learning.
AutoPrognosis-M incorporates 17 imaging models, including convolutional neural networks and vision transformers, and three distinct multimodal fusion strategies.
arXiv Detail & Related papers (2024-07-25T17:46:38Z) - A review of deep learning-based information fusion techniques for multimodal medical image classification [1.996181818659251]
Deep learning-based multimodal fusion techniques have emerged as powerful tools for improving medical image classification.
This review offers a thorough analysis of the developments in deep learning-based multimodal fusion for medical classification tasks.
arXiv Detail & Related papers (2024-04-23T13:31:18Z) - Machine Learning Techniques for MRI Data Processing at Expanding Scale [0.5221459608786241]
Imaging sites around the world generate growing amounts of medical scan data with ever more versatile and affordable technology.
These large datasets encode substantial information about human health and hold considerable potential for machine learning training and analysis.
This chapter examines ongoing large-scale studies and the challenge of distribution shifts between them.
arXiv Detail & Related papers (2024-04-22T16:38:41Z) - MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild [81.32127423981426]
Multimodal emotion recognition based on audio and video data is important for real-world applications.
Recent methods have focused on exploiting advances of self-supervised learning (SSL) for pre-training of strong multimodal encoders.
We propose a different perspective on the problem and investigate the advancement of multimodal DFER performance by adapting SSL-pre-trained disjoint unimodal encoders.
arXiv Detail & Related papers (2024-04-13T13:39:26Z) - OpenMEDLab: An Open-source Platform for Multi-modality Foundation Models in Medicine [55.29668193415034]
We present OpenMEDLab, an open-source platform for multi-modality foundation models.
It encapsulates solutions of pioneering attempts in prompting and fine-tuning large language and vision models for frontline clinical and bioinformatic applications.
It opens access to a group of pre-trained foundation models for various medical image modalities, clinical text, protein engineering, etc.
arXiv Detail & Related papers (2024-02-28T03:51:02Z) - HEALNet: Multimodal Fusion for Heterogeneous Biomedical Data [10.774128925670183]
This paper presents the Hybrid Early-fusion Attention Learning Network (HEALNet), a flexible multimodal fusion architecture.
We conduct multimodal survival analysis on Whole Slide Images and Multi-omic data on four cancer datasets from The Cancer Genome Atlas (TCGA)
HEALNet achieves state-of-the-art performance compared to other end-to-end trained fusion models.
arXiv Detail & Related papers (2023-11-15T17:06:26Z) - Multimodal Machine Learning in Image-Based and Clinical Biomedicine: Survey and Prospects [2.1070612998322438]
The paper explores the transformative potential of multimodal models for clinical predictions.
Despite advancements, challenges such as data biases and the scarcity of "big data" in many biomedical domains persist.
arXiv Detail & Related papers (2023-11-04T05:42:51Z) - Incomplete Multimodal Learning for Complex Brain Disorders Prediction [65.95783479249745]
We propose a new incomplete multimodal data integration approach that employs transformers and generative adversarial networks.
We apply our new method to predict cognitive degeneration and disease outcomes using multimodal imaging genetic data from the Alzheimer's Disease Neuroimaging Initiative cohort.
arXiv Detail & Related papers (2023-05-25T16:29:16Z) - DIME: Fine-grained Interpretations of Multimodal Models via Disentangled Local Explanations [119.1953397679783]
We focus on advancing the state-of-the-art in interpreting multimodal models.
Our proposed approach, DIME, enables accurate and fine-grained analysis of multimodal models.
arXiv Detail & Related papers (2022-03-03T20:52:47Z) - Machine Learning in Nano-Scale Biomedical Engineering [77.75587007080894]
We review the existing research regarding the use of machine learning in nano-scale biomedical engineering.
The main challenges that can be formulated as ML problems are classified into three main categories.
For each of the presented methodologies, special emphasis is given to its principles, applications, and limitations.
arXiv Detail & Related papers (2020-08-05T15:45:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.