Medformer: A Multi-Granularity Patching Transformer for Medical Time-Series Classification
- URL: http://arxiv.org/abs/2405.19363v1
- Date: Fri, 24 May 2024 16:51:10 GMT
- Title: Medformer: A Multi-Granularity Patching Transformer for Medical Time-Series Classification
- Authors: Yihe Wang, Nan Huang, Taida Li, Yujun Yan, Xiang Zhang
- Abstract summary: We introduce Medformer, a multi-granularity patching transformer tailored specifically for medical time series classification.
Our method incorporates three novel mechanisms to leverage the unique characteristics of medical time series.
- Score: 6.0233642055651115
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Medical time series data, such as Electroencephalography (EEG) and Electrocardiography (ECG), play a crucial role in healthcare applications such as diagnosing brain and heart diseases. Existing methods for medical time series classification primarily rely on handcrafted biomarker extraction and CNN-based models, with limited exploration of transformers tailored for medical time series. In this paper, we introduce Medformer, a multi-granularity patching transformer tailored specifically for medical time series classification. Our method incorporates three novel mechanisms to leverage the unique characteristics of medical time series: cross-channel patching to leverage inter-channel correlations, multi-granularity embedding for capturing features at different scales, and two-stage (intra- and inter-granularity) multi-granularity self-attention for learning features and correlations within and among granularities. We conduct extensive experiments on five public datasets under both subject-dependent and challenging subject-independent setups. Results demonstrate Medformer's superiority over 10 baselines, achieving the top averaged ranking across five datasets on all six evaluation metrics. These findings underscore the significant impact of our method on healthcare applications, such as diagnosing Myocardial Infarction, Alzheimer's, and Parkinson's disease. We release the source code at https://github.com/DL4mHealth/Medformer.
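To make the described mechanisms more concrete, here is a minimal, hedged sketch of cross-channel patching and multi-granularity embedding followed by intra-granularity self-attention; the patch lengths, dimensions, and class names are illustrative assumptions, not the released implementation.

```python
# Illustrative sketch only: cross-channel patching at several granularities,
# followed by self-attention within each granularity. Not the authors' code.
import torch
import torch.nn as nn


class MultiGranularityPatchEmbed(nn.Module):
    """Embed a multichannel series at several patch lengths (granularities)."""

    def __init__(self, n_channels: int, d_model: int, patch_lens=(2, 4, 8)):
        super().__init__()
        self.patch_lens = patch_lens
        # One linear projection per granularity; each patch spans ALL channels
        # (cross-channel patching), so its flattened size is n_channels * patch_len.
        self.projs = nn.ModuleList(
            [nn.Linear(n_channels * p, d_model) for p in patch_lens]
        )

    def forward(self, x: torch.Tensor):
        # x: (batch, n_channels, time)
        tokens = []
        for p, proj in zip(self.patch_lens, self.projs):
            b, c, t = x.shape
            t_trim = (t // p) * p                        # drop the ragged tail
            patches = x[..., :t_trim].reshape(b, c, t_trim // p, p)
            patches = patches.permute(0, 2, 1, 3).reshape(b, t_trim // p, c * p)
            tokens.append(proj(patches))                 # (batch, n_patches, d_model)
        return tokens                                     # one token sequence per granularity


# Intra-granularity self-attention: attend within each granularity separately.
embed = MultiGranularityPatchEmbed(n_channels=19, d_model=64)
attn = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
eeg = torch.randn(2, 19, 128)                            # e.g. a 19-channel EEG segment
per_granularity = [attn(tok) for tok in embed(eeg)]
```

The second, inter-granularity attention stage described in the abstract would then let tokens from different patch lengths exchange information; it is omitted here for brevity.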
Related papers
- FedMedICL: Towards Holistic Evaluation of Distribution Shifts in Federated Medical Imaging [68.6715007665896]
FedMedICL is a unified framework and benchmark to holistically evaluate federated medical imaging challenges.
We comprehensively evaluate several popular methods on six diverse medical imaging datasets.
We find that a simple batch balancing technique surpasses advanced methods in average performance across FedMedICL experiments.
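As a rough illustration of what such a batch balancing technique can look like, the sketch below draws each training batch with roughly equal counts per class; the function name and sampling details are assumptions, not FedMedICL's implementation.

```python
# Hypothetical class-balanced batch sampler, offered only as an illustration.
import random
from collections import defaultdict


def balanced_batches(samples, labels, batch_size, n_batches):
    by_class = defaultdict(list)
    for s, y in zip(samples, labels):
        by_class[y].append(s)
    classes = list(by_class)
    per_class = max(1, batch_size // len(classes))
    for _ in range(n_batches):
        batch = []
        for c in classes:
            # sample with replacement so rare classes still fill their share
            batch.extend(random.choices(by_class[c], k=per_class))
        random.shuffle(batch)
        yield batch
```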
arXiv Detail & Related papers (2024-07-11T19:12:23Z)
- Potential of Multimodal Large Language Models for Data Mining of Medical Images and Free-text Reports [51.45762396192655]
Multimodal large language models (MLLMs) have recently transformed many domains, significantly affecting the medical field. Notably, Gemini-Vision-series (Gemini) and GPT-4-series (GPT-4) models have epitomized a paradigm shift in Artificial General Intelligence for computer vision.
This study exhaustively evaluates the performance of Gemini, GPT-4, and four other popular large models across 14 medical imaging datasets.
arXiv Detail & Related papers (2024-07-08T09:08:42Z)
- Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training [99.2891802841936]
We introduce the Med-ST framework for fine-grained spatial and temporal modeling.
For spatial modeling, Med-ST employs the Mixture of View Expert (MoVE) architecture to integrate different visual features from both frontal and lateral views.
For temporal modeling, we propose a novel cross-modal bidirectional cycle consistency objective via forward mapping classification (FMC) and reverse mapping regression (RMR).
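The FMC and RMR objectives are not detailed in this summary, so the following sketch only illustrates the general shape of a bidirectional cross-modal cycle-consistency loss under assumed linear mappings.

```python
# Generic bidirectional cycle-consistency sketch; purely illustrative, not Med-ST's objective.
import torch
import torch.nn as nn

d = 64
img_to_txt = nn.Linear(d, d)   # forward mapping (image -> text space)
txt_to_img = nn.Linear(d, d)   # reverse mapping (text -> image space)

img_feat = torch.randn(8, d)
txt_feat = torch.randn(8, d)

# Mapping a feature to the other modality's space and back should reconstruct it.
cycle_loss = (
    nn.functional.mse_loss(txt_to_img(img_to_txt(img_feat)), img_feat)
    + nn.functional.mse_loss(img_to_txt(txt_to_img(txt_feat)), txt_feat)
)
```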
arXiv Detail & Related papers (2024-05-30T03:15:09Z)
- Global Contrastive Training for Multimodal Electronic Health Records with Language Supervision [1.6245786035158123]
This paper introduces a novel multimodal contrastive learning framework, specifically focusing on medical time series and clinical notes.
The framework integrates temporal cross-attention transformers with a dynamic embedding and tokenization scheme for learning multimodal feature representations.
Experiments with a real-world EHR dataset demonstrated that our framework outperformed state-of-the-art approaches on the exemplar task of predicting the occurrence of nine postoperative complications.
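As a hedged illustration of contrastive alignment between medical time-series embeddings and clinical-note embeddings, here is a standard InfoNCE-style loss; the paper's cross-attention encoders and dynamic tokenization scheme are not reproduced.

```python
# Illustrative InfoNCE-style alignment of paired time-series and note embeddings.
import torch
import torch.nn.functional as F


def contrastive_alignment_loss(ts_emb, note_emb, temperature=0.07):
    # ts_emb, note_emb: (batch, d) embeddings of paired time series and notes
    ts = F.normalize(ts_emb, dim=-1)
    txt = F.normalize(note_emb, dim=-1)
    logits = ts @ txt.t() / temperature            # pairwise similarities
    targets = torch.arange(len(ts))                # matching pairs lie on the diagonal
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2


loss = contrastive_alignment_loss(torch.randn(16, 128), torch.randn(16, 128))
```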
arXiv Detail & Related papers (2024-04-10T04:19:59Z)
- HyperFusion: A Hypernetwork Approach to Multimodal Integration of Tabular and Medical Imaging Data for Predictive Modeling [4.44283662576491]
We present a novel framework based on hypernetworks to fuse clinical imaging and tabular data by conditioning the image processing on the EHR's values and measurements.
We show that our framework outperforms both single-modality models and state-of-the-art MRI-tabular data fusion methods.
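A minimal sketch of the hypernetwork idea follows: a small network maps tabular EHR values to the weights of a layer applied to imaging features, so the imaging branch is conditioned on the clinical record. Dimensions and layer choices here are assumptions, not the paper's architecture.

```python
# Hypothetical hypernetwork-conditioned head: tabular data generates per-sample weights.
import torch
import torch.nn as nn


class HyperConditionedHead(nn.Module):
    def __init__(self, tab_dim: int, img_dim: int, out_dim: int):
        super().__init__()
        self.img_dim, self.out_dim = img_dim, out_dim
        # Hypernetwork: tabular features -> weights and bias of a linear head.
        self.hyper = nn.Linear(tab_dim, img_dim * out_dim + out_dim)

    def forward(self, img_feat: torch.Tensor, tab: torch.Tensor):
        params = self.hyper(tab)                                  # (batch, img_dim*out_dim + out_dim)
        w = params[:, : self.img_dim * self.out_dim].view(-1, self.out_dim, self.img_dim)
        b = params[:, self.img_dim * self.out_dim:]
        # Per-sample linear layer whose weights depend on the tabular record.
        return torch.bmm(w, img_feat.unsqueeze(-1)).squeeze(-1) + b


head = HyperConditionedHead(tab_dim=20, img_dim=256, out_dim=2)
logits = head(torch.randn(4, 256), torch.randn(4, 20))
```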
arXiv Detail & Related papers (2024-03-20T05:50:04Z)
- LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z)
- Heterogeneous Graph Learning for Multi-modal Medical Data Analysis [6.3082663934391014]
We propose an effective graph-based framework called HetMed for fusing the multi-modal medical data.
HetMed captures the complex relationship between patients in a systematic way, which leads to more accurate clinical decisions.
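One common way to capture relationships between patients in a graph, shown here as an illustrative sketch only (not HetMed's actual construction), is to connect each patient to its nearest neighbors in feature space.

```python
# Illustrative k-nearest-neighbor patient graph built from multimodal feature vectors.
import torch


def knn_patient_graph(features: torch.Tensor, k: int = 5):
    # features: (n_patients, d); returns an (n_patients, n_patients) adjacency matrix
    sim = torch.nn.functional.cosine_similarity(
        features.unsqueeze(1), features.unsqueeze(0), dim=-1
    )
    sim.fill_diagonal_(0.0)                       # no self-loops
    topk = sim.topk(k, dim=-1).indices
    adj = torch.zeros_like(sim)
    adj.scatter_(1, topk, 1.0)                    # connect each patient to its k nearest
    return torch.maximum(adj, adj.t())            # symmetrize


adj = knn_patient_graph(torch.randn(10, 32), k=3)
```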
arXiv Detail & Related papers (2022-11-28T09:14:36Z)
- Improving Medical Predictions by Irregular Multimodal Electronic Health Records Modeling [19.346610191591143]
Health conditions among patients in intensive care units (ICUs) are monitored via electronic health records (EHRs).
Dealing with irregularity in each single modality and integrating it into multimodal representations to improve medical predictions is a challenging problem.
Our method first addresses irregularity in each single modality by dynamically incorporating hand-crafted imputation embeddings into learned embeddings via a gating mechanism.
We observe relative improvements of 6.5%, 3.6%, and 4.3% in F1 for time series, clinical notes, and multimodal fusion, respectively.
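The gating mechanism can be pictured roughly as below: a learned gate blends a hand-crafted imputation embedding with a learned embedding. Dimensions and the gate's exact form are illustrative assumptions.

```python
# Hypothetical gate that mixes imputation embeddings with learned embeddings.
import torch
import torch.nn as nn


class ImputationGate(nn.Module):
    def __init__(self, d_model: int):
        super().__init__()
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, learned: torch.Tensor, imputed: torch.Tensor):
        # learned, imputed: (batch, time, d_model)
        g = torch.sigmoid(self.gate(torch.cat([learned, imputed], dim=-1)))
        return g * learned + (1 - g) * imputed    # gate decides how much imputation to keep


fuse = ImputationGate(d_model=32)
out = fuse(torch.randn(4, 48, 32), torch.randn(4, 48, 32))
```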
arXiv Detail & Related papers (2022-10-18T20:29:26Z)
- Factored Attention and Embedding for Unstructured-view Topic-related Ultrasound Report Generation [70.7778938191405]
We propose a novel factored attention and embedding model (termed FAE-Gen) for the unstructured-view topic-related ultrasound report generation.
The proposed FAE-Gen mainly consists of two modules, i.e., view-guided factored attention and topic-oriented factored embedding, which capture the homogeneous and heterogeneous morphological characteristics across different views.
arXiv Detail & Related papers (2022-03-12T15:24:03Z)
- Cross-Modal Information Maximization for Medical Imaging: CMIM [62.28852442561818]
In hospitals, data are siloed to specific information systems that make the same information available under different modalities.
This offers unique opportunities to obtain and use at train-time those multiple views of the same information that might not always be available at test-time.
We propose an innovative framework that makes the most of available data by learning good representations of a multi-modal input that are resilient to modality dropping at test-time.
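A simple way to picture training for resilience to modality dropping, offered as an assumption-laden sketch rather than CMIM's actual method, is to randomly zero out entire modalities during training before fusing them.

```python
# Illustrative modality dropout: randomly drop whole modalities so the fused
# representation cannot rely on any single one being present at test-time.
import torch


def drop_modalities(modalities, p_drop=0.3, training=True):
    # modalities: list of (batch, d) tensors, one per modality
    kept = []
    for m in modalities:
        if training and torch.rand(1).item() < p_drop:
            kept.append(torch.zeros_like(m))      # simulate the modality being unavailable
        else:
            kept.append(m)
    return torch.cat(kept, dim=-1)                # simple concatenation fusion


fused = drop_modalities([torch.randn(8, 64), torch.randn(8, 64)], p_drop=0.3)
```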
arXiv Detail & Related papers (2020-10-20T20:05:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.