ECG-Soup: Harnessing Multi-Layer Synergy for ECG Foundation Models
- URL: http://arxiv.org/abs/2509.00102v3
- Date: Fri, 24 Oct 2025 13:01:10 GMT
- Title: ECG-Soup: Harnessing Multi-Layer Synergy for ECG Foundation Models
- Authors: Phu X. Nguyen, Huy Phan, Hieu Pham, Christos Chatzichristos, Bert Vandenberk, Maarten De Vos
- Abstract summary: Transformer-based foundation models for Electrocardiograms (ECGs) have recently achieved impressive performance in many downstream applications. ECGs are used in the diagnosis and treatment of heart disease.
- Score: 17.400439953606913
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transformer-based foundation models for Electrocardiograms (ECGs) have recently achieved impressive performance in many downstream applications.
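The abstract does not spell out the souping procedure, but the title suggests fusing representations from multiple transformer layers rather than using only the final one. A minimal, hypothetical sketch of such a layer soup (all names and shapes here are illustrative assumptions, not the paper's actual method):

```python
import numpy as np

def layer_soup(hidden_states, weights=None):
    """Fuse per-layer hidden states into a single representation.

    hidden_states: list of (seq_len, dim) arrays, one per encoder layer.
    weights: optional per-layer weights; defaults to a uniform average.
    """
    stacked = np.stack(hidden_states)  # (n_layers, seq_len, dim)
    if weights is None:
        weights = np.full(len(hidden_states), 1.0 / len(hidden_states))
    # Weighted sum over the layer axis -> (seq_len, dim)
    return np.tensordot(np.asarray(weights), stacked, axes=1)

# Toy example: three layers of a 4-token, 8-dim encoder output.
fused = layer_soup([np.full((4, 8), float(i)) for i in range(3)])
```

A downstream classifier head could then be trained on `fused` instead of the last layer's output alone.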
Related papers
- ECG-MoE: Mixture-of-Expert Electrocardiogram Foundation Model [22.753790262338185]
ECG-MoE is a hybrid architecture that integrates multi-model temporal features with a cardiac period-aware expert module. It achieves state-of-the-art performance with 40% faster inference than multi-task baselines.
arXiv Detail & Related papers (2026-03-04T20:36:05Z) - EnECG: Efficient Ensemble Learning for Electrocardiogram Multi-task Foundation Model [46.84040404474695]
EnECG is an ensemble-based framework that integrates multiple specialized foundation models, each excelling in different aspects of ECG interpretation. We show that EnECG can help reduce computational and memory costs while maintaining the strong representational power of foundation models. This framework not only enhances feature extraction and predictive performance but also ensures practical efficiency for real-world clinical applications.
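The summary does not describe EnECG's aggregation rule; one common ensemble strategy it could resemble is averaging per-class probabilities from several specialized models. A sketch under that assumption (model outputs and class counts are made up):

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ensemble_probs(logits_per_model):
    """Average the class probabilities predicted by several models."""
    return np.stack([softmax(l) for l in logits_per_model]).mean(axis=0)

# Two hypothetical models scoring one ECG over three diagnostic classes.
p = ensemble_probs([np.array([2.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])])
```

Averaging probabilities (rather than logits) keeps each model's contribution on a comparable scale.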
arXiv Detail & Related papers (2025-11-28T07:22:33Z) - Simulator and Experience Enhanced Diffusion Model for Comprehensive ECG Generation [52.19347532840774]
We propose SE-Diff, a novel physiological simulator and experience enhanced diffusion model for ECG generation. SE-Diff integrates a lightweight ordinary differential equation (ODE)-based ECG simulator into the diffusion process via a beat decoder. Extensive experiments on real-world ECG datasets demonstrate that SE-Diff improves both signal fidelity and text-ECG semantic alignment.
arXiv Detail & Related papers (2025-11-13T02:57:10Z) - UniECG: Understanding and Generating ECG in One Unified Model [26.641666246045133]
We propose UniECG, the first unified model for ECG capable of concurrently performing evidence-based ECG interpretation and text-conditioned ECG generation tasks. UniECG can autonomously choose to interpret or generate an ECG based on user input, significantly extending the capability boundaries of current ECG models.
arXiv Detail & Related papers (2025-09-23T03:15:53Z) - BenchECG and xECG: a benchmark and baseline for ECG foundation models [0.0]
Electrocardiograms (ECGs) are inexpensive, widely used, and well-suited to deep learning. We introduce BenchECG, a standardised benchmark comprising a comprehensive suite of publicly available ECG datasets and versatile tasks. We also propose xECG, an xLSTM-based recurrent model trained with SimDINOv2 self-supervised learning, which achieves the best BenchECG score compared to publicly available state-of-the-art models.
arXiv Detail & Related papers (2025-09-12T11:27:17Z) - Heartcare Suite: Multi-dimensional Understanding of ECG with Raw Multi-lead Signal Modeling [50.58126509704037]
Heartcare Suite is a framework for fine-grained electrocardiogram (ECG) understanding. Heartcare-220K is a high-quality, structured, and comprehensive multimodal ECG dataset. Heartcare-Bench is a benchmark to guide the optimization of Medical Multimodal Large Language Models (Med-MLLMs) in ECG scenarios.
arXiv Detail & Related papers (2025-06-06T07:56:41Z) - anyECG-chat: A Generalist ECG-MLLM for Flexible ECG Input and Multi-Task Understanding [20.290531515033518]
Multimodal large language models (MLLMs) have sparked interest in their application to electrocardiogram (ECG) analysis. Existing ECG-focused MLLMs primarily focus on report generation tasks, often limited to single 12-lead, short-duration (10s) ECG inputs. We propose the anyECG-chat model, which supports dynamic-length ECG inputs and multiple ECG inputs.
arXiv Detail & Related papers (2025-06-01T10:17:13Z) - TIDE : Temporal-Aware Sparse Autoencoders for Interpretable Diffusion Transformers in Image Generation [41.909091496502704]
Diffusion Transformers (DiTs) are a powerful yet underexplored class of generative models. We propose TIDE, a framework of Temporal-aware sparse autoencoders for Interpretable Diffusion transformErs.
arXiv Detail & Related papers (2025-03-10T08:35:51Z) - GEM: Empowering MLLM for Grounded ECG Understanding with Time Series and Images [44.50428701650495]
We introduce GEM, the first MLLM unifying ECG time series, 12-lead ECG images and text for grounded and clinician-aligned ECG interpretation. GEM enables feature-grounded analysis, evidence-driven reasoning, and a clinician-like diagnostic process through three core innovations. We propose the Grounded ECG task, a clinically motivated benchmark designed to assess the MLLM's capability in grounded ECG understanding.
arXiv Detail & Related papers (2025-03-08T05:48:53Z) - Universal Pooling Method of Multi-layer Features from Pretrained Models for Speaker Verification [7.005068872406135]
Recent advancements in automatic speaker verification (ASV) studies have been achieved by leveraging large-scale pretrained networks.
We present a novel approach for exploiting the multilayered nature of pretrained models for ASV.
We show how the proposed interlayer processing aids in maximizing the advantage of utilizing pretrained models.
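The summary leaves the interlayer processing unspecified; a common baseline in this space is a softmax-normalized weighted sum of per-layer features followed by statistics pooling over time. A hedged sketch with made-up shapes (not the paper's proposed method):

```python
import numpy as np

def weighted_layer_sum(layer_feats, layer_logits):
    """Combine per-layer features with softmax-normalized layer weights."""
    w = np.exp(layer_logits - layer_logits.max())
    w = w / w.sum()
    # Sum over the layer axis -> (n_frames, dim)
    return np.tensordot(w, np.stack(layer_feats), axes=1)

def stats_pool(frames):
    """Statistics pooling: concatenate per-dimension mean and std over time."""
    return np.concatenate([frames.mean(axis=0), frames.std(axis=0)])

# 4 layers, 50 frames, 16 dims; zero logits give uniform layer weights.
layers = [np.ones((50, 16)) * i for i in range(4)]
emb = stats_pool(weighted_layer_sum(layers, np.zeros(4)))
```

In a trained system the layer logits would be learned jointly with the verification head.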
arXiv Detail & Related papers (2024-09-12T05:55:32Z) - Spatio-Temporal Encoding of Brain Dynamics with Surface Masked Autoencoders [10.097983222759884]
Surface Masked AutoEncoder (sMAE) and video surface Masked AutoEncoder (vsMAE)
These models are trained to reconstruct cortical feature maps from masked versions of the input by learning strong latent representations of cortical development and structure-function relationships.
Results show that (v)sMAE pre-trained models improve phenotyping prediction performance on multiple tasks by $\ge 26\%$, and offer faster convergence relative to models trained from scratch.
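Masked autoencoders train by reconstructing only the hidden portions of the input. A minimal sketch of the masked reconstruction loss (shapes and names are generic illustrations, not the paper's implementation):

```python
import numpy as np

def masked_mse(x, x_hat, mask):
    """Mean squared error computed only on masked (hidden) positions."""
    return ((x - x_hat) ** 2)[mask].mean()

x = np.arange(6.0)                      # stand-in for a cortical feature map
mask = np.array([True, True, False, False, False, False])
loss = masked_mse(x, np.zeros_like(x), mask)  # only positions 0 and 1 count
```

Restricting the loss to masked positions forces the encoder to infer hidden structure from visible context rather than copy its input.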
arXiv Detail & Related papers (2023-08-10T10:01:56Z) - ViT2EEG: Leveraging Hybrid Pretrained Vision Transformers for EEG Data [0.0]
We demonstrate the application of a hybrid Vision Transformer (ViT) model, pretrained on ImageNet, on an electroencephalogram (EEG) regression task.
This model shows a notable increase in performance compared to other models, including an identical architecture ViT trained without the ImageNet weights.
arXiv Detail & Related papers (2023-08-01T11:10:33Z) - Learning Joint Latent Space EBM Prior Model for Multi-layer Generator [44.4434704520236]
We study the fundamental problem of learning multi-layer generator models.
We propose an energy-based model (EBM) on the joint latent space over all layers of latent variables.
Our experiments demonstrate that the learned model can be expressive in generating high-quality images.
arXiv Detail & Related papers (2023-06-10T00:27:37Z) - PulseNet: Deep Learning ECG-signal classification using random augmentation policy and continuous wavelet transform for canines [46.09869227806991]
Evaluating canine electrocardiograms (ECGs) requires skilled veterinarians.
Current availability of veterinary cardiologists for ECG interpretation and diagnostic support is limited.
We implement a deep convolutional neural network (CNN) approach for classifying canine electrocardiogram sequences as either normal or abnormal.
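The summary gives only the high-level pipeline; a toy sketch of a 1-D convolutional normal/abnormal classifier (random weights, purely illustrative, not PulseNet's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, kernels):
    """Valid 1-D convolution of a single-channel signal with several kernels."""
    k = kernels.shape[1]
    windows = np.lib.stride_tricks.sliding_window_view(x, k)  # (T-k+1, k)
    return windows @ kernels.T                                # (T-k+1, n_kernels)

def classify(signal, kernels, w, b):
    """ReLU + global average pooling + logistic output head."""
    feats = np.maximum(conv1d(signal, kernels), 0).mean(axis=0)
    logit = feats @ w + b
    return 1.0 / (1.0 + np.exp(-logit))  # probability of "abnormal"

signal = rng.standard_normal(500)       # stand-in for a canine ECG trace
kernels = rng.standard_normal((8, 16))  # 8 filters of width 16
p = classify(signal, kernels, rng.standard_normal(8), 0.0)
```

A real system would train the filters and head on labeled recordings, with the wavelet transform and augmentation applied upstream.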
arXiv Detail & Related papers (2023-05-17T09:06:39Z) - STMT: A Spatial-Temporal Mesh Transformer for MoCap-Based Action Recognition [50.064502884594376]
We study the problem of human action recognition using motion capture (MoCap) sequences.
We propose a novel Spatial-Temporal Mesh Transformer (STMT) to directly model the mesh sequences.
The proposed method achieves state-of-the-art performance compared to skeleton-based and point-cloud-based models.
arXiv Detail & Related papers (2023-03-31T16:19:27Z) - A new perspective on probabilistic image modeling [92.89846887298852]
We present a new probabilistic approach for image modeling capable of density estimation, sampling and tractable inference.
DCGMMs can be trained end-to-end by SGD from random initial conditions, much like CNNs.
We show that DCGMMs compare favorably to several recent PC and SPN models in terms of inference, classification and sampling.
arXiv Detail & Related papers (2022-03-21T14:53:57Z) - ViViT: A Video Vision Transformer [75.74690759089529]
We present pure-transformer based models for video classification.
Our model extracts spatio-temporal tokens from the input video, which are then encoded by a series of transformer layers.
We show how we can effectively regularise the model during training and leverage pretrained image models to be able to train on comparatively small datasets.
arXiv Detail & Related papers (2021-03-29T15:27:17Z) - ECG-DelNet: Delineation of Ambulatory Electrocardiograms with Mixed Quality Labeling Using Neural Networks [69.25956542388653]
Deep learning (DL) algorithms are gaining traction in academic and industrial settings.
We demonstrate DL can be successfully applied to low interpretative tasks by embedding ECG detection and delineation onto a segmentation framework.
The model was trained using PhysioNet's QT database, comprised of 105 ambulatory ECG recordings.
arXiv Detail & Related papers (2020-05-11T16:29:12Z) - Normalizing Flows with Multi-Scale Autoregressive Priors [131.895570212956]
We introduce channel-wise dependencies in the latent space of normalizing flows through multi-scale autoregressive priors (mAR).
Our mAR prior for models with split coupling flow layers (mAR-SCF) can better capture dependencies in complex multimodal data.
We show that mAR-SCF allows for improved image generation quality, with gains in FID and Inception scores compared to state-of-the-art flow-based models.
arXiv Detail & Related papers (2020-04-08T09:07:11Z) - A Multi-Scale Tensor Network Architecture for Classification and
Regression [0.0]
We present an algorithm for supervised learning using tensor networks.
We employ a step of preprocessing the data by coarse-graining through a sequence of wavelet transformations.
We show how fine-graining through the network may be used to initialize models with access to finer-scale features.
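Coarse-graining by wavelet transform can be illustrated with a single Haar step, which splits a signal into averages (coarse features) and differences (details). This is a generic example, not the paper's exact preprocessing:

```python
import numpy as np

def haar_coarse_grain(x):
    """One Haar wavelet step: pairwise averages (coarse) and differences (detail)."""
    x = np.asarray(x, dtype=float)
    coarse = (x[0::2] + x[1::2]) / np.sqrt(2)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2)
    return coarse, detail

# A locally constant signal keeps all energy in the coarse channel.
c, d = haar_coarse_grain([1.0, 1.0, 2.0, 2.0])
```

Repeating the step on the coarse channel yields the multi-scale hierarchy the tensor network operates on.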
arXiv Detail & Related papers (2020-01-22T21:26:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.