Digitizing Paper ECGs at Scale: An Open-Source Algorithm for Clinical Research
- URL: http://arxiv.org/abs/2510.19590v1
- Date: Wed, 22 Oct 2025 13:41:21 GMT
- Title: Digitizing Paper ECGs at Scale: An Open-Source Algorithm for Clinical Research
- Authors: Elias Stenhede, Agnar Martin Bjørnstad, Arian Ranjbar,
- Abstract summary: We introduce a fully automated, modular framework that converts scanned or photographed ECGs into digital signals. The framework is validated on 37,191 ECG images with 1,596 collected at Akershus University Hospital. We hope the software will contribute to unlocking retrospective ECG archives and democratize access to AI-driven diagnostics.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Millions of clinical ECGs exist only as paper scans, making them unusable for modern automated diagnostics. We introduce a fully automated, modular framework that converts scanned or photographed ECGs into digital signals, suitable for both clinical and research applications. The framework is validated on 37,191 ECG images with 1,596 collected at Akershus University Hospital, where the algorithm obtains a mean signal-to-noise ratio of 19.65 dB on scanned papers with common artifacts. It is further evaluated on the Emory Paper Digitization ECG Dataset, comprising 35,595 images, including images with perspective distortion, wrinkles, and stains. The model improves on the state-of-the-art in all subcategories. The full software is released as open-source, promoting reproducibility and further development. We hope the software will contribute to unlocking retrospective ECG archives and democratize access to AI-driven diagnostics.
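The abstract reports quality as a signal-to-noise ratio in decibels but does not say how it is computed. A minimal sketch, assuming the common definition (power of the reference signal divided by the power of the difference between reference and digitized signal); the function name `snr_db` is hypothetical and is not taken from the released software:

```python
import math


def snr_db(reference, estimate):
    """Signal-to-noise ratio in dB between a reference signal and an estimate.

    Assumed definition (not confirmed by the abstract):
        SNR = 10 * log10( sum(ref^2) / sum((ref - est)^2) )
    """
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")

    signal_power = sum(r * r for r in reference)
    noise_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))

    if signal_power == 0:
        raise ValueError("reference signal has zero power")
    if noise_power == 0:
        # Perfect reconstruction: the ratio is unbounded.
        return math.inf

    return 10.0 * math.log10(signal_power / noise_power)


# Toy example: a constant reference reconstructed with a 10% amplitude error.
reference = [1.0, 1.0, 1.0, 1.0]
estimate = [0.9, 0.9, 0.9, 0.9]
print(round(snr_db(reference, estimate), 2))
```

Under this definition, a value of 19.65 dB would correspond to the reference signal carrying about 92 times the power of the reconstruction error.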
Related papers
- A Deep Learning Pipeline Using Synthetic Data to Improve Interpretation of Paper ECG Images [8.559073054541754]
Cardiovascular diseases (CVDs) are the leading global cause of death, and early detection is essential to improve patient outcomes. We propose a deep learning framework designed specifically to classify paper-like ECG images into five main diagnostic categories. Our method was the winning entry to the 2024 British Heart Foundation Open Data Science Challenge.
arXiv Detail & Related papers (2025-07-29T16:16:17Z)
- Comparing Deep Neural Network for Multi-Label ECG Diagnosis From Scanned ECG [1.2499537119440243]
We evaluate the performance of multiple deep neural network architectures, including AlexNet, VGG, ResNet, and Vision Transformer, on scanned ECG datasets. Our comparative analysis examines model accuracy, robustness to image artifacts, and generalizability across different ECG conditions. The findings highlight the strengths and limitations of each architecture, providing insights into the feasibility of image-based ECG diagnosis.
arXiv Detail & Related papers (2025-02-19T02:56:27Z)
- ECG-Image-Database: A Dataset of ECG Images with Real-World Imaging and Scanning Artifacts; A Foundation for Computerized ECG Image Digitization and Analysis [4.263536786122581]
ECG-Image-Database is a large and diverse collection of electrocardiogram (ECG) images generated from ECG time-series data.
We used ECG-Image-Kit, an open-source Python toolkit, to generate realistic images of 12-lead ECG printouts from raw ECG time-series.
The resulting dataset includes 35,595 software-labeled ECG images with a wide range of imaging artifacts and distortions.
arXiv Detail & Related papers (2024-09-25T04:30:19Z)
- ECG-Image-Kit: A Synthetic Image Generation Toolbox to Facilitate Deep Learning-Based Electrocardiogram Digitization [3.4579920352329787]
We introduce ECG-Image-Kit, an open-source toolbox for generating synthetic multi-lead ECG images with realistic artifacts from time-series data.
As a case study, we used ECG-Image-Kit to create a dataset of 21,801 ECG images from the PhysioNet QT database.
We trained a combination of a traditional computer vision and deep neural network model on this dataset to convert synthetic images into time-series data.
arXiv Detail & Related papers (2023-07-04T22:42:55Z)
- Auto Lead Extraction and Digitization of ECG Paper Records using cGAN [0.23624125155742054]
ECG signals are generally stored in paper form, which makes it difficult to store and analyze the data.
We propose a deep learning-based model for individually extracting all 12 leads from 12-lead ECG images.
We also propose a method to convert the paper ECG format into a storable digital format.
arXiv Detail & Related papers (2022-11-12T18:36:29Z)
- Optimising Chest X-Rays for Image Analysis by Identifying and Removing Confounding Factors [49.005337470305584]
During the COVID-19 pandemic, the sheer volume of imaging performed in an emergency setting for COVID-19 diagnosis has resulted in a wide variability of clinical CXR acquisitions.
The variable quality of clinically-acquired CXRs within publicly available datasets could have a profound effect on algorithm performance.
We propose a simple and effective step-wise approach to pre-processing a COVID-19 chest X-ray dataset to remove undesired biases.
arXiv Detail & Related papers (2022-08-22T13:57:04Z)
- Self-Supervised Multi-Modal Alignment for Whole Body Medical Imaging [70.52819168140113]
We use a dataset of over 20,000 subjects from the UK Biobank with both whole body Dixon technique magnetic resonance (MR) scans and also dual-energy x-ray absorptiometry (DXA) scans.
We introduce a multi-modal image-matching contrastive framework, that is able to learn to match different-modality scans of the same subject with high accuracy.
Without any adaption, we show that the correspondences learnt during this contrastive training step can be used to perform automatic cross-modal scan registration.
arXiv Detail & Related papers (2021-07-14T12:35:05Z)
- ECG-DelNet: Delineation of Ambulatory Electrocardiograms with Mixed Quality Labeling Using Neural Networks [69.25956542388653]
Deep learning (DL) algorithms are gaining weight in academic and industrial settings.
We demonstrate DL can be successfully applied to low interpretative tasks by embedding ECG detection and delineation onto a segmentation framework.
The model was trained using PhysioNet's QT database, comprised of 105 ambulatory ECG recordings.
arXiv Detail & Related papers (2020-05-11T16:29:12Z)
- Y-Net for Chest X-Ray Preprocessing: Simultaneous Classification of Geometry and Segmentation of Annotations [70.0118756144807]
This work introduces a general pre-processing step for chest x-ray input into machine learning algorithms.
A modified Y-Net architecture based on the VGG11 encoder is used to simultaneously learn geometric orientation and segmentation of radiographs.
Results were evaluated by expert clinicians, with acceptable geometry in 95.8% and annotation mask in 96.2%, compared to 27.0% and 34.9% respectively in control images.
arXiv Detail & Related papers (2020-05-08T02:16:17Z)
- Review of Artificial Intelligence Techniques in Imaging Data Acquisition, Segmentation and Diagnosis for COVID-19 [71.41929762209328]
The pandemic of coronavirus disease 2019 (COVID-19) is spreading all over the world.
Medical imaging such as X-ray and computed tomography (CT) plays an essential role in the global fight against COVID-19.
The recently emerging artificial intelligence (AI) technologies further strengthen the power of the imaging tools and help medical specialists.
arXiv Detail & Related papers (2020-04-06T15:21:34Z)
- VerSe: A Vertebrae Labelling and Segmentation Benchmark for Multi-detector CT Images [121.31355003451152]
Large Scale Vertebrae Challenge (VerSe) was organised in conjunction with the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) in 2019 and 2020.
We present the results of this evaluation and further investigate the performance variation at vertebra level, scan level, and at different fields of view.
arXiv Detail & Related papers (2020-01-24T21:09:18Z)