Towards Resource-Efficient Streaming of Large-Scale Medical Image Datasets for Deep Learning
- URL: http://arxiv.org/abs/2307.00438v2
- Date: Sat, 01 Feb 2025 06:16:55 GMT
- Title: Towards Resource-Efficient Streaming of Large-Scale Medical Image Datasets for Deep Learning
- Authors: Pranav Kulkarni, Adway Kanhere, Eliot Siegel, Paul H. Yi, Vishwa S. Parekh
- Abstract summary: The Medical Image Streaming Toolkit (MIST) enables streaming of medical images at different resolutions and formats from a single high-resolution copy. MIST reduces storage and bandwidth requirements for hosting and downloading datasets without impacting image quality.
- Score: 3.8129962526689702
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large-scale medical imaging datasets have accelerated deep learning (DL) for medical image analysis. However, the large scale of these datasets poses a challenge for researchers, resulting in increased storage and bandwidth requirements for hosting and accessing them. Since different researchers have different use cases and require different resolutions or formats for DL, it is neither feasible to anticipate every researcher's needs nor practical to store data in multiple resolutions and formats. To that end, we propose the Medical Image Streaming Toolkit (MIST), a format-agnostic database that enables streaming of medical images at different resolutions and formats from a single high-resolution copy. We evaluated MIST across eight popular, large-scale medical imaging datasets spanning different body parts, modalities, and formats. Our results showed that our framework reduced the storage and bandwidth requirements for hosting and downloading datasets without impacting image quality. We demonstrate that MIST addresses the challenges posed by large-scale medical imaging datasets by building a data-efficient and format-agnostic database to meet the diverse needs of researchers and reduce barriers to DL research in medical imaging.
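The abstract's core idea, serving arbitrary resolutions and formats on demand from one stored high-resolution copy, can be illustrated with a minimal sketch. The snippet below assumes SimpleITK for I/O and resampling; the function name and structure are illustrative assumptions, not the MIST implementation or API.

```python
# Minimal sketch of resolution-on-demand streaming in the spirit of MIST:
# keep a single high-resolution copy on disk and serve a resampled copy in
# whatever format a client requests. Library choice (SimpleITK) and the
# function below are assumptions for illustration, not the MIST codebase.
import SimpleITK as sitk

def stream_at_resolution(src_path: str, scale: float, out_path: str) -> None:
    """Resample the stored high-resolution image by `scale` and write it in
    the format implied by `out_path`'s extension (e.g. .nii.gz, .dcm, .mha)."""
    image = sitk.ReadImage(src_path)
    new_size = [max(1, int(round(s * scale))) for s in image.GetSize()]
    # Adjust voxel spacing so the physical extent of the image is preserved.
    new_spacing = [sp * (orig / new) for sp, orig, new
                   in zip(image.GetSpacing(), image.GetSize(), new_size)]
    resampler = sitk.ResampleImageFilter()
    resampler.SetSize(new_size)
    resampler.SetOutputSpacing(new_spacing)
    resampler.SetOutputOrigin(image.GetOrigin())
    resampler.SetOutputDirection(image.GetDirection())
    resampler.SetInterpolator(sitk.sitkLinear)
    sitk.WriteImage(resampler.Execute(image), out_path)

# Example: serve a half-resolution NIfTI copy of a high-resolution CT volume.
# stream_at_resolution("ct_highres.nii.gz", 0.5, "ct_halfres.nii.gz")
```

Because only the single high-resolution copy is stored, any lower resolution or alternative format is derived at request time, which is what lets MIST cut storage and bandwidth without discarding image quality at the source.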
Related papers
- MedIL: Implicit Latent Spaces for Generating Heterogeneous Medical Images at Arbitrary Resolutions [2.2427832125073732]
MedIL is a first-of-its-kind autoencoder built for encoding medical images with heterogeneous sizes and resolutions.
We show how MedIL compresses and preserves clinically-relevant features over large multi-site, multi-resolution datasets.
arXiv Detail & Related papers (2025-04-12T19:52:56Z) - RadIR: A Scalable Framework for Multi-Grained Medical Image Retrieval via Radiology Report Mining [64.66825253356869]
We propose a novel methodology that leverages dense radiology reports to define image-wise similarity ordering at multiple granularities. We construct two comprehensive medical imaging retrieval datasets: MIMIC-IR for Chest X-rays and CTRATE-IR for CT scans. We develop two retrieval systems, RadIR-CXR and model-ChestCT, which demonstrate superior performance in traditional image-image and image-report retrieval tasks.
arXiv Detail & Related papers (2025-03-06T17:43:03Z) - Latent Drifting in Diffusion Models for Counterfactual Medical Image Synthesis [55.959002385347645]
Latent Drifting enables diffusion models to be conditioned for medical images fitted for the complex task of counterfactual image generation.
We evaluate our method on three public longitudinal benchmark datasets of brain MRI and chest X-rays for counterfactual image generation.
arXiv Detail & Related papers (2024-12-30T01:59:34Z) - UniMed-CLIP: Towards a Unified Image-Text Pretraining Paradigm for Diverse Medical Imaging Modalities [68.12889379702824]
Vision-Language Models (VLMs) trained via contrastive learning have achieved notable success in natural image tasks.
UniMed is a large-scale, open-source multi-modal medical dataset comprising over 5.3 million image-text pairs.
We trained UniMed-CLIP, a unified VLM for six modalities, achieving notable gains in zero-shot evaluations.
arXiv Detail & Related papers (2024-12-13T18:59:40Z) - Large-Scale Data-Free Knowledge Distillation for ImageNet via Multi-Resolution Data Generation [53.95204595640208]
Data-Free Knowledge Distillation (DFKD) is an advanced technique that enables knowledge transfer from a teacher model to a student model without relying on original training data.
Previous approaches have generated synthetic images at high resolutions without leveraging information from real images.
MUSE generates images at lower resolutions while using Class Activation Maps (CAMs) to ensure that the generated images retain critical, class-specific features.
arXiv Detail & Related papers (2024-11-26T02:23:31Z) - BIMCV-R: A Landmark Dataset for 3D CT Text-Image Retrieval [44.92177279141073]
We present a dataset of 8,069 3D CT volumes, encompassing over 2 million slices, paired with their respective radiological reports.
We then craft a retrieval strategy, MedFinder, harnessing the potential of large language models.
It marks our preliminary step towards developing a system capable of facilitating text-to-image, image-to-text, and keyword-based retrieval tasks.
arXiv Detail & Related papers (2024-03-24T03:10:07Z) - Building Universal Foundation Models for Medical Image Analysis with Spatially Adaptive Networks [5.661631789478932]
We propose a universal foundation model for medical image analysis that processes images with heterogeneous spatial properties using a unified structure.
We pre-train a spatial adaptive visual tokenizer (SPAD-VT) and then a spatial adaptive Vision Transformer (SPAD-ViT) via masked image modeling (MIM) on 55 public medical image datasets.
The experimental results on downstream medical image classification and segmentation tasks demonstrate the superior performance and label efficiency of our model.
arXiv Detail & Related papers (2023-12-12T08:33:45Z) - Recurrent Multi-scale Transformer for High-Resolution Salient Object Detection [68.65338791283298]
Salient Object Detection (SOD) aims to identify and segment the most conspicuous objects in an image or video.
Traditional SOD methods are largely limited to low-resolution images, making them difficult to adapt to the development of High-Resolution SOD.
In this work, we first propose a new HRS10K dataset, which contains 10,500 high-quality annotated images at 2K-8K resolution.
arXiv Detail & Related papers (2023-08-07T17:49:04Z) - LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z) - Advanced Medical Image Representation for Efficient Processing and Transfer in Multisite Clouds [0.6091702876917281]
Human brain databases at medical institutes can accumulate tens of terabytes of data per year.
We propose a novel medical image format representation based on multiple data structures that improve the information maintained in the medical images.
arXiv Detail & Related papers (2023-04-29T18:09:17Z) - ResFormer: Scaling ViTs with Multi-Resolution Training [100.01406895070693]
We introduce ResFormer, a framework for improved performance on a wide spectrum of, mostly unseen, testing resolutions.
In particular, ResFormer operates on replicated images of different resolutions and enforces a scale consistency loss to engage interactive information across different scales.
We further demonstrate that ResFormer is flexible and can be easily extended to semantic segmentation, object detection and video action recognition.
arXiv Detail & Related papers (2022-12-01T18:57:20Z) - Engineering AI Tools for Systematic and Scalable Quality Assessment in Magnetic Resonance Imaging [0.0]
Building a big MRI data repository has multiple challenges related to privacy, data size, DICOM format, logistics, and non-standardized images.
Not only is building the data repository difficult, but using data pooled from it is also challenging.
This position paper describes, from several angles, the challenges of constructing a large MRI data repository and of using data downloaded from such repositories.
arXiv Detail & Related papers (2021-12-02T22:47:16Z) - Hard-Attention for Scalable Image Classification [16.8359205877213]
We show that multi-scale hard attention can be an effective solution to scaling image classification to high-resolution inputs.
We propose a novel architecture, TNet, which traverses an image pyramid in a top-down fashion.
We show that our model attends only to a fraction of the highest resolution content, while using only image-level labels without bounding boxes.
arXiv Detail & Related papers (2021-02-20T00:21:28Z) - Generative Adversarial U-Net for Domain-free Medical Image Augmentation [49.72048151146307]
The shortage of annotated medical images is one of the biggest challenges in the field of medical image computing.
In this paper, we develop a novel generative method named generative adversarial U-Net.
Our newly designed model is domain-free and generalizable to various medical images.
arXiv Detail & Related papers (2021-01-12T23:02:26Z) - Cross-Modal Information Maximization for Medical Imaging: CMIM [62.28852442561818]
In hospitals, data are siloed to specific information systems that make the same information available under different modalities.
This offers unique opportunities to obtain and use, at train time, multiple views of the same information that might not always be available at test time.
We propose an innovative framework that makes the most of available data by learning good representations of a multi-modal input that are resilient to modality dropping at test-time.
arXiv Detail & Related papers (2020-10-20T20:05:35Z) - Memory-efficient GAN-based Domain Translation of High Resolution 3D Medical Images [0.15092198588928965]
Generative adversarial networks (GANs) are rarely applied on 3D medical images of large size.
The present work proposes a multi-scale patch-based GAN approach for establishing unpaired domain translation.
The evaluation of the domain translation scenarios is performed on brain MRIs of size 155x240x240 and thorax CTs of size up to 512x512x512.
arXiv Detail & Related papers (2020-10-06T08:43:27Z) - Universal Model for Multi-Domain Medical Image Retrieval [88.67940265012638]
Medical Image Retrieval (MIR) helps doctors quickly find similar patients' data.
MIR is becoming increasingly helpful due to the wide use of digital imaging modalities.
However, the popularity of various digital imaging modalities in hospitals also poses several challenges to MIR.
arXiv Detail & Related papers (2020-07-14T23:22:04Z) - Anysize GAN: A solution to the image-warping problem [5.866114531330298]
We propose a new type of Generative Adversarial Network (GAN) to resolve a common issue with deep learning: the image-warping problem.
We develop a novel architecture that can be applied to existing latent vector based GAN structures that allows them to generate on-the-fly images of any size.
We demonstrate our method can successfully generate realistic images at different sizes without issue, preserving and understanding spatial relationships, while maintaining feature relationships.
arXiv Detail & Related papers (2020-03-06T14:18:42Z) - Learning When and Where to Zoom with Deep Reinforcement Learning [101.79271767464947]
We propose a reinforcement learning approach to identify when and where to use/acquire high resolution data conditioned on paired, cheap, low resolution images.
We conduct experiments on CIFAR10, CIFAR100, ImageNet and fMoW datasets where we use significantly less high resolution data while maintaining similar accuracy to models which use full high resolution images.
arXiv Detail & Related papers (2020-03-01T07:16:46Z)