Related papers: Abnormality-Driven Representation Learning for Radiology Imaging

Abnormality-Driven Representation Learning for Radiology Imaging

URL: http://arxiv.org/abs/2411.16803v1
Date: Mon, 25 Nov 2024 13:53:26 GMT
Title: Abnormality-Driven Representation Learning for Radiology Imaging
Authors: Marta Ligero, Tim Lenz, Georg Wölflein, Omar S. M. El Nahhas, Daniel Truhn, Jakob Nikolas Kather
Abstract summary: We introduce lesion-enhanced contrastive learning (LeCL), a novel approach to obtain visual representations driven by abnormalities in 2D axial slices across different locations of the CT scans. We evaluate our approach across three clinical tasks: tumor lesion location, lung disease detection, and patient staging, benchmarking against four state-of-the-art foundation models.
Score: 0.8321462983924758
License: http://creativecommons.org/licenses/by/4.0/
Abstract: To date, the most common approach for radiology deep learning pipelines is the use of end-to-end 3D networks based on models pre-trained on other tasks, followed by fine-tuning on the task at hand. In contrast, adjacent medical fields such as pathology, which focus on 2D images, have effectively adopted task-agnostic foundational models based on self-supervised learning (SSL), combined with weakly-supervised deep learning (DL). However, the field of radiology still lacks task-agnostic representation models due to the computational and data demands of 3D imaging and the anatomical complexity inherent to radiology scans. To address this gap, we propose CLEAR, a framework for radiology images that uses extracted embeddings from 2D slices along with attention-based aggregation for efficiently predicting clinical endpoints. As part of this framework, we introduce lesion-enhanced contrastive learning (LeCL), a novel approach to obtain visual representations driven by abnormalities in 2D axial slices across different locations of the CT scans. Specifically, we trained single-domain contrastive learning approaches using three different architectures: Vision Transformers, Vision State Space Models and Gated Convolutional Neural Networks. We evaluate our approach across three clinical tasks: tumor lesion location, lung disease detection, and patient staging, benchmarking against four state-of-the-art foundation models, including BiomedCLIP. Our findings demonstrate that CLEAR, using representations learned through LeCL, outperforms existing foundation models while being substantially more compute- and data-efficient.
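As a rough illustration of the "embeddings from 2D slices along with attention-based aggregation" idea mentioned in the abstract, the sketch below pools per-slice embedding vectors into a single scan-level vector using softmax attention weights. This is a generic attention-pooling sketch, not the CLEAR implementation: the function name `attention_pool`, the tanh scoring form, and the parameters `V` and `w` are assumptions made for illustration, and in practice the parameters would be learned rather than supplied by hand.

```python
import math


def attention_pool(slice_embeddings, V, w):
    """Pool per-slice embeddings into one scan-level embedding.

    slice_embeddings: list of N vectors of length D, one per 2D slice.
    V: list of H rows of length D (hypothetical projection matrix).
    w: list of length H (hypothetical scoring vector).
    Returns (pooled_embedding, attention_weights).
    """
    # One scalar score per slice: w . tanh(V h)
    scores = []
    for h in slice_embeddings:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wk * zk for wk, zk in zip(w, hidden)))

    # Numerically stable softmax over slices
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Attention-weighted sum of the slice embeddings
    dim = len(slice_embeddings[0])
    pooled = [
        sum(a * h[d] for a, h in zip(weights, slice_embeddings))
        for d in range(dim)
    ]
    return pooled, weights


if __name__ == "__main__":
    # Three toy "slices" with 2-dimensional embeddings (made-up values).
    slices = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    V = [[0.5, -0.5], [0.1, 0.2]]
    w = [1.0, -1.0]
    pooled, weights = attention_pool(slices, V, w)
    print("weights:", weights)
    print("pooled:", pooled)
```

The pooled vector could then be passed to a small classifier for a scan-level label; slices with larger attention weights contribute more to the prediction.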

Related papers

Text-to-CT Generation via 3D Latent Diffusion Model with Contrastive Vision-Language Pretraining [0.8714814768600079]
We introduce a novel architecture for Text-to-CT generation that combines a latent diffusion model with a 3D contrastive vision-language pretraining scheme. Our method offers a scalable and controllable solution for synthesizing clinically meaningful CT volumes from text.
arXiv Detail & Related papers (2025-05-31T16:41:55Z)
Screener: Self-supervised Pathology Segmentation Model for 3D Medical Images [7.466495250192545]
We frame pathology segmentation as an unsupervised visual anomaly segmentation problem. We enhance the existing density-based UVAS framework with two key innovations. Our model, Screener, outperforms existing UVAS methods on four large-scale test datasets.
arXiv Detail & Related papers (2025-02-12T11:37:35Z)
3D Foundation AI Model for Generalizable Disease Detection in Head Computed Tomography [8.896955286474991]
We introduce FM-CT: a Foundation Model for Head CT for generalizable disease detection, trained using self-supervised learning. Our approach pre-trains a deep learning model on a large, diverse dataset of 361,663 non-contrast 3D head CT scans without the need for manual annotations. Our results demonstrate that the self-supervised foundation model significantly improves performance on downstream diagnostic tasks.
arXiv Detail & Related papers (2025-02-04T23:42:18Z)
CABLD: Contrast-Agnostic Brain Landmark Detection with Consistency-Based Regularization [2.423045468361048]
We introduce CABLD, a novel self-supervised deep learning framework for 3D brain landmark detection in unlabeled scans. We demonstrate the proposed method with the intricate task of MRI-based 3D brain landmark detection. Our framework provides a robust and accurate solution for anatomical landmark detection, reducing the need for extensively annotated datasets.
arXiv Detail & Related papers (2024-11-26T19:56:29Z)
Intraoperative Registration by Cross-Modal Inverse Neural Rendering [61.687068931599846]
We present a novel approach for 3D/2D intraoperative registration during neurosurgery via cross-modal inverse neural rendering. Our approach separates implicit neural representation into two components, handling anatomical structure preoperatively and appearance intraoperatively. We tested our method on retrospective patients' data from clinical cases, showing that our method outperforms state-of-the-art while meeting current clinical standards for registration.
arXiv Detail & Related papers (2024-09-18T13:40:59Z)
Artifact Reduction in 3D and 4D Cone-beam Computed Tomography Images with Deep Learning -- A Review [0.0]
Deep learning techniques have been used to improve image quality in cone-beam computed tomography (CBCT). We provide an overview of deep learning techniques that have successfully been shown to reduce artifacts in 3D, as well as in time-resolved (4D) CBCT. One of the key findings of this work is an observed trend towards the use of generative models including GANs and score-based or diffusion models.
arXiv Detail & Related papers (2024-03-27T13:46:01Z)
Architecture Analysis and Benchmarking of 3D U-shaped Deep Learning Models for Thoracic Anatomical Segmentation [0.8897689150430447]
We conduct the first systematic benchmark study for variants of 3D U-shaped models. Our study examines the impact of different attention mechanisms, the number of resolution stages, and network configurations on segmentation accuracy and computational complexity.
arXiv Detail & Related papers (2024-02-05T17:43:02Z)
Disruptive Autoencoders: Leveraging Low-level features for 3D Medical Image Pre-training [51.16994853817024]
This work focuses on designing an effective pre-training framework for 3D radiology images. We introduce Disruptive Autoencoders, a pre-training framework that attempts to reconstruct the original image from disruptions created by a combination of local masking and low-level perturbations. The proposed pre-training framework is tested across multiple downstream tasks and achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-07-31T17:59:42Z)
LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets. We have collected approximately 1.3 million medical images from 55 publicly available datasets. LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z)
Deep learning network to correct axial and coronal eye motion in 3D OCT retinal imaging [65.47834983591957]
We propose deep learning based neural networks to correct axial and coronal motion artifacts in OCT based on a single scan. The experimental result shows that the proposed method can effectively correct motion artifacts and achieve smaller error than other methods.
arXiv Detail & Related papers (2023-05-27T03:55:19Z)
Successive Subspace Learning for Cardiac Disease Classification with Two-phase Deformation Fields from Cine MRI [36.044984400761535]
This work proposes a lightweight successive subspace learning framework for CVD classification. It is based on an interpretable feedforward design, in conjunction with a cardiac atlas. Compared with 3D CNN-based approaches, our framework achieves superior classification performance with 140$\times$ fewer parameters.
arXiv Detail & Related papers (2023-01-21T15:00:59Z)
Slice-level Detection of Intracranial Hemorrhage on CT Using Deep Descriptors of Adjacent Slices [0.31317409221921133]
We propose a new strategy to train slice-level classifiers on CT scans based on the descriptors of the adjacent slices along the axis. We obtain a single model in the top 4% best-performing solutions of the RSNA Intracranial Hemorrhage dataset challenge. The proposed method is general and can be applied to other 3D medical diagnosis tasks such as MRI imaging.
arXiv Detail & Related papers (2022-08-05T23:20:37Z)
SD-LayerNet: Semi-supervised retinal layer segmentation in OCT using disentangled representation with anatomical priors [4.2663199451998475]
We introduce a semi-supervised paradigm into the retinal layer segmentation task. In particular, a novel fully differentiable approach is used for converting surface position regression into a pixel-wise structured segmentation. In parallel, we propose a set of anatomical priors to improve network training when a limited amount of labeled data is available.
arXiv Detail & Related papers (2022-07-01T14:30:59Z)
TSGCNet: Discriminative Geometric Feature Learning with Two-Stream GraphConvolutional Network for 3D Dental Model Segmentation [141.2690520327948]
We propose a two-stream graph convolutional network (TSGCNet) to learn multi-view information from different geometric attributes. We evaluate our proposed TSGCNet on a real-patient dataset of dental models acquired by 3D intraoral scanners.
arXiv Detail & Related papers (2020-12-26T08:02:56Z)
Fader Networks for domain adaptation on fMRI: ABIDE-II study [68.5481471934606]
We use 3D convolutional autoencoders to build the domain irrelevant latent space image representation and demonstrate this method to outperform existing approaches on ABIDE data.
arXiv Detail & Related papers (2020-10-14T16:50:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.