Learning General-Purpose Biomedical Volume Representations using Randomized Synthesis
- URL: http://arxiv.org/abs/2411.02372v1
- Date: Mon, 04 Nov 2024 18:40:46 GMT
- Title: Learning General-Purpose Biomedical Volume Representations using Randomized Synthesis
- Authors: Neel Dey, Benjamin Billot, Hallee E. Wong, Clinton J. Wang, Mengwei Ren, P. Ellen Grant, Adrian V. Dalca, Polina Golland,
- Abstract summary: Current biomedical foundation models struggle to generalize as public 3D datasets are small.
We propose a data engine that synthesizes highly variable training samples that enable generalization to new biomedical contexts.
To then train a single 3D network for any voxel-level task, we develop a contrastive learning method that pretrains the network to be stable against nuisance imaging variation simulated by the data engine.
- Score: 9.355513913682794
- License:
- Abstract: Current volumetric biomedical foundation models struggle to generalize as public 3D datasets are small and do not cover the broad diversity of medical procedures, conditions, anatomical regions, and imaging protocols. We address this by creating a representation learning method that instead anticipates strong domain shifts at training time itself. We first propose a data engine that synthesizes highly variable training samples that enable generalization to new biomedical contexts. To then train a single 3D network for any voxel-level task, we develop a contrastive learning method that pretrains the network to be stable against nuisance imaging variation simulated by the data engine, a key inductive bias for generalization. This network's features can be used as robust representations of input images for downstream tasks and its weights provide a strong, dataset-agnostic initialization for finetuning on new datasets. As a result, we set new standards across both multimodality registration and few-shot segmentation, a first for any 3D biomedical vision model, all without (pre-)training on any existing dataset of real images.
Related papers
- Self Pre-training with Topology- and Spatiality-aware Masked Autoencoders for 3D Medical Image Segmentation [16.753957522664713]
Masked Autoencoders (MAEs) have been shown to be effective in pre-training Vision Transformers (ViTs) for natural and medical image analysis problems.
Existing MAE pre-training methods, which were specifically developed with the ViT architecture, lack the ability to capture geometric shape and spatial information.
We propose a novel extension of known MAEs for self pre-training (i.e., models pre-trained on the same target dataset) for 3D medical image segmentation.
arXiv Detail & Related papers (2024-06-15T06:15:17Z) - Generalizing Medical Image Representations via Quaternion Wavelet Networks [9.836302410524842]
We introduce a novel, generalizable, data- and task-agnostic framework able to extract salient features from medical images.
The proposed quaternion wavelet network (QUAVE) can be easily integrated with any pre-existing medical image analysis or synthesis task.
arXiv Detail & Related papers (2023-10-16T09:34:06Z) - Disruptive Autoencoders: Leveraging Low-level features for 3D Medical
Image Pre-training [51.16994853817024]
This work focuses on designing an effective pre-training framework for 3D radiology images.
We introduce Disruptive Autoencoders, a pre-training framework that attempts to reconstruct the original image from disruptions created by a combination of local masking and low-level perturbations.
The proposed pre-training framework is tested across multiple downstream tasks and achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-07-31T17:59:42Z) - LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical
Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z) - Learning to Exploit Temporal Structure for Biomedical Vision-Language
Processing [53.89917396428747]
Self-supervised learning in vision-language processing exploits semantic alignment between imaging and text modalities.
We explicitly account for prior images and reports when available during both training and fine-tuning.
Our approach, named BioViL-T, uses a CNN-Transformer hybrid multi-image encoder trained jointly with a text model.
arXiv Detail & Related papers (2023-01-11T16:35:33Z) - 3D fluorescence microscopy data synthesis for segmentation and
benchmarking [0.9922927990501083]
Conditional generative adversarial networks can be utilized to generate realistic image data for 3D fluorescence microscopy.
An additional positional conditioning of the cellular structures enables the reconstruction of position-dependent intensity characteristics.
A patch-wise working principle and a subsequent full-size reassemble strategy is used to generate image data of arbitrary size and different organisms.
arXiv Detail & Related papers (2021-07-21T16:08:56Z) - Towards an Automatic Analysis of CHO-K1 Suspension Growth in
Microfluidic Single-cell Cultivation [63.94623495501023]
We propose a novel Machine Learning architecture, which allows us to infuse a neural deep network with human-powered abstraction on the level of data.
Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
arXiv Detail & Related papers (2020-10-20T08:36:51Z) - Domain Generalization for Medical Imaging Classification with
Linear-Dependency Regularization [59.5104563755095]
We introduce a simple but effective approach to improve the generalization capability of deep neural networks in the field of medical imaging classification.
Motivated by the observation that the domain variability of the medical images is to some extent compact, we propose to learn a representative feature space through variational encoding.
arXiv Detail & Related papers (2020-09-27T12:30:30Z) - Fed-Sim: Federated Simulation for Medical Imaging [131.56325440976207]
We introduce a physics-driven generative approach that consists of two learnable neural modules.
We show that our data synthesis framework improves the downstream segmentation performance on several datasets.
arXiv Detail & Related papers (2020-09-01T19:17:46Z) - Realistic Adversarial Data Augmentation for MR Image Segmentation [17.951034264146138]
We propose an adversarial data augmentation method for training neural networks for medical image segmentation.
Our model generates plausible and realistic signal corruptions, which models the intensity inhomogeneities caused by a common type of artefacts in MR imaging: bias field.
We show that such an approach can improve the ability generalization and robustness of models as well as provide significant improvements in low-data scenarios.
arXiv Detail & Related papers (2020-06-23T20:43:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.