Compound Figure Separation of Biomedical Images: Mining Large Datasets
for Self-supervised Learning
- URL: http://arxiv.org/abs/2208.14357v1
- Date: Tue, 30 Aug 2022 16:02:34 GMT
- Title: Compound Figure Separation of Biomedical Images: Mining Large Datasets
for Self-supervised Learning
- Authors: Tianyuan Yao, Chang Qu, Jun Long, Quan Liu, Ruining Deng, Yuanhan
Tian, Jiachen Xu, Aadarsh Jha, Zuhayr Asad, Shunxing Bao, Mengyang Zhao,
Agnes B. Fogo, Bennett A. Landman, Haichun Yang, Catie Chang, Yuankai Huo
- Abstract summary: We introduce a simulation-based training framework that minimizes the need for resource-intensive bounding box annotations.
We also propose a new side loss that is optimized for compound figure separation.
This is the first study that evaluates the efficacy of leveraging self-supervised learning with compound image separation.
- Score: 12.445324044675116
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the rapid development of self-supervised learning (e.g., contrastive
learning), the importance of having large-scale images (even without
annotations) for training a more generalizable AI model has been widely
recognized in medical image analysis. However, collecting large-scale
task-specific unannotated data can be challenging for individual labs.
Existing online resources, such as digital books, publications, and search
engines, provide a new resource for obtaining large-scale images. However,
published images in healthcare (e.g., radiology and pathology) include a
considerable number of compound figures with subplots. In order to extract and
separate compound figures into usable individual images for downstream
learning, we propose a simple compound figure separation (SimCFS) framework
without using the traditionally required detection bounding box annotations,
with a new loss function and a hard case simulation. Our technical contribution
is four-fold: (1) we introduce a simulation-based training framework that
minimizes the need for resource-intensive bounding box annotations; (2) we
propose a new side loss that is optimized for compound figure separation; (3)
we propose an intra-class image augmentation method to simulate hard cases; and
(4) to the best of our knowledge, this is the first study that evaluates the
efficacy of leveraging self-supervised learning with compound image separation.
From the results, the proposed SimCFS achieved state-of-the-art performance on
the ImageCLEF 2016 Compound Figure Separation Database. The pretrained
self-supervised learning model using large-scale mined figures improved the
accuracy of downstream image classification tasks with a contrastive learning
algorithm. The source code of SimCFS is made publicly available at
https://github.com/hrlblab/ImageSeperation.
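The simulation-based training idea in the abstract can be illustrated with a small sketch: individual images are tiled into a synthetic compound figure, so the bounding box of every panel is known by construction and no manual box annotation is needed. This is not the SimCFS implementation. The function name `simulate_compound_figure`, the row-major grid layout, the `gap` spacing, and the white background are all assumptions made for illustration, and images are plain nested lists of grayscale values so the example has no dependencies.

```python
def simulate_compound_figure(panels, cols, gap=4, background=255):
    """Tile 2D grayscale panels (lists of pixel rows) into one canvas.

    Hypothetical helper, not taken from the SimCFS code base.
    Returns (canvas, boxes), where boxes[i] is the
    (x_min, y_min, x_max, y_max) box of panels[i], max exclusive.
    """
    if not panels or cols < 1:
        raise ValueError("need at least one panel and one column")
    cols = min(cols, len(panels))

    # Split the panels into grid rows, filled in row-major order.
    grid = [panels[i:i + cols] for i in range(0, len(panels), cols)]

    # Each grid row is as tall as its tallest panel,
    # each grid column as wide as its widest panel.
    row_h = [max(len(p) for p in row) for row in grid]
    col_w = [max(len(row[c][0]) for row in grid if c < len(row))
             for c in range(cols)]

    height = sum(row_h) + gap * (len(grid) + 1)
    width = sum(col_w) + gap * (cols + 1)
    canvas = [[background] * width for _ in range(height)]

    boxes = []
    y = gap
    for r, row in enumerate(grid):
        x = gap
        for c, panel in enumerate(row):
            h, w = len(panel), len(panel[0])
            # Paste the panel and record its box as a free annotation.
            for dy in range(h):
                canvas[y + dy][x:x + w] = panel[dy]
            boxes.append((x, y, x + w, y + h))
            x += col_w[c] + gap
        y += row_h[r] + gap
    return canvas, boxes


def solid(h, w, value):
    """Build a constant-valued h-by-w panel for the demo."""
    return [[value] * w for _ in range(h)]


# Three panels on a two-column grid; the last row is only partly filled.
panels = [solid(20, 30, 0), solid(20, 30, 64), solid(10, 30, 128)]
canvas, boxes = simulate_compound_figure(panels, cols=2)
print(boxes)
```

A detector trained on such synthetic figures would receive `canvas` as input and `boxes` as targets. Harder cases (for example, panels from the same class placed with little or no gap) could be approximated here by lowering `gap`, though how the paper builds its hard cases is not specified in this abstract.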
Related papers
- Scaling Laws of Synthetic Images for Model Training ... for Now [54.43596959598466]
We study the scaling laws of synthetic images generated by state of the art text-to-image models.
We observe that synthetic images demonstrate a scaling trend similar to, but slightly less effective than, real images in CLIP training.
(2023-12-07T18:59:59Z)
- Self-Supervised Pre-Training with Contrastive and Masked Autoencoder Methods for Dealing with Small Datasets in Deep Learning for Medical Imaging [8.34398674359296]
Deep learning in medical imaging has the potential to minimize the risk of diagnostic errors, reduce radiologist workload, and accelerate diagnosis.
Training such deep learning models requires large and accurate datasets, with annotations for all training samples.
To address this challenge, deep learning models can be pre-trained on large image datasets without annotations using methods from the field of self-supervised learning.
(2023-08-12T11:31:01Z)
- LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
(2023-06-20T22:21:34Z)
- Vision-Language Modelling For Radiological Imaging and Reports In The Low Data Regime [70.04389979779195]
This paper explores training medical vision-language models (VLMs) where the visual and language inputs are embedded into a common space.
We explore several candidate methods to improve low-data performance, including adapting generic pre-trained models to novel image and text domains.
Using text-to-image retrieval as a benchmark, we evaluate the performance of these methods with variable sized training datasets of paired chest X-rays and radiological reports.
(2023-03-30T18:20:00Z)
- Physiology-based simulation of the retinal vasculature enables annotation-free segmentation of OCT angiographs [8.596819713822477]
We present a pipeline to synthesize large amounts of realistic OCTA images with intrinsically matching ground truth labels.
Our proposed method is based on two novel components: 1) a physiology-based simulation that models the various retinal plexuses and 2) a suite of physics-based image augmentations.
(2022-07-22T14:22:22Z)
- Exemplar Learning for Medical Image Segmentation [38.61378161105941]
We propose an Exemplar Learning-based Synthesis Net (ELSNet) framework for medical image segmentation.
ELSNet introduces two new modules for image segmentation: an exemplar-guided synthesis module and a pixel-prototype based contrastive embedding module.
We conduct experiments on several organ segmentation datasets and present an in-depth analysis.
(2022-04-03T00:10:06Z)
- Meta Internal Learning [88.68276505511922]
Internal learning for single-image generation is a framework, where a generator is trained to produce novel images based on a single image.
We propose a meta-learning approach that enables training over a collection of images, in order to model the internal statistics of the sample image more effectively.
Our results show that the models obtained are as suitable as single-image GANs for many common image applications.
(2021-10-06T16:27:38Z)
- Self-Supervised Generative Style Transfer for One-Shot Medical Image Segmentation [10.634870214944055]
In medical image segmentation, supervised deep networks' success comes at the cost of requiring abundant labeled data.
We propose a novel volumetric self-supervised learning for data augmentation capable of synthesizing volumetric image-segmentation pairs.
Our work's central tenet benefits from a combined view of one-shot generative learning and the proposed self-supervised training strategy.
(2021-10-05T15:28:42Z)
- Compound Figure Separation of Biomedical Images with Side Loss [7.037505559439388]
In medical image analysis, even unannotated data can be difficult to obtain for individual labs.
We propose a simple compound figure separation (SimCFS) framework that uses weak classification annotations from individual images.
SimCFS achieved a new state-of-the-art performance on the ImageCLEF 2016 Compound Figure Separation Database.
(2021-07-19T07:16:32Z)
- Fed-Sim: Federated Simulation for Medical Imaging [131.56325440976207]
We introduce a physics-driven generative approach that consists of two learnable neural modules.
We show that our data synthesis framework improves the downstream segmentation performance on several datasets.
(2020-09-01T19:17:46Z)
- Learning Deformable Image Registration from Optimization: Perspective, Modules, Bilevel Training and Beyond [62.730497582218284]
We develop a new deep learning based framework to optimize a diffeomorphic model via multi-scale propagation.
We conduct two groups of image registration experiments on 3D volume datasets including image-to-atlas registration on brain MRI data and image-to-image registration on liver CT data.
(2020-04-30T03:23:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.