Understanding the Tricks of Deep Learning in Medical Image Segmentation:
Challenges and Future Directions
- URL: http://arxiv.org/abs/2209.10307v2
- Date: Mon, 8 May 2023 10:23:24 GMT
- Title: Understanding the Tricks of Deep Learning in Medical Image Segmentation:
Challenges and Future Directions
- Authors: Dong Zhang, Yi Lin, Hao Chen, Zhuotao Tian, Xin Yang, Jinhui Tang,
Kwang Ting Cheng
- Abstract summary: In this paper, we collect a series of MedISeg tricks for different model implementation phases.
We experimentally explore the effectiveness of these tricks on consistent baselines.
We also open-source a strong MedISeg repository in which each component can be used in a plug-and-play fashion.
- Score: 66.40971096248946
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Over the past few years, the rapid development of deep learning technologies
for computer vision has significantly improved the performance of medical image
segmentation (MedISeg). However, the diverse implementation strategies of
various models have led to an extremely complex MedISeg system, resulting in a
potential problem of unfair result comparisons. In this paper, we collect a
series of MedISeg tricks for different model implementation phases (i.e.,
model pre-training, data pre-processing, data augmentation, model
implementation, model inference, and result post-processing), and
experimentally explore the effectiveness of these tricks on consistent
baselines. With the extensive experimental results on both the representative
2D and 3D medical image datasets, we explicitly clarify the effect of these
tricks. Moreover, based on the surveyed tricks, we also open-sourced a strong
MedISeg repository, where each component has the advantage of plug-and-play. We
believe that this milestone work not only completes a comprehensive and
complementary survey of the state-of-the-art MedISeg approaches, but also
offers a practical guide for addressing the future medical image processing
challenges, including but not limited to small datasets, class-imbalanced
learning, multi-modality learning, and domain adaptation. The code and training
weights have been released at: https://github.com/hust-linyi/seg_trick.
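One of the inference-phase tricks the paper surveys, test-time augmentation (TTA), illustrates what "plug-and-play" means in practice: the trick wraps any segmentation model without touching its training. The sketch below is a minimal, generic PyTorch version; the function name and flip-only augmentation set are illustrative assumptions, not the actual API of the seg_trick repository.

```python
import torch

def tta_predict(model, image):
    """Test-time augmentation for 2D segmentation (illustrative sketch).

    Averages softmax predictions over horizontal/vertical flips, a common
    inference-phase trick; `model` maps (N, C, H, W) -> (N, K, H, W) logits.
    """
    model.eval()
    flips = [[], [-1], [-2], [-1, -2]]  # identity, H-flip, V-flip, both
    probs = None
    with torch.no_grad():
        for dims in flips:
            x = torch.flip(image, dims) if dims else image
            p = torch.softmax(model(x), dim=1)
            p = torch.flip(p, dims) if dims else p  # undo the flip
            probs = p if probs is None else probs + p
    return probs / len(flips)
```

Because the wrapper only needs a forward pass, it composes freely with the other trick phases (pre-training, pre-processing, post-processing) without modification.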
Related papers
- LoGra-Med: Long Context Multi-Graph Alignment for Medical Vision-Language Model [55.80651780294357] (2024-10-03)
State-of-the-art medical multi-modal large language models (med-MLLM) leverage instruction-following data in pre-training.
LoGra-Med is a new multi-graph alignment algorithm that enforces triplet correlations across image modalities, conversation-based descriptions, and extended captions.
Our results show LoGra-Med matches LLaVA-Med performance when trained on 600K image-text pairs for Medical VQA and significantly outperforms it when trained on 10% of the data.
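The "triplet correlations" idea can be pictured with a standard triplet-margin objective over embeddings of an image, its matching description, and a mismatched one. This is a generic sketch of the underlying alignment idea, not the paper's actual multi-graph objective:

```python
import torch
import torch.nn.functional as F

def triplet_alignment_loss(img_emb, pos_txt_emb, neg_txt_emb, margin=0.2):
    """Generic triplet loss: pull an image toward its matching text
    embedding and push it away from a mismatched one (illustrative only)."""
    img = F.normalize(img_emb, dim=-1)
    pos = F.normalize(pos_txt_emb, dim=-1)
    neg = F.normalize(neg_txt_emb, dim=-1)
    d_pos = 1.0 - (img * pos).sum(-1)  # cosine distance to the match
    d_neg = 1.0 - (img * neg).sum(-1)  # cosine distance to the mismatch
    return F.relu(d_pos - d_neg + margin).mean()
```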
- Coupling AI and Citizen Science in Creation of Enhanced Training Dataset for Medical Image Segmentation [3.7274206780843477] (2024-09-04)
We introduce a robust and versatile framework that combines AI and crowdsourcing to improve the quality and quantity of medical image datasets.
Our approach utilises a user-friendly online platform that enables a diverse group of crowd annotators to label medical images efficiently.
We employ pix2pixGAN, a generative AI model, to expand the training dataset with synthetic images that capture realistic morphological features.
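As a sketch of how such synthetic images would enter the pipeline: a trained image-to-image generator maps annotation masks to realistic images, and the resulting pairs are appended to the real training set. The `generator` below is a placeholder for a trained pix2pix-style model, and the mask-to-image direction is an assumption for illustration:

```python
import torch
from torch.utils.data import ConcatDataset, TensorDataset

def expand_with_synthetic(real_ds, generator, masks):
    """Append (synthetic image, mask) pairs produced by a trained
    mask-to-image generator to the real dataset (illustrative)."""
    generator.eval()
    with torch.no_grad():
        fake_imgs = generator(masks)      # (N, C, H, W) synthetic scans
    synth_ds = TensorDataset(fake_imgs, masks)
    return ConcatDataset([real_ds, synth_ds])
```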
- Disease Classification and Impact of Pretrained Deep Convolution Neural Networks on Diverse Medical Imaging Datasets across Imaging Modalities [0.0] (2024-08-30)
This paper investigates the intricacies of using pretrained deep convolutional neural networks with transfer learning across diverse medical imaging datasets.
It shows that using pretrained models as fixed feature extractors yields poor performance irrespective of the dataset.
It is also found that deeper and more complex architectures did not necessarily result in the best performance.
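The "fixed feature extractor" setting the paper evaluates looks like the following in PyTorch: the pretrained backbone is frozen and only a new classification head is trained. A standard torchvision ResNet stands in for the paper's architectures:

```python
import torch.nn as nn
from torchvision import models

def fixed_feature_extractor(num_classes):
    """Frozen ImageNet backbone + trainable linear head (the transfer
    setting the paper reports as weak on medical data)."""
    backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
    for p in backbone.parameters():
        p.requires_grad = False  # freeze all pretrained weights
    # The replacement head is trainable by default.
    backbone.fc = nn.Linear(backbone.fc.in_features, num_classes)
    return backbone  # only backbone.fc receives gradients
```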
- MIST: A Simple and Scalable End-To-End 3D Medical Imaging Segmentation Framework [1.1608974088441382] (2024-07-31)
The Medical Imaging Segmentation Toolkit (MIST) is designed to facilitate consistent training, testing, and evaluation of deep learning-based medical imaging segmentation methods.
MIST standardizes data analysis, preprocessing, and evaluation pipelines, accommodating multiple architectures and loss functions.
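A framework that "accommodates multiple architectures and loss functions" typically does so with registries keyed by config strings. The sketch below illustrates that general pattern in plain Python; it is a hypothetical design, not MIST's actual API:

```python
import torch.nn as nn

# Hypothetical registries illustrating a config-driven pipeline;
# MIST's real configuration system may differ.
ARCHS = {
    "tiny_cnn": lambda n: nn.Sequential(
        nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, n, 1)),
}
LOSSES = {
    "ce": nn.CrossEntropyLoss,
    "bce": nn.BCEWithLogitsLoss,
}

def build_from_config(cfg):
    """Instantiate model and loss from a plain-dict config."""
    model = ARCHS[cfg["arch"]](cfg["num_classes"])
    loss_fn = LOSSES[cfg["loss"]]()
    return model, loss_fn

model, loss_fn = build_from_config(
    {"arch": "tiny_cnn", "loss": "ce", "num_classes": 3})
```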
- DeepMediX: A Deep Learning-Driven Resource-Efficient Medical Diagnosis Across the Spectrum [15.382184404673389] (2023-07-01)
This work presents DeepMediX, a resource-efficient model that addresses the challenge of delivering accurate diagnosis under limited computational resources.
Built on top of the MobileNetV2 architecture, DeepMediX excels in classifying brain MRI scans and skin cancer images.
DeepMediX's design also includes the concept of Federated Learning, enabling a collaborative learning approach without compromising data privacy.
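The federated-learning component can be summarized by the standard FedAvg update: clients train locally and only model weights, never raw patient data, are shared and averaged. A minimal sketch, not DeepMediX's implementation:

```python
import copy
import torch

def fed_avg(client_models):
    """Average the state dicts of locally trained models (FedAvg).
    Raw patient data never leaves the clients; only weights are shared."""
    avg_state = copy.deepcopy(client_models[0].state_dict())
    for key in avg_state:
        stacked = torch.stack([m.state_dict()[key].float()
                               for m in client_models])
        avg_state[key] = stacked.mean(dim=0)
    global_model = copy.deepcopy(client_models[0])
    global_model.load_state_dict(avg_state)
    return global_model
```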
- LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714] (2023-06-20)
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
- Domain Generalization for Mammographic Image Analysis with Contrastive Learning [62.25104935889111] (2023-04-20)
Training an efficacious deep learning model requires large amounts of data with diverse styles and qualities.
A novel contrastive learning scheme is developed to equip deep learning models with better style-generalization capability.
The proposed method has been evaluated extensively and rigorously with mammograms from various vendor style domains and several public datasets.
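The style-generalization idea rests on a standard contrastive objective: embeddings of two views of the same image are pulled together while other images in the batch act as negatives. A minimal InfoNCE (NT-Xent-style) sketch, not the paper's exact loss:

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    """InfoNCE over a batch: row i of z1 and row i of z2 are two views
    of the same image; all other rows serve as negatives."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature       # (N, N) similarity matrix
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)  # positives on the diagonal
```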
- BenchMD: A Benchmark for Unified Learning on Medical Images and Sensors [8.695342954247606] (2023-04-17)
We present BenchMD, a benchmark that tests how well unified, modality-agnostic methods, including architectures and training techniques, perform on a diverse array of medical tasks.
Our baseline results demonstrate that no unified learning technique achieves strong performance across all modalities, leaving ample room for improvement on the benchmark.
- Competence-based Multimodal Curriculum Learning for Medical Report Generation [98.10763792453925] (2022-06-24)
We propose a Competence-based Multimodal Curriculum Learning framework (CMCL) to alleviate data bias and make the best use of available data.
Specifically, CMCL simulates the learning process of radiologists and optimizes the model step by step.
Experiments on the public IU-Xray and MIMIC-CXR datasets show that CMCL can be incorporated into existing models to improve their performance.
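Competence-based curricula of this kind typically admit a training sample only once the model's competence, a scheduled function of training progress, exceeds the sample's difficulty score. The sketch below follows the general square-root schedule of Platanios et al. (2019), not CMCL's exact formulation:

```python
import math

def competence(t, total_steps, c0=0.1):
    """Square-root competence schedule: starts at c0, reaches 1.0."""
    return min(1.0, math.sqrt(t / total_steps * (1 - c0**2) + c0**2))

def curriculum_batch(samples, difficulties, t, total_steps):
    """Keep only samples whose difficulty (in [0, 1]) is within the
    current competence; harder cases enter as training proceeds."""
    c = competence(t, total_steps)
    return [s for s, d in zip(samples, difficulties) if d <= c]
```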
- Generative Adversarial U-Net for Domain-free Medical Image Augmentation [49.72048151146307] (2021-01-12)
The shortage of annotated medical images is one of the biggest challenges in the field of medical image computing.
In this paper, we develop a novel generative method named generative adversarial U-Net.
Our newly designed model is domain-free and generalizable to various medical images.
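For intuition, one adversarial update of such a generative augmentation model looks like the following, where `gen` stands in for a U-Net-shaped generator and `disc` for a discriminator; both are placeholders and the loop is a generic GAN step, not the paper's architecture or training recipe:

```python
import torch
import torch.nn.functional as F

def adversarial_step(gen, disc, opt_g, opt_d, real, cond):
    """One GAN update for medical image synthesis (illustrative only)."""
    # Discriminator: push real images toward 1, generated ones toward 0.
    fake = gen(cond).detach()
    real_logits, fake_logits = disc(real), disc(fake)
    d_loss = (F.binary_cross_entropy_with_logits(
                  real_logits, torch.ones_like(real_logits))
              + F.binary_cross_entropy_with_logits(
                  fake_logits, torch.zeros_like(fake_logits)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator: produce images the discriminator scores as real.
    fake_logits = disc(gen(cond))
    g_loss = F.binary_cross_entropy_with_logits(
        fake_logits, torch.ones_like(fake_logits))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```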