Label-Efficient Deep Learning in Medical Image Analysis: Challenges and
Future Directions
- URL: http://arxiv.org/abs/2303.12484v4
- Date: Wed, 20 Dec 2023 02:14:25 GMT
- Title: Label-Efficient Deep Learning in Medical Image Analysis: Challenges and
Future Directions
- Authors: Cheng Jin, Zhengrui Guo, Yi Lin, Luyang Luo, Hao Chen
- Abstract summary: Training models in medical imaging analysis typically requires expensive and time-consuming collection of labeled data.
We extensively investigated over 300 recent papers to provide a comprehensive overview of progress on label-efficient learning strategies in MIA.
Specifically, we provide an in-depth investigation, covering not only canonical semi-supervised, self-supervised, and multi-instance learning schemes, but also recently emerged active and annotation-efficient learning strategies.
- Score: 10.502964056448283
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Deep learning has seen rapid growth in recent years and achieved
state-of-the-art performance in a wide range of applications. However, training
models typically requires expensive and time-consuming collection of large
quantities of labeled data. This is particularly true within the scope of
medical imaging analysis (MIA), where data are limited and labels are expensive
to acquire. Thus, label-efficient deep learning methods are developed to
make comprehensive use of the labeled data as well as the abundance of
unlabeled and weak-labeled data. In this survey, we extensively investigated
over 300 recent papers to provide a comprehensive overview of recent progress
on label-efficient learning strategies in MIA. We first present the background
of label-efficient learning and categorize the approaches into different
schemes. Next, we examine the current state-of-the-art methods in detail
through each scheme. Specifically, we provide an in-depth investigation,
covering not only canonical semi-supervised, self-supervised, and
multi-instance learning schemes, but also recently emerged active and
annotation-efficient learning strategies. Moreover, as a comprehensive
contribution to the field, this survey not only elucidates the commonalities
and unique features of the surveyed methods but also presents a detailed
analysis of the current challenges in the field and suggests potential avenues
for future research.
Related papers
- EP-SAM: Weakly Supervised Histopathology Segmentation via Enhanced Prompt with Segment Anything [3.760646312664378]
Pathological diagnosis of diseases like cancer has conventionally relied on the evaluation of morphological features by physicians and pathologists.
Recent advancements in computer-aided diagnosis (CAD) systems are gaining significant attention as diagnostic support tools.
We present a weakly supervised semantic segmentation (WSSS) model by combining class activation map and Segment Anything Model (SAM)-based pseudo-labeling.
arXiv Detail & Related papers (2024-10-17T14:55:09Z)
- A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [52.228708947607636]
This paper introduces a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods.
The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z)
- A Survey on Data Selection for Language Models [148.300726396877]
Data selection methods aim to determine which data points to include in a training dataset.
Deep learning is mostly driven by empirical evidence, and experimentation on large-scale data is expensive.
Few organizations have the resources for extensive data selection research.
arXiv Detail & Related papers (2024-02-26T18:54:35Z)
- Few-Shot Learning on Graphs: from Meta-learning to Pre-training and Prompting [56.25730255038747]
This survey endeavors to synthesize recent developments, provide comparative insights, and identify future directions.
We systematically categorize existing studies into three major families: meta-learning approaches, pre-training approaches, and hybrid approaches.
We analyze the relationships among these methods and compare their strengths and limitations.
arXiv Detail & Related papers (2024-02-02T14:32:42Z)
- Deep Learning for Multi-Label Learning: A Comprehensive Survey [6.571492336879553]
Multi-label learning is a rapidly growing research area that aims to predict multiple labels from a single input data point.
Inherent difficulties in MLC include dealing with high-dimensional data, addressing label correlations, and handling partial labels.
Recent years have witnessed a notable increase in adopting deep learning (DL) techniques to address these challenges more effectively in MLC.
arXiv Detail & Related papers (2024-01-29T20:37:03Z)
- Semi-supervised Object Detection: A Survey on Recent Research and Progress [2.2398477810999817]
Semi-supervised object detection (SSOD) has received increasing attention due to its high research value and practicability.
We present a comprehensive and up-to-date survey on the SSOD approaches from five aspects.
arXiv Detail & Related papers (2023-06-25T02:54:03Z)
- Active learning for data streams: a survey [0.48951183832371004]
Online active learning is a paradigm in machine learning that aims to select the most informative data points to label from a data stream.
Annotating each observation can be time-consuming and costly, making it difficult to obtain large amounts of labeled data.
This work aims to provide an overview of the most recently proposed approaches for selecting the most informative observations from data streams in real time.
arXiv Detail & Related papers (2023-02-17T14:24:13Z)
- Label-efficient Time Series Representation Learning: A Review [19.218833228063392]
Label-efficient time series representation learning is crucial for deploying deep learning models in real-world applications.
To address the scarcity of labeled time series data, various strategies, e.g., transfer learning, self-supervised learning, and semi-supervised learning, have been developed.
We introduce a novel taxonomy for the first time, categorizing existing approaches as in-domain or cross-domain, based on their reliance on external data sources.
arXiv Detail & Related papers (2023-02-13T15:12:15Z)
- Efficient Medical Image Assessment via Self-supervised Learning [27.969767956918503]
High-performance deep learning methods typically rely on large annotated training datasets.
We propose a novel and efficient data assessment strategy to rank the quality of unlabeled medical image data.
Motivated by the theoretical implications of the SSL embedding space, we leverage a Masked Autoencoder for feature extraction.
arXiv Detail & Related papers (2022-09-28T21:39:00Z)
- An Embarrassingly Simple Approach to Semi-Supervised Few-Shot Learning [58.59343434538218]
We propose a simple but quite effective approach to predict accurate negative pseudo-labels of unlabeled data from an indirect learning perspective.
Our approach can be implemented in just a few lines of code using only off-the-shelf operations.
arXiv Detail & Related papers (2022-09-28T02:11:34Z)
- Detecting Shortcut Learning for Fair Medical AI using Shortcut Testing [62.9062883851246]
Machine learning holds great promise for improving healthcare, but it is critical to ensure that its use will not propagate or amplify health disparities.
One potential driver of algorithmic unfairness, shortcut learning, arises when ML models base predictions on improper correlations in the training data.
Using multi-task learning, we propose the first method to assess and mitigate shortcut learning as a part of the fairness assessment of clinical ML systems.
arXiv Detail & Related papers (2022-07-21T09:35:38Z)
- From Labels to Priors in Capsule Endoscopy: A Prior Guided Approach for Improving Generalization with Few Labels [4.9136996406481135]
We propose using freely available domain knowledge as priors to learn more robust and generalizable representations.
We experimentally show that domain priors can benefit representations by acting in proxy of labels.
Our method performs better than (or closes the gap with) the state-of-the-art in the domain.
arXiv Detail & Related papers (2022-06-10T12:35:49Z)
- Recent Few-Shot Object Detection Algorithms: A Survey with Performance Comparison [54.357707168883024]
Few-Shot Object Detection (FSOD) mimics the human ability of learning to learn.
FSOD intelligently transfers the learned generic object knowledge from the common heavy-tailed classes to the novel long-tailed object classes.
We give an overview of FSOD, including the problem definition, common datasets, and evaluation protocols.
arXiv Detail & Related papers (2022-03-27T04:11:28Z)
- Federated Cycling (FedCy): Semi-supervised Federated Learning of Surgical Phases [57.90226879210227]
FedCy is a federated semi-supervised learning (FSSL) method that combines FL and self-supervised learning to exploit a decentralized dataset of both labeled and unlabeled videos.
We demonstrate significant performance gains over state-of-the-art FSSL methods on the task of automatic recognition of surgical phases.
arXiv Detail & Related papers (2022-03-14T17:44:53Z)
- Deep Long-Tailed Learning: A Survey [163.16874896812885]
Deep long-tailed learning aims to train well-performing deep models from a large number of images that follow a long-tailed class distribution.
Long-tailed class imbalance is a common problem in practical visual recognition tasks.
This paper provides a comprehensive survey on recent advances in deep long-tailed learning.
arXiv Detail & Related papers (2021-10-09T15:25:22Z)
- DSAL: Deeply Supervised Active Learning from Strong and Weak Labelers for Biomedical Image Segmentation [13.707848142719424]
We propose a deep active semi-supervised learning framework, DSAL, combining active learning and semi-supervised learning strategies.
In DSAL, a new criterion based on a deep supervision mechanism is proposed to select informative samples with high uncertainties.
We use the proposed criteria to select samples for strong and weak labelers to produce oracle labels and pseudo labels simultaneously at each active learning iteration.
arXiv Detail & Related papers (2021-01-22T11:31:33Z)
- The Emerging Trends of Multi-Label Learning [45.63795570392158]
Exabytes of data are generated daily by humans, leading to the growing need for new efforts in dealing with the grand challenges for multi-label learning brought by big data.
There is a lack of systemic studies that focus explicitly on analyzing the emerging trends and new challenges of multi-label learning in the era of big data.
It is imperative to call for a comprehensive survey to fulfill this mission and delineate future research directions and new applications.
arXiv Detail & Related papers (2020-11-23T03:36:00Z)
- Adaptive Self-training for Few-shot Neural Sequence Labeling [55.43109437200101]
We develop techniques to address the label scarcity challenge for neural sequence labeling models.
Self-training serves as an effective mechanism to learn from large amounts of unlabeled data.
Meta-learning helps in adaptive sample re-weighting to mitigate error propagation from noisy pseudo-labels.
arXiv Detail & Related papers (2020-10-07T22:29:05Z)
- Learning Image Labels On-the-fly for Training Robust Classification Models [13.669654965671604]
We show how noisy annotations (e.g., from different algorithm-based labelers) can be utilized together and mutually benefit the learning of classification tasks.
A meta-training based label-sampling module is designed to attend to the labels that benefit the model learning the most through additional back-propagation processes.
arXiv Detail & Related papers (2020-09-22T05:38:44Z)
- Collaborative Unsupervised Domain Adaptation for Medical Image Diagnosis [102.40869566439514]
We seek to exploit rich labeled data from relevant domains to help the learning in the target task via Unsupervised Domain Adaptation (UDA).
Unlike most UDA methods that rely on clean labeled data or assume samples are equally transferable, we innovatively propose a Collaborative Unsupervised Domain Adaptation algorithm.
We theoretically analyze the generalization performance of the proposed method, and also empirically evaluate it on both medical and general images.
arXiv Detail & Related papers (2020-07-05T11:49:17Z)
- Semi-supervised Medical Image Classification with Relation-driven Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification.
It exploits the unlabeled data by encouraging the prediction consistency of a given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
arXiv Detail & Related papers (2020-05-15T06:57:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.