Active Learning in Brain Tumor Segmentation with Uncertainty Sampling,
Annotation Redundancy Restriction, and Data Initialization
- URL: http://arxiv.org/abs/2302.10185v1
- Date: Sun, 5 Feb 2023 04:45:08 GMT
- Authors: Daniel D Kim, Rajat S Chandra, Jian Peng, Jing Wu, Xue Feng, Michael
Atalay, Chetan Bettegowda, Craig Jones, Haris Sair, Wei-hua Liao, Chengzhang
Zhu, Beiji Zou, Li Yang, Anahita Fathi Kazerooni, Ali Nabavizadeh, Harrison X
Bai, Zhicheng Jiao
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning models have demonstrated great potential in medical 3D imaging,
but their development is limited by the expensive, large volume of annotated
data required. Active learning (AL) addresses this by training a model on a
subset of the most informative data samples without compromising performance.
We compared different AL strategies and propose a framework that minimizes the
amount of data needed for state-of-the-art performance. 638 multi-institutional
brain tumor MRI images were used to train a 3D U-net model and compare AL
strategies. We investigated uncertainty sampling, annotation redundancy
restriction, and initial dataset selection techniques. Uncertainty estimation
techniques including Bayesian estimation with dropout, bootstrapping, and
margins sampling were compared to random query. Strategies to avoid annotation
redundancy by removing similar images within the to-be-annotated subset were
considered as well. We determined the minimum amount of data necessary to
achieve similar performance to the model trained on the full dataset (α = 0.1).
A variance-based selection strategy using radiomics to identify the
initial training dataset is also proposed. Bayesian approximation with dropout
at training and testing showed similar results to that of the full data model
with less than 20% of the training data (p=0.293) compared to random query
achieving similar performance at 56.5% of the training data (p=0.814).
Annotation redundancy restriction techniques achieved state-of-the-art
performance at approximately 40%-50% of the training data. Radiomics dataset
initialization had higher Dice with initial dataset sizes of 20 and 80 images,
but improvements were not significant. In conclusion, we investigated various
AL strategies with dropout uncertainty estimation achieving state-of-the-art
performance with the least annotated data.
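The two query-side ideas described in the abstract, uncertainty sampling via test-time dropout and annotation redundancy restriction within the to-be-annotated subset, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the predictive-entropy scoring, cosine-similarity filtering, and all function names and inputs are assumptions made for the sketch.

```python
# Hedged sketch of two active-learning query strategies from the abstract:
# (1) uncertainty sampling with Monte Carlo dropout, scored by predictive
#     entropy over stochastic forward passes, and
# (2) annotation redundancy restriction, greedily skipping candidates that
#     are too similar (cosine similarity) to already-selected images.
# Inputs are synthetic stand-ins for per-image class probabilities and
# per-image feature vectors; shapes and thresholds are illustrative.
import numpy as np

def mc_dropout_uncertainty(mc_probs: np.ndarray) -> np.ndarray:
    """Score each unlabeled image by predictive entropy.

    mc_probs: shape (T, N, C) -- T dropout forward passes, N images,
              C classes (voxel probabilities pre-averaged per image here).
    Returns an (N,) array; higher means more uncertain.
    """
    mean_probs = mc_probs.mean(axis=0)  # average over dropout passes: (N, C)
    eps = 1e-12                         # avoid log(0)
    return -(mean_probs * np.log(mean_probs + eps)).sum(axis=1)

def restrict_redundancy(features: np.ndarray, ranked: np.ndarray,
                        budget: int, sim_thresh: float = 0.95) -> list:
    """Walk the uncertainty-ranked candidates, keeping an image only if its
    cosine similarity to every already-selected image is below sim_thresh."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # unit-normalize rows
    chosen = []
    for idx in ranked:
        if len(chosen) == budget:
            break
        if all(unit[idx] @ unit[j] < sim_thresh for j in chosen):
            chosen.append(int(idx))
    return chosen

# Toy usage with synthetic probabilities and features.
rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(4), size=(10, 100))  # (T=10, N=100, C=4)
scores = mc_dropout_uncertainty(probs)
ranked = np.argsort(-scores)                       # most uncertain first
feats = rng.normal(size=(100, 16))                 # stand-in image features
batch = restrict_redundancy(feats, ranked, budget=20)
```

In the paper's setting the features for the redundancy check and the per-voxel probabilities would come from the 3D U-net itself; here both are random placeholders so the selection loop can be exercised end to end.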
Related papers
- Just How Flexible are Neural Networks in Practice? [89.80474583606242]
It is widely believed that a neural network can fit a training set containing at least as many samples as it has parameters.
In practice, however, we only find the solutions reachable via our training procedure, including gradient descent and regularizers, which limits flexibility.
arXiv Detail & Related papers (2024-06-17T12:24:45Z)
- From Data Deluge to Data Curation: A Filtering-WoRA Paradigm for Efficient Text-based Person Search [19.070305201045954]
In text-based person search endeavors, data generation has emerged as a prevailing practice, addressing concerns over privacy preservation and the arduous task of manual annotation.
We observe that only a subset of the data in constructed datasets plays a decisive role.
We introduce a new Filtering-WoRA paradigm, which contains a filtering algorithm to identify this crucial data subset and WoRA learning strategy for light fine-tuning.
arXiv Detail & Related papers (2024-04-16T05:29:14Z)
- How to Train Data-Efficient LLMs [56.41105687693619]
We study data-efficient approaches for pre-training large language models (LLMs).
In our comparison of 19 samplers, involving hundreds of evaluation tasks and pre-training runs, we find that Ask-LLM and Density sampling are the best methods in their respective categories.
arXiv Detail & Related papers (2024-02-15T02:27:57Z)
- Group Distributionally Robust Dataset Distillation with Risk Minimization [18.07189444450016]
We introduce an algorithm that combines clustering with the minimization of a risk measure on the loss to conduct dataset distillation (DD).
We demonstrate its effective generalization and robustness across subgroups through numerical experiments.
arXiv Detail & Related papers (2024-02-07T09:03:04Z)
- Semantically Redundant Training Data Removal and Deep Model Classification Performance: A Study with Chest X-rays [5.454938535500864]
We propose an entropy-based sample scoring approach to identify and remove semantically redundant training data.
We demonstrate using the publicly available NIH chest X-ray dataset that the model trained on the resulting informative subset of training data significantly outperforms the model trained on the full training set.
arXiv Detail & Related papers (2023-09-18T13:56:34Z)
- The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
- A Meta-Learning Approach to Predicting Performance and Data Requirements [163.4412093478316]
We propose an approach to estimate the number of samples required for a model to reach a target performance.
We find that the power law, the de facto principle for estimating model performance, leads to large errors when using a small dataset.
We introduce a novel piecewise power law (PPL) that handles the two data regimes differently.
arXiv Detail & Related papers (2023-03-02T21:48:22Z)
- Dataset Pruning: Reducing Training Data by Examining Generalization Influence [30.30255670341501]
Do all training data contribute to the model's performance?
How can we construct the smallest possible subset of the entire training data as a proxy training set without significantly sacrificing the model's performance?
arXiv Detail & Related papers (2022-05-19T05:36:35Z)
- Self-Supervised Pre-Training for Transformer-Based Person Re-Identification [54.55281692768765]
Transformer-based supervised pre-training achieves great performance in person re-identification (ReID).
Due to the domain gap between ImageNet and ReID datasets, it usually needs a larger pre-training dataset to boost the performance.
This work aims to mitigate the gap between the pre-training and ReID datasets from the perspective of data and model structure.
arXiv Detail & Related papers (2021-11-23T18:59:08Z)
- Harnessing Unlabeled Data to Improve Generalization of Biometric Gender and Age Classifiers [0.7874708385247353]
Deep learning models need a large amount of labeled data for training and optimum parameter estimation.
Due to privacy and security concerns, large amounts of labeled data cannot be collected for certain applications, such as those in the medical field.
We propose a self-ensemble-based deep learning model that, along with limited labeled data, harnesses unlabeled data to improve generalization performance.
arXiv Detail & Related papers (2021-10-09T01:06:01Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
To make this practical, we apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.