Information Gain Sampling for Active Learning in Medical Image
Classification
- URL: http://arxiv.org/abs/2208.00974v1
- Date: Mon, 1 Aug 2022 16:25:53 GMT
- Title: Information Gain Sampling for Active Learning in Medical Image
Classification
- Authors: Raghav Mehta, Changjian Shui, Brennan Nichyporuk, Tal Arbel
- Abstract summary: This work presents an information-theoretic active learning framework that guides the optimal selection of images from the unlabelled pool to be labeled.
Experiments are performed on two different medical image classification datasets.
- Score: 3.1619162190378787
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large, annotated datasets are not widely available in medical image analysis
due to the prohibitive time, costs, and challenges associated with labelling
large datasets. Unlabelled datasets are easier to obtain, and in many contexts,
it would be feasible for an expert to provide labels for a small subset of
images. This work presents an information-theoretic active learning framework
that guides the optimal selection of images from the unlabelled pool to be
labeled based on maximizing the expected information gain (EIG) on an
evaluation dataset. Experiments are performed on two different medical image
classification datasets: multi-class diabetic retinopathy disease scale
classification and multi-class skin lesion classification. Results indicate
that by adapting EIG to account for class imbalances, our proposed Adapted
Expected Information Gain (AEIG) outperforms several popular baselines,
including the diversity-based CoreSet and uncertainty-based maximum entropy
sampling. Specifically, AEIG achieves ~95% of overall performance with only 19%
of the training data, while other active learning approaches require around
25%. We show that, by careful design choices, our model can be integrated into
existing deep learning classifiers.
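For orientation, the uncertainty-based maximum entropy baseline named in the abstract can be sketched as follows. This is a minimal illustration of generic maximum entropy sampling, not the authors' AEIG method or their code; the function names `entropy` and `max_entropy_select`, the `budget` parameter, and the toy probabilities are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def max_entropy_select(pool_probs, budget):
    """Return indices of the `budget` pool items with the highest predictive entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical softmax outputs for four unlabelled images over three classes.
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.34, 0.33, 0.33],  # near-uniform prediction, highest entropy
    [0.70, 0.20, 0.10],
    [0.50, 0.50, 0.00],
]

# Indices of the two images that would be sent to the expert for labelling.
selected = max_entropy_select(pool_probs, budget=2)
print(selected)
```

In an active learning loop, the selected images would be labelled, moved into the training set, and the classifier retrained before the next round. According to the abstract, AEIG differs in that it scores candidates by expected information gain on an evaluation dataset, adapted for class imbalance, rather than by per-image entropy alone.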
Related papers
- Additional Look into GAN-based Augmentation for Deep Learning COVID-19
Image Classification [57.1795052451257]
We study the dependence of the GAN-based augmentation performance on dataset size with a focus on small samples.
We train StyleGAN2-ADA with both sets and then, after validating the quality of generated images, we use trained GANs as one of the augmentation approaches in multi-class classification problems.
The GAN-based augmentation approach is found to be comparable with classical augmentation in the case of medium and large datasets but underperforms in the case of smaller datasets.
arXiv Detail & Related papers (2024-01-26T08:28:13Z)
- Learnable Weight Initialization for Volumetric Medical Image Segmentation [66.3030435676252]
We propose a learnable weight-based hybrid medical image segmentation approach.
Our approach is easy to integrate into any hybrid model and requires no external training data.
Experiments on multi-organ and lung cancer segmentation tasks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-06-15T17:55:05Z)
- A Knowledge Distillation framework for Multi-Organ Segmentation of
Medaka Fish in Tomographic Image [5.881800919492064]
We propose a self-training framework for multi-organ segmentation in tomographic images of Medaka fish.
We utilize the pseudo-labeled data from a pretrained model and adopt a Quality Teacher to refine the pseudo-labeled data.
The experimental results demonstrate that our method improves mean Intersection over Union (IoU) by 5.9% on the full dataset.
arXiv Detail & Related papers (2023-02-24T10:31:29Z)
- Data Augmentation using Feature Generation for Volumetric Medical Images [0.08594140167290097]
Medical image classification is one of the most critical problems in the image recognition area.
One of the major challenges in this field is the scarcity of labelled training data.
Deep Learning models, in particular, show promising results on image segmentation and classification problems.
arXiv Detail & Related papers (2022-09-28T13:46:24Z)
- Exploiting Diversity of Unlabeled Data for Label-Efficient
Semi-Supervised Active Learning [57.436224561482966]
Active learning is a research area that addresses the issues of expensive labeling by selecting the most important samples for labeling.
We introduce a new diversity-based initial dataset selection algorithm to select the most informative set of samples for initial labeling in the active learning setting.
Also, we propose a novel active learning query strategy, which uses diversity-based sampling on consistency-based embeddings.
arXiv Detail & Related papers (2022-07-25T16:11:55Z)
- Deep reinforced active learning for multi-class image classification [0.0]
High-accuracy medical image classification can be limited by the costs of acquiring more data as well as the time and expertise needed to label existing images.
We apply active learning to medical image classification, a method which aims to maximise model performance on a minimal subset from a larger pool of data.
arXiv Detail & Related papers (2022-06-20T09:30:55Z)
- Application of Transfer Learning and Ensemble Learning in Image-level
Classification for Breast Histopathology [9.037868656840736]
In Computer-Aided Diagnosis (CAD), traditional classification models mostly use a single network to extract features.
This paper proposes a deep ensemble model based on image-level labels for the binary classification of benign and malignant lesions.
Result: In the ensemble network model with accuracy as the weight, the image-level binary classification achieves an accuracy of 98.90%.
arXiv Detail & Related papers (2022-04-18T13:31:53Z)
- Improving Contrastive Learning on Imbalanced Seed Data via Open-World
Sampling [96.8742582581744]
We present an open-world unlabeled data sampling framework called Model-Aware K-center (MAK).
MAK follows three simple principles: tailness, proximity, and diversity.
We demonstrate that MAK can consistently improve both the overall representation quality and the class balancedness of the learned features.
arXiv Detail & Related papers (2021-11-01T15:09:41Z)
- A Real Use Case of Semi-Supervised Learning for Mammogram Classification
in a Local Clinic of Costa Rica [0.5541644538483946]
Training a deep learning model requires a considerable amount of labeled images.
A number of publicly available datasets have been built with data from different hospitals and clinics.
The use of the semi-supervised deep learning approach known as MixMatch to leverage unlabeled data is proposed and evaluated.
arXiv Detail & Related papers (2021-07-24T22:26:50Z)
- ATSO: Asynchronous Teacher-Student Optimization for Semi-Supervised
Medical Image Segmentation [99.90263375737362]
We propose ATSO, an asynchronous version of teacher-student optimization.
ATSO partitions the unlabeled data into two subsets and alternately uses one subset to fine-tune the model while updating the labels on the other subset.
We evaluate ATSO on two popular medical image segmentation datasets and show its superior performance in various semi-supervised settings.
arXiv Detail & Related papers (2020-06-24T04:05:12Z)
- Semi-supervised Medical Image Classification with Relation-driven
Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification.
It exploits the unlabeled data by encouraging the prediction consistency of a given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
arXiv Detail & Related papers (2020-05-15T06:57:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.