Label Cleaning Multiple Instance Learning: Refining Coarse Annotations
on Single Whole-Slide Images
- URL: http://arxiv.org/abs/2109.10778v1
- Date: Wed, 22 Sep 2021 15:06:06 GMT
- Title: Label Cleaning Multiple Instance Learning: Refining Coarse Annotations
on Single Whole-Slide Images
- Authors: Zhenzhen Wang, Aleksander S. Popel, Jeremias Sulam
- Abstract summary: Annotating cancerous regions in whole-slide images (WSIs) of pathology samples plays a critical role in clinical diagnosis, biomedical research, and machine learning algorithm development.
We present a method, named Label Cleaning Multiple Instance Learning (LC-MIL), to refine coarse annotations on a single WSI without the need for external training data.
Our experiments on a heterogeneous WSI set with breast cancer lymph node metastasis, liver cancer, and colorectal cancer samples show that LC-MIL significantly refines the coarse annotations, outperforming the state-of-the-art alternatives, even while learning from a single slide.
- Score: 83.7047542725469
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Annotating cancerous regions in whole-slide images (WSIs) of pathology
samples plays a critical role in clinical diagnosis, biomedical research, and
machine learning algorithm development. However, generating exhaustive and
accurate annotations is labor-intensive, challenging, and costly. Drawing only
coarse and approximate annotations is a much easier and less costly task, and it
alleviates pathologists' workload. In this paper, we study the problem of
refining these approximate annotations in digital pathology to obtain more
accurate ones. Some previous works have explored obtaining machine learning
models from these inaccurate annotations, but few of them tackle the refinement
problem where the mislabeled regions should be explicitly identified and
corrected, and all of them require a number of training samples that is often
very large. We present a method, named Label Cleaning Multiple Instance Learning
(LC-MIL), to refine coarse annotations on a single WSI without the need for
external training data. Patches cropped from a WSI with inaccurate labels are
processed jointly with a MIL framework, and a deep-attention mechanism is
leveraged to discriminate mislabeled instances, mitigating their impact on the
predictive model and refining the segmentation. Our experiments on a
heterogeneous WSI set with breast cancer lymph node metastasis, liver cancer,
and colorectal cancer samples show that LC-MIL significantly refines the coarse
annotations, outperforming the state-of-the-art alternatives, even while
learning from a single slide. These results demonstrate that LC-MIL is a
promising, lightweight tool for providing fine-grained annotations from coarsely
annotated pathology sets.
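The abstract describes the core mechanism only at a high level: patches cropped from a single coarsely labeled WSI are treated as instances in a MIL bag, and an attention mechanism down-weights instances that appear mislabeled. The snippet below is a minimal, hypothetical sketch of attention-based MIL pooling in PyTorch; the class and variable names (AttentionMIL, feat_dim, the median-attention heuristic) are illustrative assumptions, not the authors' released implementation.

```python
# Hedged sketch of attention-based MIL pooling over patch embeddings from one
# coarsely annotated WSI region. Instances that the attention mechanism
# down-weights under the coarse "positive" label are candidates for relabeling.
# This illustrates the general idea only; it is not the LC-MIL reference code.
import torch
import torch.nn as nn


class AttentionMIL(nn.Module):
    def __init__(self, feat_dim: int = 512, hidden_dim: int = 128):
        super().__init__()
        # Attention scoring of each patch embedding (Ilse et al., 2018 style)
        self.attention = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )
        self.classifier = nn.Linear(feat_dim, 1)  # bag-level prediction head

    def forward(self, patch_feats: torch.Tensor):
        # patch_feats: (num_patches, feat_dim) for one bag of patches
        scores = self.attention(patch_feats)           # (num_patches, 1)
        weights = torch.softmax(scores, dim=0)         # attention over instances
        bag_feat = (weights * patch_feats).sum(dim=0)  # attention-weighted pooling
        bag_logit = self.classifier(bag_feat)          # coarse-label prediction
        return bag_logit, weights.squeeze(-1)


# Usage: patches whose attention weight is low despite a "cancerous" coarse label
# can be flagged for relabeling when refining the annotation.
feats = torch.randn(200, 512)       # e.g., 200 patch embeddings from one region
model = AttentionMIL()
logit, attn = model(feats)
suspect = attn < attn.median()      # hypothetical heuristic for mislabeled patches
```

In practice the refinement would be iterated, updating the coarse labels and the resulting segmentation as mislabeled patches are identified; that loop is omitted here.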
Related papers
- Semi- and Weakly-Supervised Learning for Mammogram Mass Segmentation with Limited Annotations [49.33388736227072]
We propose a semi- and weakly-supervised learning framework for mass segmentation.
We use limited strongly-labeled samples and sufficient weakly-labeled samples to achieve satisfactory performance.
Experiments on CBIS-DDSM and INbreast datasets demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2024-03-14T12:05:25Z)
- Virtual Category Learning: A Semi-Supervised Learning Method for Dense Prediction with Extremely Limited Labels [63.16824565919966]
This paper proposes to use confusing samples proactively without label correction.
A Virtual Category (VC) is assigned to each confusing sample in such a way that it can safely contribute to the model optimisation.
Our intriguing findings highlight the value of VC learning in dense vision tasks.
arXiv Detail & Related papers (2023-12-02T16:23:52Z)
- Improving Multiple Sclerosis Lesion Segmentation Across Clinical Sites: A Federated Learning Approach with Noise-Resilient Training [75.40980802817349]
Deep learning models have shown promise for automatically segmenting MS lesions, but the scarcity of accurately annotated data hinders progress in this area.
We introduce a Decoupled Hard Label Correction (DHLC) strategy that considers the imbalanced distribution and fuzzy boundaries of MS lesions.
We also introduce a Centrally Enhanced Label Correction (CELC) strategy, which leverages the aggregated central model as a correction teacher for all sites.
arXiv Detail & Related papers (2023-08-31T00:36:10Z)
- Active Learning Enhances Classification of Histopathology Whole Slide Images with Attention-based Multiple Instance Learning [48.02011627390706]
We train an attention-based MIL model and calculate a confidence metric for every image in the dataset to select the most uncertain WSIs for expert annotation (a minimal sketch of this uncertainty-based selection appears after the list below).
With a novel attention-guiding loss, this leads to an accuracy boost for the trained models when only a few regions are annotated per class.
It may in the future serve as an important contribution to training MIL models in the clinically relevant context of cancer classification in histopathology.
arXiv Detail & Related papers (2023-03-02T15:18:58Z)
- Self-Supervised Equivariant Regularization Reconciles Multiple Instance Learning: Joint Referable Diabetic Retinopathy Classification and Lesion Segmentation [3.1671604920729224]
Lesion appearance is a crucial clue for medical providers to distinguish referable diabetic retinopathy (rDR) from non-referable DR.
Most existing large-scale DR datasets contain only image-level labels rather than pixel-based annotations.
This paper leverages self-supervised equivariant learning and attention-based multi-instance learning to tackle this problem.
We conduct extensive validation experiments on the Eyepacs dataset, achieving an area under the receiver operating characteristic curve (AUROC) of 0.958, outperforming current state-of-the-art algorithms.
arXiv Detail & Related papers (2022-10-12T06:26:05Z)
- Weakly Supervised Medical Image Segmentation With Soft Labels and Noise Robust Loss [0.16490701092527607]
Training deep learning models commonly requires large datasets with expert-labeled annotations.
Image-based medical diagnosis tools using deep learning models trained with incorrect segmentation labels can lead to false diagnoses and treatment suggestions.
The aim of this paper was to develop and evaluate a method to generate probabilistic labels based on multi-rater annotations and anatomical knowledge of the lesion features in MRI.
arXiv Detail & Related papers (2022-09-16T21:07:59Z)
- Weakly-Supervised Cross-Domain Adaptation for Endoscopic Lesions Segmentation [79.58311369297635]
We propose a new weakly-supervised lesions transfer framework, which can explore transferable domain-invariant knowledge across different datasets.
A Wasserstein-quantified transferability framework is developed to highlight wide-range transferable contextual dependencies.
A novel self-supervised pseudo label generator is designed to equally provide confident pseudo pixel labels for both hard-to-transfer and easy-to-transfer target samples.
arXiv Detail & Related papers (2020-12-08T02:26:03Z)
- Learning Image Labels On-the-fly for Training Robust Classification Models [13.669654965671604]
We show how noisy annotations (e.g., from different algorithm-based labelers) can be utilized together and mutually benefit the learning of classification tasks.
A meta-training-based label-sampling module is designed to attend to the labels that benefit model learning the most through additional back-propagation processes.
arXiv Detail & Related papers (2020-09-22T05:38:44Z)
- Renal Cell Carcinoma Detection and Subtyping with Minimal Point-Based Annotation in Whole-Slide Images [3.488702792183152]
It is much easier and cheaper to get unlabeled data from whole-slide images.
Semi-supervised learning (SSL) is an effective way to utilize unlabeled data.
We propose a framework that employs an SSL method to accurately detect cancerous regions.
arXiv Detail & Related papers (2020-08-12T14:12:07Z)
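As referenced in the entry "Active Learning Enhances Classification of Histopathology Whole Slide Images with Attention-based Multiple Instance Learning" above, the most uncertain WSIs are selected for expert annotation using a confidence metric from an attention-based MIL model. The sketch below shows one plausible way to rank slides by predictive uncertainty; the helper and its signature (select_uncertain_slides, reuse of the AttentionMIL sketch above) are illustrative assumptions, not the API of that paper's code.

```python
# Hypothetical sketch of uncertainty-driven slide selection for active learning:
# score every unannotated WSI with an attention-MIL classifier and queue the
# least confident (highest-entropy) slides for expert annotation.
import torch


def select_uncertain_slides(mil_model, slide_feature_list, k=5):
    """Return the indices of the k slides with the highest predictive entropy."""
    uncertainties = []
    mil_model.eval()
    with torch.no_grad():
        for feats in slide_feature_list:            # feats: (num_patches, feat_dim)
            logit, _ = mil_model(feats)             # bag-level logit from MIL model
            p = torch.sigmoid(logit).clamp(1e-6, 1 - 1e-6)
            entropy = -(p * p.log() + (1 - p) * (1 - p).log())  # binary entropy
            uncertainties.append(entropy.item())
    order = sorted(range(len(uncertainties)), key=lambda i: -uncertainties[i])
    return order[:k]                                # slides to annotate next
```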
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.