Variability Matters : Evaluating inter-rater variability in
histopathology for robust cell detection
- URL: http://arxiv.org/abs/2210.05175v1
- Date: Tue, 11 Oct 2022 06:24:55 GMT
- Title: Variability Matters : Evaluating inter-rater variability in
histopathology for robust cell detection
- Authors: Cholmin Kang, Chunggi Lee, Heon Song, Minuk Ma and Sérgio Pereira
- Abstract summary: We present a large-scale study on the variability of cell annotations among 120 board-certified pathologists.
We show that increasing the data size at the expense of inter-rater variability does not necessarily lead to better-performing models in cell detection.
These findings suggest that the evaluation of the annotators may help tackle the fundamental budget issues in the histopathology domain.
- Score: 3.2873782624127843
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large annotated datasets have been a key component in the success of deep
learning. However, annotating medical images is challenging as it requires
expertise and a large budget. In particular, annotating different types of
cells in histopathology suffers from high inter- and intra-rater variability due
to the ambiguity of the task. Under this setting, the relation between
annotators' variability and model performance has received little attention. We
present a large-scale study on the variability of cell annotations among 120
board-certified pathologists and how it affects the performance of a deep
learning model. We propose a method to measure such variability, and by
excluding those annotators with low variability, we verify the trade-off
between the amount of data and its quality. We found that naively increasing
the data size at the expense of inter-rater variability does not necessarily
lead to better-performing models in cell detection. Instead, decreasing the
inter-rater variability at the expense of dataset size increased
the model performance. Furthermore, models trained from data annotated with
lower inter-labeler variability outperform those from higher inter-labeler
variability. These findings suggest that the evaluation of the annotators may
help tackle the fundamental budget issues in the histopathology domain.
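The abstract does not spell out how annotator variability is measured, so the following is only a minimal sketch of one plausible approach, not the authors' method. It assumes each annotator marks cells as (x, y) points on the same image, scores a pair of annotators by the F1 of a greedy distance-based matching, and keeps annotators whose mean agreement with the others reaches a threshold. The function names, the `radius` parameter, and the `min_score` threshold are all hypothetical.

```python
from itertools import combinations
from math import hypot


def match_f1(points_a, points_b, radius=10.0):
    """F1 agreement between two sets of (x, y) cell centers.

    Points are paired greedily, closest pairs first, and only if they lie
    within `radius` pixels of each other (an assumed matching rule).
    """
    if not points_a and not points_b:
        return 1.0  # two empty annotations agree trivially
    candidates = sorted(
        (hypot(ax - bx, ay - by), i, j)
        for i, (ax, ay) in enumerate(points_a)
        for j, (bx, by) in enumerate(points_b)
        if hypot(ax - bx, ay - by) <= radius
    )
    used_a, used_b = set(), set()
    for _, i, j in candidates:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
    return 2 * len(used_a) / (len(points_a) + len(points_b))


def agreement_scores(annotations, radius=10.0):
    """Mean pairwise F1 of each annotator against all other annotators."""
    if len(annotations) < 2:
        raise ValueError("need at least two annotators to compare")
    totals = {name: 0.0 for name in annotations}
    for a, b in combinations(annotations, 2):
        f1 = match_f1(annotations[a], annotations[b], radius)
        totals[a] += f1
        totals[b] += f1
    n_others = len(annotations) - 1
    return {name: total / n_others for name, total in totals.items()}


def keep_consistent(annotations, min_score=0.5, radius=10.0):
    """Names of annotators whose mean agreement is at least `min_score`."""
    scores = agreement_scores(annotations, radius)
    return sorted(name for name, score in scores.items() if score >= min_score)


if __name__ == "__main__":
    # Toy data: A and B mark the same two cells, C marks something else.
    toy = {
        "A": [(10, 10), (50, 50)],
        "B": [(12, 11), (49, 52)],
        "C": [(200, 200)],
    }
    print(keep_consistent(toy, min_score=0.4))  # prints ['A', 'B']
```

Training on the retained annotators' data only, and comparing against training on everything, would be one way to probe the data-size versus quality trade-off that the abstract describes.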
Related papers
- Towards Within-Class Variation in Alzheimer's Disease Detection from Spontaneous Speech [60.08015780474457]
Alzheimer's Disease (AD) detection has emerged as a promising research area that employs machine learning classification models.
We identify within-class variation as a critical challenge in AD detection: individuals with AD exhibit a spectrum of cognitive impairments.
We propose two novel methods: Soft Target Distillation (SoTD) and Instance-level Re-balancing (InRe), targeting two problems respectively.
arXiv Detail & Related papers (2024-09-22T02:06:05Z)
- PULASki: Learning inter-rater variability using statistical distances to improve probabilistic segmentation [36.136619420474766]
We propose PULASki, a method for biomedical image segmentation that accurately captures variability in expert annotations.
Our approach makes use of an improved loss function based on statistical distances in a conditional variational autoencoder structure.
Our method can also be applied to a wide range of multi-label segmentation tasks and is useful for downstream tasks such as hemodynamic modelling.
arXiv Detail & Related papers (2023-12-25T10:31:22Z)
- Variable Importance in High-Dimensional Settings Requires Grouping [19.095605415846187]
Conditional Permutation Importance (CPI) bypasses PI's limitations in such cases.
Grouping variables statistically via clustering or some prior knowledge gains some power back.
We show that the approach extended with stacking controls the type-I error even with highly-correlated groups.
arXiv Detail & Related papers (2023-12-18T00:21:47Z)
- The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
- How inter-rater variability relates to aleatoric and epistemic uncertainty: a case study with deep learning-based paraspinal muscle segmentation [1.9624082208594296]
We study how inter-rater variability affects the reliability of the resulting deep learning algorithms.
Our study reveals the interplay between inter-rater variability and uncertainties, affected by choices of label fusion strategies and DL models.
arXiv Detail & Related papers (2023-08-14T06:40:20Z)
- Rethinking Mitosis Detection: Towards Diverse Data and Feature Representation [30.882319057927052]
We propose a novel generalizable framework (MitDet) for mitosis detection.
Our proposed model outperforms all the SOTA approaches in several popular mitosis detection datasets.
arXiv Detail & Related papers (2023-07-12T03:33:11Z)
- Analyzing the Effects of Handling Data Imbalance on Learned Features from Medical Images by Looking Into the Models [50.537859423741644]
Training a model on an imbalanced dataset can introduce unique challenges to the learning problem.
We look deeper into the internal units of neural networks to observe how handling data imbalance affects the learned features.
arXiv Detail & Related papers (2022-04-04T09:38:38Z)
- Equivariance Allows Handling Multiple Nuisance Variables When Analyzing Pooled Neuroimaging Datasets [53.34152466646884]
In this paper, we show how bringing recent results on equivariant representation learning instantiated on structured spaces together with simple use of classical results on causal inference provides an effective practical solution.
We demonstrate how our model allows dealing with more than one nuisance variable under some assumptions and can enable analysis of pooled scientific datasets in scenarios that would otherwise entail removing a large portion of the samples.
arXiv Detail & Related papers (2022-03-29T04:54:06Z)
- Relational Subsets Knowledge Distillation for Long-tailed Retinal Diseases Recognition [65.77962788209103]
We propose class subset learning by dividing the long-tailed data into multiple class subsets according to prior knowledge.
It forces the model to focus on learning the subset-specific knowledge.
The proposed framework proved to be effective for the long-tailed retinal diseases recognition task.
arXiv Detail & Related papers (2021-04-22T13:39:33Z)
- Embracing the Disharmony in Heterogeneous Medical Data [12.739380441313022]
Heterogeneity in medical imaging data is often tackled, in the context of machine learning, using domain invariance.
This paper instead embraces the heterogeneity and treats it as a multi-task learning problem.
We show that this approach improves classification accuracy by 5-30% across different datasets on the main classification tasks.
arXiv Detail & Related papers (2021-03-23T21:36:39Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)