The effect of variable labels on deep learning models trained to predict
breast density
- URL: http://arxiv.org/abs/2210.04106v1
- Date: Sat, 8 Oct 2022 21:18:05 GMT
- Title: The effect of variable labels on deep learning models trained to predict
breast density
- Authors: Steven Squires, Elaine F. Harkness, D. Gareth Evans and Susan M.
Astley
- Abstract summary: High breast density is associated with reduced efficacy of mammographic screening and increased risk of developing breast cancer.
Expert reader assessments of density show a strong relationship to cancer risk but also inter-reader variation.
The effect of label variability on model performance is important when considering how to utilise automated methods for both research and clinical purposes.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Purpose: High breast density is associated with reduced efficacy of
mammographic screening and increased risk of developing breast cancer. Accurate
and reliable automated density estimates can be used for direct risk prediction
and for passing density-related information to further predictive models. Expert
reader assessments of density show a strong relationship to cancer risk but
also inter-reader variation. The effect of label variability on model
performance is important when considering how to utilise automated methods for
both research and clinical purposes. Methods: We utilise subsets of images with
density labels to train a deep transfer learning model which is used to assess
how label variability affects the mapping from representation to prediction. We
then create two end-to-end deep learning models which allow us to investigate
the effect of label variability on the model representation formed. Results: We
show that the trained mappings from representations to labels are altered
considerably by the variability of reader scores. Training on labels with
distribution variation removed causes the Spearman rank correlation
coefficients to rise from $0.751\pm0.002$ to either $0.815\pm0.006$ when
averaging across readers or $0.844\pm0.002$ when averaging across images.
However, when we train different models to investigate the representation
effect we see little difference, with Spearman rank correlation coefficients of
$0.846\pm0.006$ and $0.850\pm0.006$ showing no statistically significant
difference in the quality of the model representation with regard to density
prediction. Conclusions: We show that the mapping between representation and
mammographic density prediction is significantly affected by label variability.
However, the effect of the label variability on the model representation is
limited.
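As a concrete illustration of the evaluation above, the sketch below computes Spearman rank correlations between model outputs and density labels, once against a single reader and once against the reader average. The synthetic data and noise levels are assumptions for illustration only, not the paper's data.

```python
# Minimal sketch of the Spearman evaluation; data is synthetic.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
true_density = rng.uniform(0, 100, size=500)                  # latent per-image density
readers = true_density[:, None] + rng.normal(0, 8, (500, 3))  # three noisy expert readers
preds = true_density + rng.normal(0, 6, size=500)             # automated model output

rho_single, _ = spearmanr(preds, readers[:, 0])       # against one reader's labels
rho_avg, _ = spearmanr(preds, readers.mean(axis=1))   # against reader-averaged labels
print(f"vs single reader: {rho_single:.3f}, vs averaged readers: {rho_avg:.3f}")
```

Averaging labels removes per-reader variation, which is why the correlation rises; this mirrors the reported shift from $0.751\pm0.002$ to $0.815\pm0.006$ / $0.844\pm0.002$.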
Related papers
- GRASP-PsONet: Gradient-based Removal of Spurious Patterns for PsOriasis Severity Classification [0.0]
We propose a framework to automatically flag problematic training images that introduce spurious correlations.
Removing 8.2% of flagged images improves model AUC-ROC by 5% (85% to 90%) on a held-out test set.
When applied to a subset of training data rated by two dermatologists, the method identifies over 90% of cases with inter-rater disagreement.
arXiv Detail & Related papers (2025-06-27T03:42:09Z)
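The summary does not specify the gradient signal GRASP-PsONet uses, so the following is only a hedged sketch of one plausible gradient-based flagging scheme: score each training image by its input-gradient norm and flag the top fraction. `model` and `loader` are assumed PyTorch objects, and the 8.2% fraction is taken from the summary.

```python
# Hedged sketch of gradient-based image flagging; not the authors' exact method.
import torch
import torch.nn.functional as F

def flag_suspect_images(model, loader, flag_fraction=0.082, device="cpu"):
    scores = []
    model.eval()
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        x.requires_grad_(True)
        loss = F.cross_entropy(model(x), y, reduction="sum")
        (grad,) = torch.autograd.grad(loss, x)
        # per-sample input-gradient norm as a crude "spuriousness" score
        scores.extend(grad.flatten(1).norm(dim=1).tolist())
    k = int(len(scores) * flag_fraction)
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:k]
```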
- Predicting Genetic Mutations from Single-Cell Bone Marrow Images in Acute Myeloid Leukemia Using Noise-Robust Deep Learning Models [2.6995203611040455]
We propose a robust methodology for identifying myeloid blasts and then predicting genetic mutations in single-cell images of blasts.
We trained an initial binary classifier to distinguish between leukemic (blast) and non-leukemic cell images, achieving 90 percent accuracy.
Despite the tumor label noise, our mutation classification model achieved 85 percent accuracy across four mutation classes, demonstrating resilience to label inconsistencies.
arXiv Detail & Related papers (2025-06-15T10:15:42Z)
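The summary reports robustness to label noise without naming a mechanism; label smoothing is one standard, assumed stand-in for such noise tolerance:

```python
# Hedged sketch: label smoothing as a common defence against noisy labels;
# the paper's actual noise-handling may differ.
import torch
import torch.nn as nn

NUM_MUTATION_CLASSES = 4  # four mutation classes, per the summary

# Smoothing spreads 0.1 probability mass over non-target classes, so a
# mislabeled blast pulls the model less strongly toward a wrong class.
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)

logits = torch.randn(8, NUM_MUTATION_CLASSES)            # dummy batch of logits
labels = torch.randint(0, NUM_MUTATION_CLASSES, (8,))    # possibly noisy labels
loss = criterion(logits, labels)
```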
- An analysis of data variation and bias in image-based dermatological datasets for machine learning classification [2.039829968340841]
In clinical dermatology, classification models can detect malignant lesions on patients' skin using only RGB images as input.
Most learning-based methods employ data acquired from dermoscopic datasets for training; these datasets are large and validated by a gold standard.
This work aims to evaluate the gap between dermoscopic and clinical samples and to understand how dataset variations impact training.
arXiv Detail & Related papers (2025-01-15T17:18:46Z)
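One simple, assumed way to quantify the dermoscopic-to-clinical gap the paper studies is to train on one domain and compare in-domain against cross-domain accuracy; the feature matrices and the logistic-regression probe here are placeholders, not the paper's setup.

```python
# Hedged sketch: measure a domain gap as the in-domain vs. cross-domain
# accuracy difference of a probe trained on dermoscopic features.
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def domain_gap(X_dermo_train, y_dermo_train,
               X_dermo_test, y_dermo_test,
               X_clinical, y_clinical):
    clf = LogisticRegression(max_iter=1000).fit(X_dermo_train, y_dermo_train)
    in_domain = accuracy_score(y_dermo_test, clf.predict(X_dermo_test))
    cross_domain = accuracy_score(y_clinical, clf.predict(X_clinical))
    return in_domain - cross_domain  # larger gap = stronger dataset shift
```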
- Exploring Data Augmentations on Self-/Semi-/Fully-Supervised Pre-trained Models [24.376036129920948]
We investigate how data augmentation affects the performance of vision pre-trained models.
We apply four types of data augmentation: Random Erasing, CutOut, CutMix and MixUp.
We report their performance on vision tasks such as image classification, object detection, instance segmentation, and semantic segmentation.
arXiv Detail & Related papers (2023-10-28T23:46:31Z)
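Of the four augmentations, MixUp is the easiest to show compactly: convex-combine image pairs and their one-hot labels with a Beta-distributed weight. A minimal PyTorch sketch (alpha=0.2 is a common but assumed default):

```python
# MixUp in a few lines (Zhang et al., 2017): mix images and one-hot labels.
import torch

def mixup(x, y_onehot, alpha=0.2):
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))          # random pairing within the batch
    x_mixed = lam * x + (1.0 - lam) * x[perm]
    y_mixed = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mixed, y_mixed
```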
- Distilling Influences to Mitigate Prediction Churn in Graph Neural Networks [4.213427823201119]
Models with similar performances exhibit significant disagreement in the predictions of individual samples, referred to as prediction churn.
We propose a novel metric called Influence Difference (ID) to quantify the variation in reasons used by nodes across models.
We also consider the differences between nodes with a stable and an unstable prediction, positing that both equally utilize different reasons.
As an efficient approximation, we introduce DropDistillation (DD) that matches the output for a graph perturbed by edge deletions.
arXiv Detail & Related papers (2023-10-02T07:37:28Z)
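Prediction churn itself is straightforward to measure: the fraction of samples on which two comparably accurate models disagree. A small helper, independent of the paper's ID and DD methods:

```python
# Churn: how often two models' predicted classes differ on the same inputs.
import numpy as np

def churn(preds_a: np.ndarray, preds_b: np.ndarray) -> float:
    """Fraction of samples whose predicted class differs between two models."""
    return float(np.mean(preds_a != preds_b))
```

Two models with identical accuracy can still have high churn, which is exactly the instability the paper targets.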
- Variational Classification [51.2541371924591]
We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders.
Treating inputs to the softmax layer as samples of a latent variable, our abstracted perspective reveals a potential inconsistency.
We induce a chosen latent distribution, instead of the implicit assumption found in a standard softmax layer.
arXiv Detail & Related papers (2023-05-17T17:47:19Z)
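A hedged sketch of the idea: treat the softmax layer's input as a latent variable, sample it with the reparameterisation trick, and add a KL term toward a chosen prior (a standard Gaussian here, as an assumption; the paper's exact objective and KL weight may differ):

```python
# ELBO-style latent-variable classifier; illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentClassifier(nn.Module):
    def __init__(self, d_in, d_latent, n_classes):
        super().__init__()
        self.mu = nn.Linear(d_in, d_latent)
        self.logvar = nn.Linear(d_in, d_latent)
        self.head = nn.Linear(d_latent, n_classes)

    def forward(self, h, y=None):
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterise
        logits = self.head(z)
        # KL(q(z|x) || N(0, I)): the "chosen latent distribution"
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1).mean()
        if y is None:
            return logits
        return F.cross_entropy(logits, y) + 1e-3 * kl  # KL weight is an assumption
```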
- Semi-Supervised Deep Regression with Uncertainty Consistency and Variational Model Ensembling via Bayesian Neural Networks [31.67508478764597]
We propose a novel approach to semi-supervised regression, namely Uncertainty-Consistent Variational Model Ensembling (UCVME).
Our consistency loss significantly improves uncertainty estimates and allows higher-quality pseudo-labels to be assigned greater importance under heteroscedastic regression.
Experiments show that our method outperforms state-of-the-art alternatives on different tasks and can be competitive with supervised methods that use full labels.
arXiv Detail & Related papers (2023-02-15T10:40:51Z)
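Two of the named ingredients are compact enough to sketch: the heteroscedastic regression loss that down-weights high-variance pseudo-labels, and a consistency penalty tying two models' uncertainty estimates together. This is a simplification of UCVME, not its full objective:

```python
# Simplified UCVME ingredients: heteroscedastic NLL + uncertainty consistency.
import torch

def hetero_nll(mu, log_var, y):
    # NLL of y under N(mu, exp(log_var)); high-variance (noisy) targets
    # automatically receive less weight in the loss.
    return (0.5 * torch.exp(-log_var) * (y - mu) ** 2 + 0.5 * log_var).mean()

def uncertainty_consistency(log_var_a, log_var_b):
    # Encourage the two ensemble members to agree on predicted uncertainty.
    return ((log_var_a - log_var_b) ** 2).mean()
```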
- Variability Matters : Evaluating inter-rater variability in histopathology for robust cell detection [3.2873782624127843]
We present a large-scale study on the variability of cell annotations among 120 board-certified pathologists.
We show that increasing the data size at the expense of inter-rater variability does not necessarily lead to better-performing models in cell detection.
These findings suggest that the evaluation of the annotators may help tackle the fundamental budget issues in the histopathology domain.
arXiv Detail & Related papers (2022-10-11T06:24:55Z)
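Inter-rater variability of the kind studied here is commonly quantified with Cohen's kappa between annotator pairs; the toy labels below are illustrative, not the paper's data:

```python
# Cohen's kappa: chance-corrected agreement between two annotators.
from sklearn.metrics import cohen_kappa_score

rater_a = [1, 0, 1, 1, 0, 1, 0, 0]  # stand-ins for two pathologists'
rater_b = [1, 0, 1, 0, 0, 1, 1, 0]  # cell-level annotations
print(cohen_kappa_score(rater_a, rater_b))  # 0.5: moderate agreement
```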
- Nuisances via Negativa: Adjusting for Spurious Correlations via Data Augmentation [32.66196135141696]
Features with varying relationships to the label are nuisances.
Models that exploit nuisance-label relationships face performance degradation when these relationships change.
We develop an approach that uses knowledge about the semantic features by corrupting them in the data.
arXiv Detail & Related papers (2022-10-04T01:40:31Z)
- On the Strong Correlation Between Model Invariance and Generalization [54.812786542023325]
Generalization captures a model's ability to classify unseen data.
Invariance measures consistency of model predictions on transformations of the data.
From a dataset-centric view, we find a certain model's accuracy and invariance linearly correlated on different test sets.
arXiv Detail & Related papers (2022-07-14T17:08:25Z)
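Invariance in this sense can be estimated by checking how often a model's predicted class survives a transformation of the input; the paper defines its own metrics, so this is only an assumed, minimal variant:

```python
# Fraction of samples whose predicted class is unchanged by a transform.
import torch

def prediction_invariance(model, x, transform):
    with torch.no_grad():
        before = model(x).argmax(dim=1)
        after = model(transform(x)).argmax(dim=1)
    return (before == after).float().mean().item()

# e.g. horizontal flip of NCHW images as the transformation:
# inv = prediction_invariance(model, images, lambda t: torch.flip(t, dims=[3]))
```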
- X-model: Improving Data Efficiency in Deep Learning with A Minimax Model [78.55482897452417]
We aim at improving data efficiency for both classification and regression setups in deep learning.
To harness the power of both worlds, we propose a novel X-model.
X-model plays a minimax game between the feature extractor and task-specific heads.
arXiv Detail & Related papers (2021-10-09T13:56:48Z)
- Label Distribution Amendment with Emotional Semantic Correlations for Facial Expression Recognition [69.18918567657757]
We propose a new method that amends the label distribution of each facial image by leveraging correlations among expressions in the semantic space.
By comparing semantic and task class-relation graphs of each image, the confidence of its label distribution is evaluated.
Experimental results demonstrate that the proposed method is more effective than competing state-of-the-art methods.
arXiv Detail & Related papers (2021-07-23T07:46:14Z)
- Learning Disentangled Representations with Latent Variation Predictability [102.4163768995288]
This paper defines the variation predictability of latent disentangled representations.
Within an adversarial generation process, we encourage variation predictability by maximizing the mutual information between latent variations and corresponding image pairs.
We develop an evaluation metric that does not rely on the ground-truth generative factors to measure the disentanglement of latent representations.
arXiv Detail & Related papers (2020-07-25T08:54:26Z)
- Semi-supervised Medical Image Classification with Relation-driven Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification.
It exploits the unlabeled data by encouraging the prediction consistency of given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
arXiv Detail & Related papers (2020-05-15T06:57:54Z)
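The prediction-consistency core of such self-ensembling methods fits in a few lines: penalise disagreement between predictions on two independently perturbed views of the same unlabeled image. The paper's relation-driven term builds on top of this; `augment` is an assumed stochastic transform:

```python
# Basic consistency loss for semi-supervised learning; in practice one
# branch often comes detached from an EMA "teacher" copy of the model.
import torch
import torch.nn.functional as F

def consistency_loss(model, x_unlabeled, augment):
    p1 = F.softmax(model(augment(x_unlabeled)), dim=1)  # view 1
    p2 = F.softmax(model(augment(x_unlabeled)), dim=1)  # view 2
    return F.mse_loss(p1, p2)  # mean-squared prediction disagreement
```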
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.