Artifact-Based Domain Generalization of Skin Lesion Models
- URL: http://arxiv.org/abs/2208.09756v1
- Date: Sat, 20 Aug 2022 22:25:09 GMT
- Title: Artifact-Based Domain Generalization of Skin Lesion Models
- Authors: Alceu Bissoto, Catarina Barata, Eduardo Valle, Sandra Avila
- Abstract summary: We propose a pipeline that relies on artifact annotations to enable generalization evaluation and debiasing.
We create environments based on skin lesion artifacts to enable domain generalization methods.
Our results raise a concern that debiasing models towards a single aspect may not be enough for fair skin lesion analysis.
- Score: 20.792979998188848
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Learning failure cases are abundant, particularly in the medical area.
Recent studies in out-of-distribution generalization have advanced considerably
on well-controlled synthetic datasets, but they do not represent medical
imaging contexts. We propose a pipeline that relies on artifact annotations to
enable generalization evaluation and debiasing for the challenging skin lesion
analysis context. First, we partition the data into training and test sets with
increasing levels of bias for better generalization assessment.
Then, we create environments based on skin lesion artifacts to enable domain
generalization methods. Finally, after robust training, we perform a test-time
debiasing procedure, reducing spurious features in inference images. Our
experiments show our pipeline improves performance metrics in biased cases, and
avoids artifacts when using explanation methods. Still, when evaluated on
out-of-distribution data, such models did not favor clinically meaningful
features. Instead, performance improved only on test sets presenting artifacts
similar to those seen in training, suggesting the models learned to ignore the known set of
artifacts. Our results raise a concern that debiasing models towards a single
aspect may not be enough for fair skin lesion analysis.
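As an illustration of the first pipeline step, here is a minimal sketch of building train/test splits whose artifact-label correlation grows with a bias level, from a table of binary artifact annotations. The schema (columns such as "ruler" and "malignant") and the sampling scheme are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch (assumed schema): build splits whose artifact-label
# correlation increases with `bias`. `meta` is a DataFrame with one row
# per image, binary artifact columns (e.g., "ruler"), and a binary
# "malignant" label.
import pandas as pd

def biased_split(meta, artifact, label="malignant", bias=0.8, n=1000, seed=0):
    """Sample n images so a `bias` fraction of positives shows the artifact."""
    pos, neg = meta[meta[label] == 1], meta[meta[label] == 0]
    half = n // 2
    k = int(bias * half)
    take = lambda part, m: part.sample(min(m, len(part)), random_state=seed)
    parts = [
        take(pos[pos[artifact] == 1], k),         # positives with artifact
        take(pos[pos[artifact] == 0], half - k),  # positives without
        take(neg[neg[artifact] == 1], half - k),  # negatives mirror it
        take(neg[neg[artifact] == 0], k),
    ]
    return pd.concat(parts).sample(frac=1, random_state=seed)

# Increasing `bias` for training while reversing it for the test set
# (e.g., bias=0.9 vs. bias=0.1) makes shortcut learning measurable.
```

Environments for domain-generalization methods (e.g., IRM-style training) could then be formed by grouping samples on artifact presence, in the spirit of the second pipeline step.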
Related papers
- Do Histopathological Foundation Models Eliminate Batch Effects? A Comparative Study [1.5142296396121897]
We show that the feature embeddings of the foundation models still contain distinct hospital signatures that can lead to biased predictions and misclassifications.
Our work provides a novel perspective on the evaluation of medical foundation models, paving the way for more robust pretraining strategies and downstream predictors.
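The kind of probe that exposes such batch effects is simple to sketch: if a linear classifier can predict the source hospital from frozen embeddings, a hospital signature remains. The interface below is an assumption, not the paper's evaluation code.

```python
# Sketch: above-chance hospital prediction from frozen foundation-model
# embeddings indicates a residual batch effect.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def hospital_signature_score(embeddings, hospital_ids):
    """Mean cross-validated accuracy of a linear site probe."""
    probe = LogisticRegression(max_iter=1000)
    return cross_val_score(probe, embeddings, hospital_ids, cv=5).mean()
```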
arXiv Detail & Related papers (2024-11-08T11:39:03Z)
- Data Attribution for Diffusion Models: Timestep-induced Bias in Influence Estimation [53.27596811146316]
Diffusion models operate over a sequence of timesteps, rather than the instantaneous input-output relationships of earlier settings.
We present Diffusion-TracIn that incorporates this temporal dynamics and observe that samples' loss gradient norms are highly dependent on timestep.
We introduce Diffusion-ReTrac as a re-normalized adaptation that enables the retrieval of training samples more targeted to the test sample of interest.
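The timestep dependence is easy to picture in code. Below is a heavily simplified, single-checkpoint sketch of a TracIn-style influence score with a ReTrac-like gradient-norm renormalization; the `loss_fn(model, x, t)` interface is an assumption.

```python
# Toy sketch: influence(train, test) = <g_train, g_test>. The raw dot
# product is dominated by timesteps with large gradient norms; dividing
# out ||g_train|| is a simplified, ReTrac-like renormalization.
import torch

def flat_grad(loss, model):
    grads = torch.autograd.grad(loss, list(model.parameters()))
    return torch.cat([g.reshape(-1) for g in grads])

def influence(model, loss_fn, x_train, t_train, x_test, t_test, renorm=True):
    g_tr = flat_grad(loss_fn(model, x_train, t_train), model)
    g_te = flat_grad(loss_fn(model, x_test, t_test), model)
    score = torch.dot(g_tr, g_te)
    if renorm:                        # re-normalized (ReTrac-like) variant
        score = score / (g_tr.norm() + 1e-12)
    return score.item()
```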
arXiv Detail & Related papers (2024-01-17T07:58:18Z)
- Test-Time Selection for Robust Skin Lesion Analysis [20.792979998188848]
Skin lesion analysis models are biased by artifacts placed during image acquisition.
We propose TTS (Test-Time Selection), a human-in-the-loop method that leverages positive (e.g., lesion area) and negative (e.g., artifacts) keypoints in test samples.
Our solution is robust to a varying availability of annotations, and different levels of bias.
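A minimal sketch of the general idea follows, with positive and negative keypoints turned into a spatial weighting over CNN features before pooling; the Gaussian weighting, `sigma`, and normalized keypoint coordinates are assumptions, not the exact TTS mechanism.

```python
# Sketch: upweight feature-map locations near positive (lesion) keypoints
# and downweight locations near negative (artifact) keypoints at test time.
import torch

def keypoint_mask(h, w, keypoints, sigma=0.1):
    """Sum of Gaussians centered at (x, y) keypoints given in [0, 1]^2."""
    ys, xs = torch.meshgrid(torch.linspace(0, 1, h),
                            torch.linspace(0, 1, w), indexing="ij")
    mask = torch.zeros(h, w)
    for x, y in keypoints:
        mask += torch.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma ** 2))
    return mask

def weighted_pool(features, pos_kpts, neg_kpts):
    """features: (C, H, W) feature map -> (C,) pooled descriptor."""
    _, h, w = features.shape
    weight = (1.0 + keypoint_mask(h, w, pos_kpts)
                  - keypoint_mask(h, w, neg_kpts)).clamp(min=0)
    return (features * weight).sum(dim=(1, 2)) / (weight.sum() + 1e-8)
```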
arXiv Detail & Related papers (2023-08-10T14:08:50Z)
- Realistic Data Enrichment for Robust Image Segmentation in Histopathology [2.248423960136122]
We propose a new approach, based on diffusion models, which can enrich an imbalanced dataset with plausible examples from underrepresented groups.
Our method can straightforwardly expand limited clinical datasets, making them suitable for training machine learning pipelines.
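The enrichment loop itself is simple; a minimal sketch follows, where `generate(label, n)` stands in for a class-conditional diffusion sampler and is a hypothetical interface.

```python
# Sketch: top up each class to the majority-class count with synthetic
# images from a generative model.
from collections import Counter

def enrich(images, labels, generate):
    counts = Counter(labels)
    target = max(counts.values())
    for label, count in counts.items():
        missing = target - count
        if missing == 0:
            continue
        images.extend(generate(label, missing))   # plausible synthetic samples
        labels.extend([label] * missing)
    return images, labels
```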
arXiv Detail & Related papers (2023-04-19T09:52:50Z)
- Feature-Level Debiased Natural Language Understanding [86.8751772146264]
Existing natural language understanding (NLU) models often rely on dataset biases to achieve high performance on specific datasets.
We propose debiasing contrastive learning (DCT) to mitigate biased latent features while accounting for the dynamic nature of bias.
DCT outperforms state-of-the-art baselines on out-of-distribution datasets while maintaining in-distribution performance.
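A generic recipe in the spirit of DCT, not its exact objective: a supervised-contrastive loss whose positive pairs share the class label but disagree on a bias attribute, so the pulled-together signal is content rather than bias.

```python
# Sketch of a debiased contrastive step (bias attribute assumed known
# or estimated): positives = same class, different bias group.
import torch
import torch.nn.functional as F

def debiased_contrastive_loss(z, labels, bias, temperature=0.1):
    """z: (N, D) embeddings; labels, bias: (N,) integer tensors."""
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / temperature
    eye = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(eye, -1e9)                  # exclude self-pairs
    same_class = labels[:, None] == labels[None, :]
    diff_bias = bias[:, None] != bias[None, :]
    pos = (same_class & diff_bias & ~eye).float()
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    return -(pos * log_prob).sum(1).div(pos.sum(1).clamp(min=1)).mean()
```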
arXiv Detail & Related papers (2022-12-11T06:16:14Z)
- DeepTechnome: Mitigating Unknown Bias in Deep Learning Based Assessment of CT Images [44.62475518267084]
We debias deep learning models against unknown biases during training.
We use control regions as surrogates that carry information regarding the bias.
When applied to data exhibiting a strong bias, the proposed method near-perfectly recovers the classification performance observed when training on corresponding unbiased data.
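One common way to operationalize "control regions as bias surrogates" is an adversarial head with gradient reversal; the sketch below is a generic recipe under that assumption, not necessarily DeepTechnome's exact mechanism.

```python
# Sketch: class signal found in a control region (a crop with no anatomy)
# must be bias, so an adversarial head with gradient reversal pushes the
# shared encoder to discard it.
import torch
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; flips the gradient sign backward."""
    @staticmethod
    def forward(ctx, x):
        return x
    @staticmethod
    def backward(ctx, grad):
        return -grad

def debias_loss(encoder, cls_head, bias_head, image, control, y):
    task = F.cross_entropy(cls_head(encoder(image)), y)
    z = GradReverse.apply(encoder(control))
    # bias_head can only predict y from the control region via bias cues;
    # reversed gradients make the encoder remove those cues.
    adv = F.cross_entropy(bias_head(z), y)
    return task + adv
```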
arXiv Detail & Related papers (2022-05-26T12:18:48Z)
- Pseudo Bias-Balanced Learning for Debiased Chest X-ray Classification [57.53567756716656]
We study the problem of developing debiased chest X-ray diagnosis models without knowing exactly the bias labels.
We propose a novel algorithm, pseudo bias-balanced learning, which first captures and predicts per-sample bias labels.
Our proposed method achieved consistent improvements over other state-of-the-art approaches.
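A sketch of the two-stage shape of such a method, with the cell-wise balancing scheme as an assumption: an intentionally biased model provides pseudo bias labels, and the main model is then trained with class-bias-balanced sample weights.

```python
# Sketch: (1) pseudo bias labels from a biased model's predictions;
# (2) inverse-frequency weights over (class, bias-group) cells.
import torch

def pseudo_bias_labels(biased_model, loader, device="cpu"):
    out = []
    biased_model.eval()
    with torch.no_grad():
        for x, _ in loader:
            out.append(biased_model(x.to(device)).argmax(1).cpu())
    return torch.cat(out)

def balanced_weights(y, pseudo_bias):
    """One weight per sample, balancing each (class, bias-group) cell."""
    w = torch.zeros(len(y))
    for c in y.unique():
        for b in pseudo_bias.unique():
            cell = (y == c) & (pseudo_bias == b)
            if cell.any():
                w[cell] = 1.0 / cell.sum()
    return w * len(y) / w.sum()                       # normalize to mean 1
```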
arXiv Detail & Related papers (2022-03-18T11:02:18Z)
- General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model, in the manner of gradient descent in functional space.
GGD learns a more robust base model in both settings: task-specific biased models with prior knowledge, and a self-ensemble biased model without prior knowledge.
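In the boosting-like reading of "gradient descent in functional space", one training step might look as follows (a simplification of GGD): frozen biased models contribute their logits first, the base model fits the residual through the combined logits, and inference uses the base model alone.

```python
# Sketch: base model trained through the sum of its own logits and the
# frozen biased models' logits; shortcuts stay in the biased components.
import torch
import torch.nn.functional as F

def ggd_step(base_model, biased_models, x, y, optimizer):
    with torch.no_grad():                              # biased logits are fixed
        biased_logits = sum(m(x) for m in biased_models)
    logits = base_model(x) + biased_logits             # functional-space sum
    loss = F.cross_entropy(logits, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
# Inference: predict with base_model(x) alone.
```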
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
- Transductive image segmentation: Self-training and effect of uncertainty estimation [16.609998086075127]
Semi-supervised learning (SSL) uses unlabeled data during training to learn better models.
This study focuses on the quality of predictions made on the unlabeled data of interest when they are included for optimization during training, rather than improving generalization.
Our experiments on a large MRI database for multi-class segmentation of traumatic brain lesions show promising results when comparing transductive with inductive predictions.
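A minimal sketch of one transductive self-training round, with the confidence threshold as an assumption: pseudo-label the unlabeled scans of interest, keep confident voxels, and retrain on both sets.

```python
# Sketch: segmentation self-training on the unlabeled data of interest.
import torch
import torch.nn.functional as F

def self_training_round(model, labeled, unlabeled, optimizer, thresh=0.9):
    model.eval()
    pseudo = []
    with torch.no_grad():
        for x in unlabeled:
            p = F.softmax(model(x), dim=1)
            conf, y_hat = p.max(dim=1)
            pseudo.append((x, y_hat, conf > thresh))   # keep confident voxels
    model.train()
    for (x, y), (xu, yu, mu) in zip(labeled, pseudo):
        loss = F.cross_entropy(model(x), y)
        loss_u = (F.cross_entropy(model(xu), yu, reduction="none")
                  * mu.float()).mean()
        (loss + loss_u).backward()
        optimizer.step()
        optimizer.zero_grad()
```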
arXiv Detail & Related papers (2021-07-19T15:26:07Z)
- Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the importance-guided stochastic gradient descent (IGSGD) method to train models to perform inference on inputs containing missing values, without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
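IGSGD's RL component is involved; the toy sketch below only conveys the shape of the idea under strong assumptions: raw values plus a missingness mask go in without imputation, and a REINFORCE-style policy weights each sample's gradient contribution using a validation reward.

```python
# Toy reading of the idea, not the paper's algorithm.
import torch
import torch.nn.functional as F

def masked_forward(model, x):
    """No imputation: zero placeholders plus an explicit missingness mask."""
    mask = torch.isnan(x)
    filled = torch.where(mask, torch.zeros_like(x), x)
    return model(torch.cat([filled, mask.float()], dim=1))

def igsgd_step(model, policy, x, y, x_val, y_val, opt, policy_opt):
    feats = torch.cat([torch.nan_to_num(x), torch.isnan(x).float()], dim=1)
    weights = torch.sigmoid(policy(feats)).squeeze(1)  # per-sample importance
    ce = F.cross_entropy(masked_forward(model, x), y, reduction="none")
    loss = (ce * weights.detach()).mean()              # weighted model update
    opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():                              # REINFORCE reward
        reward = -F.cross_entropy(masked_forward(model, x_val), y_val)
    policy_loss = -torch.log(weights + 1e-8).mean() * reward
    policy_opt.zero_grad(); policy_loss.backward(); policy_opt.step()
```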
arXiv Detail & Related papers (2021-07-05T12:44:39Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
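The claim is easy to reproduce in a toy overparameterized setting: every weight vector below fits the training set exactly, yet the test errors concentrate around a typical value well below the worst case. Dimensions and scales are arbitrary choices.

```python
# Toy experiment: sample many interpolating linear classifiers and look
# at the distribution of their test errors.
import numpy as np

rng = np.random.default_rng(0)
n, d, n_test = 40, 200, 2000
w_true = rng.normal(size=d) / np.sqrt(d)
X, Xt = rng.normal(size=(n, d)), rng.normal(size=(n_test, d))
y, yt = np.sign(X @ w_true), np.sign(Xt @ w_true)

# Min-norm interpolator plus random null-space directions: every w here
# satisfies X @ w == y exactly, hence interpolates the training labels.
w0 = X.T @ np.linalg.solve(X @ X.T, y)
_, _, Vt = np.linalg.svd(X, full_matrices=True)
null = Vt[n:]                                          # basis of null(X)
errors = []
for _ in range(500):
    w = w0 + null.T @ rng.normal(size=d - n) / np.sqrt(d)
    errors.append(np.mean(np.sign(Xt @ w) != yt))
print(f"typical test error: {np.median(errors):.3f}, worst: {max(errors):.3f}")
```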
arXiv Detail & Related papers (2020-06-22T21:12:31Z)