Group Distributionally Robust Knowledge Distillation
- URL: http://arxiv.org/abs/2311.00476v1
- Date: Wed, 1 Nov 2023 12:25:02 GMT
- Title: Group Distributionally Robust Knowledge Distillation
- Authors: Konstantinos Vilouras, Xiao Liu, Pedro Sanchez, Alison Q. O'Neil,
Sotirios A. Tsaftaris
- Abstract summary: Sub-population shifts are a common scenario in medical imaging analysis.
In this paper, we propose a group-aware distillation loss.
We empirically validate our method, GroupDistil, on two benchmark datasets.
- Score: 16.058258572636376
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Knowledge distillation enables fast and effective transfer of features
learned from a bigger model to a smaller one. However, distillation objectives
are susceptible to sub-population shifts, a common scenario in medical imaging
analysis which refers to groups/domains of data that are underrepresented in
the training set. For instance, training models on health data acquired from
multiple scanners or hospitals can yield subpar performance for minority
groups. In this paper, inspired by distributionally robust optimization (DRO)
techniques, we address this shortcoming by proposing a group-aware distillation
loss. During optimization, a set of weights is updated based on the per-group
losses at a given iteration. This way, our method can dynamically focus on
groups that have low performance during training. We empirically validate our
method, GroupDistil, on two benchmark datasets (natural images and cardiac MRIs)
and show consistent improvement in terms of worst-group accuracy.
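As a rough illustration of the group-aware objective described in the abstract, here is a minimal PyTorch-style sketch: per-group distillation losses are computed at each iteration and a weight vector over groups is updated multiplicatively (an exponentiated-gradient step, as in standard group DRO). All names and the exact update rule are assumptions for illustration, not the paper's reference implementation.

```python
import torch
import torch.nn.functional as F

def group_aware_distillation_loss(student_logits, teacher_logits, group_ids,
                                  group_weights, num_groups,
                                  temperature=2.0, eta=0.1):
    """One step of a group-aware distillation objective (sketch).

    group_weights is a plain (non-learnable) tensor on the same device,
    carried across iterations and initialized uniformly,
    e.g. torch.ones(num_groups) / num_groups.
    """
    # Temperature-scaled soft-target distillation loss per sample.
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    kd = F.kl_div(log_p_student, p_teacher, reduction="none").sum(-1)
    kd = kd * temperature ** 2

    # Average the per-sample losses within each group present in the batch.
    group_losses = torch.zeros(num_groups, device=student_logits.device)
    for g in range(num_groups):
        mask = group_ids == g
        if mask.any():
            group_losses[g] = kd[mask].mean()

    # Exponentiated-gradient update: groups with higher loss get more weight,
    # so training dynamically focuses on poorly performing groups.
    with torch.no_grad():
        group_weights *= torch.exp(eta * group_losses)
        group_weights /= group_weights.sum()

    return (group_weights * group_losses).sum(), group_weights
```

In practice a supervised cross-entropy term on the hard labels would typically be added to the returned loss, and the updated group_weights would be passed back in at the next iteration.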
Related papers
- Bias Amplification Enhances Minority Group Performance [10.380812738348899]
We propose BAM, a novel two-stage training algorithm.
In the first stage, the model is trained using a bias amplification scheme by introducing a learnable auxiliary variable for each training sample.
In the second stage, we upweight the samples that the bias-amplified model misclassifies, and then continue training the same model on the reweighted dataset.
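As a loose sketch of this two-stage recipe (with several assumed details: the auxiliary variable is treated here as a learnable per-sample logit offset, and misclassified samples receive a fixed upweighting factor; names are illustrative rather than BAM's actual interface):

```python
import torch
import torch.nn.functional as F

def stage_one_loss(model, aux_logits, x, y, idx, lam=0.5):
    # Stage 1 (bias amplification): add a learnable auxiliary logit per training
    # sample (aux_logits is an (N, C) nn.Parameter indexed by sample id), so that
    # easy / bias-aligned examples can be fit by the auxiliary term.
    logits = model(x) + lam * aux_logits[idx]
    return F.cross_entropy(logits, y)

def stage_two_sample_weights(model, loader, device, upweight=5.0):
    # Stage 2: upweight the samples that the bias-amplified model misclassifies,
    # then continue training the same model on the reweighted dataset.
    weights = []
    model.eval()
    with torch.no_grad():
        for x, y, _ in loader:
            preds = model(x.to(device)).argmax(dim=-1).cpu()
            w = torch.ones_like(y, dtype=torch.float)
            w[preds != y] = upweight
            weights.append(w)
    return torch.cat(weights)
```

The stage-2 weights could then scale the per-sample cross-entropy or drive a weighted sampler while training continues.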
arXiv Detail & Related papers (2023-09-13T04:40:08Z)
- Modeling the Q-Diversity in a Min-max Play Game for Robust Optimization [61.39201891894024]
Group distributionally robust optimization (group DRO) can minimize the worst-case loss over pre-defined groups.
We reformulate the group DRO framework by proposing Q-Diversity.
Characterized by an interactive training mode, Q-Diversity relaxes group identification from manual annotation to direct parameterization.
arXiv Detail & Related papers (2023-05-20T07:02:27Z)
- Ranking & Reweighting Improves Group Distributional Robustness [14.021069321266516]
We propose a ranking-based training method called Discounted Rank Upweighting (DRU) to learn models that exhibit strong OOD performance on the test data.
Results on several synthetic and real-world datasets highlight the superior ability of our group-ranking-based (akin to soft-minimax) approach in selecting and learning models that are robust to group distributional shifts.
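Read as a soft-minimax scheme, one plausible form of such discounted rank upweighting is sketched below: per-group losses are sorted in descending order and the rank-r loss is weighted by a normalized geometric discount. The exact discounting used by DRU may differ; this is only an illustration.

```python
import torch

def discounted_rank_weighted_loss(group_losses, gamma=0.5):
    # Sort per-group losses from worst to best and weight the rank-r loss by
    # gamma**r (normalized), so the worst groups dominate without a hard max.
    sorted_losses, _ = torch.sort(group_losses, descending=True)
    ranks = torch.arange(len(group_losses), dtype=sorted_losses.dtype,
                         device=sorted_losses.device)
    weights = gamma ** ranks
    weights = weights / weights.sum()
    return (weights * sorted_losses).sum()
```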
arXiv Detail & Related papers (2023-05-09T20:37:16Z)
- Bitrate-Constrained DRO: Beyond Worst Case Robustness To Unknown Group Shifts [122.08782633878788]
Some robust training algorithms (e.g., Group DRO) specialize to group shifts and require group information on all training points.
Other methods (e.g., CVaR DRO) that do not need group annotations can be overly conservative.
We learn a model that maintains high accuracy on simple group functions realized by low-complexity features.
arXiv Detail & Related papers (2023-02-06T17:07:16Z)
- Outlier-Robust Group Inference via Gradient Space Clustering [50.87474101594732]
Existing methods can improve the worst-group performance, but they require group annotations, which are often expensive and sometimes infeasible to obtain.
We address the problem of learning group annotations in the presence of outliers by clustering the data in the space of gradients of the model parameters.
We show that data in the gradient space has a simpler structure while preserving information about minority groups and outliers, making it suitable for standard clustering methods like DBSCAN.
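A minimal sketch of this idea, under the assumption that per-sample gradients are taken only with respect to the final linear layer to keep the feature dimension manageable (function and parameter names are illustrative):

```python
import numpy as np
import torch
import torch.nn.functional as F
from sklearn.cluster import DBSCAN

def infer_groups_via_gradient_clustering(model, last_layer, loader, device,
                                         eps=0.5, min_samples=5):
    # Build one gradient feature per sample: the gradient of its loss with
    # respect to the last layer's weights, flattened into a vector.
    feats = []
    model.eval()
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        for xi, yi in zip(x, y):
            loss = F.cross_entropy(model(xi.unsqueeze(0)), yi.unsqueeze(0))
            grad = torch.autograd.grad(loss, last_layer.weight)[0]
            feats.append(grad.flatten().detach().cpu().numpy())
    feats = np.stack(feats)
    # DBSCAN on the gradient features: cluster ids serve as inferred group
    # labels, and label -1 marks points treated as outliers.
    return DBSCAN(eps=eps, min_samples=min_samples).fit_predict(feats)
```

The inferred labels could then be plugged into a group-aware objective such as the distillation loss sketched earlier.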
arXiv Detail & Related papers (2022-10-13T06:04:43Z)
- Take One Gram of Neural Features, Get Enhanced Group Robustness [23.541213868620837]
Predictive performance of machine learning models trained with empirical risk minimization can degrade considerably under distribution shifts.
We propose to partition the training dataset into groups based on Gram matrices of features extracted by an "identification" model.
Our approach not only improves group robustness over ERM but also outperforms all recent baselines.
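A rough sketch of grouping by second-order feature statistics, assuming the features come from one layer of a separately trained ERM "identification" model and that the Gram features are clustered with k-means (the paper's exact clustering procedure may differ):

```python
import torch
from sklearn.cluster import KMeans

def gram_features(feature_maps):
    # Per-sample Gram matrix of a conv feature map (B, C, H, W) -> (B, C*C):
    # channel-by-channel correlations that capture style-like attributes.
    b, c, h, w = feature_maps.shape
    f = feature_maps.reshape(b, c, h * w)
    gram = torch.bmm(f, f.transpose(1, 2)) / (h * w)
    return gram.reshape(b, -1)

def assign_pseudo_groups(id_model_feature_maps, n_groups=2):
    # Cluster the Gram features into pseudo-groups; these labels can then be
    # used by any group-robust training objective in place of annotations.
    grams = gram_features(id_model_feature_maps).detach().cpu().numpy()
    return KMeans(n_clusters=n_groups, n_init=10).fit_predict(grams)
```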
arXiv Detail & Related papers (2022-08-26T12:34:55Z)
- Density-Aware Personalized Training for Risk Prediction in Imbalanced Medical Data [89.79617468457393]
Training models on data with a high imbalance rate (class density discrepancy) may lead to suboptimal predictions.
We propose a framework for training models that addresses this imbalance issue.
We demonstrate our model's improved performance on real-world medical datasets.
arXiv Detail & Related papers (2022-07-23T00:39:53Z)
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
- Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when there is only bias in the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
arXiv Detail & Related papers (2021-06-14T05:39:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.