Project-Probe-Aggregate: Efficient Fine-Tuning for Group Robustness
- URL: http://arxiv.org/abs/2503.09487v2
- Date: Thu, 20 Mar 2025 14:58:40 GMT
- Title: Project-Probe-Aggregate: Efficient Fine-Tuning for Group Robustness
- Authors: Beier Zhu, Jiequan Cui, Hanwang Zhang, Chi Zhang
- Abstract summary: We propose a three-step approach for parameter-efficient fine-tuning of image-text foundation models. Our method improves the two key components of failure-based debiasing: minority sample identification and the robust training algorithm. Our theoretical analysis shows that PPA enhances minority group identification and is Bayes optimal for minimizing the balanced group error.
- Score: 53.96714099151378
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While image-text foundation models have succeeded across diverse downstream tasks, they still face challenges in the presence of spurious correlations between the input and label. To address this issue, we propose a simple three-step approach, Project-Probe-Aggregate (PPA), that enables parameter-efficient fine-tuning for foundation models without relying on group annotations. Building upon the failure-based debiasing scheme, our method, PPA, improves its two key components: minority sample identification and the robust training algorithm. Specifically, we first train biased classifiers by projecting image features onto the nullspace of class proxies from text encoders. Next, we infer group labels using the biased classifier and probe group targets with prior correction. Finally, we aggregate the group weights of each class to produce the debiased classifier. Our theoretical analysis shows that PPA enhances minority group identification and is Bayes optimal for minimizing the balanced group error, mitigating spurious correlations. Extensive experimental results confirm the effectiveness of PPA: it outperforms the state of the art in average worst-group accuracy while tuning less than 0.01% of the parameters and requiring no group labels during training.
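A minimal numpy/scikit-learn sketch of the three steps as described in the abstract. The shapes, helper names, the (class, biased-probe correctness) grouping convention, and the exact form of the prior correction are illustrative assumptions, not the authors' implementation:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def ppa_sketch(img_feats, labels, class_proxies):
    """img_feats: (N, d) image features; labels: (N,) class ids in [0, C);
    class_proxies: (C, d) class embeddings from the text encoder."""
    C, d = class_proxies.shape

    # 1) Project: remove the class-proxy directions, so a probe trained on the
    #    projected features can only exploit class-irrelevant (spurious) cues.
    P = np.eye(d) - np.linalg.pinv(class_proxies) @ class_proxies
    biased = LogisticRegression(max_iter=1000).fit(img_feats @ P, labels)
    correct = biased.predict(img_feats @ P) == labels  # wrong => likely minority

    # 2) Probe: form pseudo-groups (class, biased-probe correctness) and train
    #    a group probe; subtracting log group priors ("prior correction") keeps
    #    rare minority groups from being drowned out by majority ones.
    groups = 2 * labels + correct.astype(int)          # 2C pseudo-groups
    probe = LogisticRegression(max_iter=1000).fit(img_feats, groups)
    log_prior = np.log(np.bincount(groups, minlength=2 * C) / len(groups) + 1e-12)

    # 3) Aggregate: average the two group weight vectors of each class to get
    #    the final debiased per-class classifier (assumes every group occurs).
    W = probe.coef_                                    # (2C, d)
    W_debiased = np.stack([W[2 * c: 2 * c + 2].mean(axis=0) for c in range(C)])
    return W_debiased, log_prior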
Related papers
- Trained Models Tell Us How to Make Them Robust to Spurious Correlation without Group Annotation [3.894771553698554]
Empirical Risk Minimization (ERM) models tend to rely on attributes that have high spurious correlation with the target.
This can degrade the performance on underrepresented (or 'minority') groups that lack these attributes.
We propose Environment-based Validation and Loss-based Sampling (EVaLS) to enhance robustness to spurious correlation.
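As context for the loss-based sampling step, here is a hedged Python sketch that builds a balanced retraining set from the highest- and lowest-loss examples of each class; the per-class grouping and the sample count k are assumptions, not the EVaLS specification:

```python
import numpy as np

def loss_based_sample(losses, labels, k):
    """Pick k lowest-loss (bias-aligned) and k highest-loss (likely minority)
    examples per class, given per-sample losses from a trained ERM model."""
    idx = []
    for c in np.unique(labels):
        cls = np.flatnonzero(labels == c)
        order = cls[np.argsort(losses[cls])]  # ascending loss within class c
        idx.extend(order[:k])                 # low-loss, majority-like samples
        idx.extend(order[-k:])                # high-loss, minority-like samples
    return np.asarray(idx)
```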
arXiv Detail & Related papers (2024-10-07T08:17:44Z)
- The Group Robustness is in the Details: Revisiting Finetuning under Spurious Correlations [8.844894807922902]
Modern machine learning models are prone to over-reliance on spurious correlations.
In this paper, we identify surprising and nuanced behavior of finetuned models on worst-group accuracy.
Our results show more nuanced interactions of modern finetuned models with group robustness than was previously known.
arXiv Detail & Related papers (2024-07-19T00:34:03Z)
- Appeal: Allow Mislabeled Samples the Chance to be Rectified in Partial Label Learning [55.4510979153023]
In partial label learning (PLL), each instance is associated with a set of candidate labels among which only one is ground-truth.
To help these mislabeled samples "appeal," we propose the first appeal-based framework.
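The appeal mechanism itself is not detailed in this summary; as background, a common PLL baseline trains against the best-scoring candidate label, sketched below (the function name and masking convention are assumptions, not the Appeal framework):

```python
import torch
import torch.nn.functional as F

def min_candidate_loss(logits, candidate_mask):
    """logits: (N, C); candidate_mask: (N, C) bool, True for candidate labels.
    Returns the mean, over instances, of the smallest candidate NLL."""
    log_probs = F.log_softmax(logits, dim=1)
    # Non-candidates get infinite loss, so the min picks the best candidate.
    nll = (-log_probs).masked_fill(~candidate_mask, float("inf"))
    return nll.min(dim=1).values.mean()
```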
arXiv Detail & Related papers (2023-12-18T09:09:52Z)
- Bias Amplification Enhances Minority Group Performance [10.380812738348899]
We propose BAM, a novel two-stage training algorithm.
In the first stage, the model is trained using a bias amplification scheme that introduces a learnable auxiliary variable for each training sample.
In the second stage, we upweight the samples that the bias-amplified model misclassifies, and then continue training the same model on the reweighted dataset.
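A PyTorch sketch of the stage-1 bias-amplification idea: each training example receives a learnable auxiliary logit added to the model output, so easy (bias-aligned) samples can be fit through the auxiliary variable and the network is pushed toward shortcut features. Module and argument names are assumptions:

```python
import torch

class AuxiliaryLogits(torch.nn.Module):
    """One learnable logit offset per (sample, class) pair."""
    def __init__(self, num_samples, num_classes):
        super().__init__()
        self.b = torch.nn.Parameter(torch.zeros(num_samples, num_classes))

    def forward(self, logits, idx):
        # idx: LongTensor of dataset indices for the current batch.
        # The per-sample offset absorbs easy examples, amplifying the
        # model's reliance on biased features during stage 1.
        return logits + self.b[idx]
```

Stage 2 then discards the auxiliary variables, upweights the samples the bias-amplified model misclassifies, and continues training on the reweighted data.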
arXiv Detail & Related papers (2023-09-13T04:40:08Z)
- Modeling the Q-Diversity in a Min-max Play Game for Robust Optimization [61.39201891894024]
Group distributionally robust optimization (group DRO) can minimize the worst-case loss over pre-defined groups.
We reformulate the group DRO framework by proposing Q-Diversity.
Characterized by an interactive training mode, Q-Diversity relaxes the group identification from annotation into direct parameterization.
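Q-Diversity's parameterized group assignment is not shown here, but the group DRO objective it builds on is standard: minimize the worst average loss over groups. A PyTorch sketch:

```python
import torch
import torch.nn.functional as F

def worst_group_loss(logits, labels, group_ids, num_groups):
    """Return the maximum per-group average cross-entropy in the batch."""
    per_sample = F.cross_entropy(logits, labels, reduction="none")
    group_losses = []
    for g in range(num_groups):
        mask = group_ids == g
        if mask.any():                      # skip groups absent from the batch
            group_losses.append(per_sample[mask].mean())
    return torch.stack(group_losses).max()  # optimize the worst-off group
```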
arXiv Detail & Related papers (2023-05-20T07:02:27Z)
- Improved Group Robustness via Classifier Retraining on Independent Splits [6.930560177764658]
Group distributionally robust optimization is a widely used baseline for learning models with strong worst-group performance.
This paper designs a simple method based on the idea of retraining on independent splits of the training data.
We find that using a novel sample-splitting procedure achieves robust worst-group performance in the fine-tuning step.
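A hedged sketch of the sample-splitting idea: fit the feature extractor on one half of the data, then retrain only the final classifier on the other, independent half. The 50/50 split and the scikit-learn classifier stand-in are assumptions, not the paper's procedure:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def split_and_retrain(features, labels, seed=0):
    """Retrain the last-layer classifier on a split independent of the one
    used to fit the feature extractor (assumed to produce `features`)."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(labels))
    stage1_idx, stage2_idx = perm[: len(perm) // 2], perm[len(perm) // 2:]
    # stage1_idx would train the feature extractor; the classifier below sees
    # only the held-out half, so the same hard examples are not reused twice.
    clf = LogisticRegression(max_iter=1000)
    clf.fit(features[stage2_idx], labels[stage2_idx])
    return clf
```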
arXiv Detail & Related papers (2022-04-20T16:22:27Z)
- Towards Group Robustness in the presence of Partial Group Labels [61.33713547766866]
Spurious correlations between input samples and target labels can wrongly direct neural network predictions.
We propose an algorithm that optimizes for the worst-off group assignments from a constraint set.
We show improvements in minority-group performance while preserving overall aggregate accuracy across groups.
arXiv Detail & Related papers (2022-01-10T22:04:48Z)
- Just Train Twice: Improving Group Robustness without Training Group Information [101.84574184298006]
Standard training via empirical risk minimization can produce models that achieve high accuracy on average but low accuracy on certain groups.
Prior approaches that achieve high worst-group accuracy, like group distributionally robust optimization (group DRO), require expensive group annotations for each training point.
We propose a simple two-stage approach, JTT, that first trains a standard ERM model for several epochs, and then trains a second model that upweights the training examples that the first model misclassified.
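A minimal PyTorch sketch of the stage-2 upweighting: examples the stage-1 ERM model misclassified get a large constant weight (the value 20 and the helper name are illustrative, not the paper's tuned setting):

```python
import torch

def jtt_sample_weights(stage1_preds, labels, lambda_up=20.0):
    """Upweight the error set of the stage-1 ERM model for stage-2 training."""
    err = stage1_preds != labels          # examples the first model got wrong
    weights = torch.ones(len(labels))
    weights[err] = lambda_up              # upweight likely-minority examples
    return weights

# Stage 2 trains a fresh model with, e.g., a WeightedRandomSampler built from
# these weights, so misclassified (often minority-group) examples recur often.
```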
arXiv Detail & Related papers (2021-07-19T17:52:32Z)
- Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when there is only bias in the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
arXiv Detail & Related papers (2021-06-14T05:39:09Z)