On the Robustness Tradeoff in Fine-Tuning
- URL: http://arxiv.org/abs/2503.14836v2
- Date: Mon, 14 Jul 2025 17:14:45 GMT
- Title: On the Robustness Tradeoff in Fine-Tuning
- Authors: Kunyang Li, Jean-Charles Noirot Ferrand, Ryan Sheatsley, Blaine Hoak, Yohan Beugin, Eric Pauley, Patrick McDaniel,
- Abstract summary: We evaluate the robustness and accuracy of fine-tuned models over 6 benchmark datasets.<n>We observe a consistent trade-off between adversarial robustness and accuracy.<n>These insights emphasize the need for robustness-aware fine-tuning to ensure reliable real-world deployments.
- Score: 5.95580243805785
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fine-tuning has become the standard practice for adapting pre-trained models to downstream tasks. However, the impact on model robustness is not well understood. In this work, we characterize the robustness-accuracy trade-off in fine-tuning. We evaluate the robustness and accuracy of fine-tuned models over 6 benchmark datasets and 7 different fine-tuning strategies. We observe a consistent trade-off between adversarial robustness and accuracy. Peripheral updates such as BitFit are more effective for simple tasks -- over 75% above the average measured by the area under the Pareto frontiers on CIFAR-10 and CIFAR-100. In contrast, fine-tuning information-heavy layers, such as attention layers via Compacter, achieves a better Pareto frontier on more complex tasks -- 57.5% and 34.6% above the average on Caltech-256 and CUB-200, respectively. Lastly, we observe that the robustness of fine-tuning against out-of-distribution data closely tracks accuracy. These insights emphasize the need for robustness-aware fine-tuning to ensure reliable real-world deployments.
Related papers
- Adversarial Robustness of Traffic Classification under Resource Constraints: Input Structure Matters [2.4779082385578337]
Traffic classification (TC) plays a critical role in cybersecurity, particularly in IoT and embedded contexts.<n>We use hardware-aware neural architecture search (HW-NAS) to derive lightweight TC models that are accurate, efficient, and deployable on edge platforms.
arXiv Detail & Related papers (2025-12-01T23:47:22Z) - Efficient Adversarial Malware Defense via Trust-Based Raw Override and Confidence-Adaptive Bit-Depth Reduction [0.0]
Recent advances in adversarial defenses have demonstrated strong robustness improvements.<n> computational overhead ranging from 4x to 22x presents significant challenges for production systems processing millions of samples daily.<n>We propose a novel framework that combines Trust-Adaptive TRO with Confidence-Adaptive Bit-Depth Reduction.<n>Our approach achieves 1.76x computational overhead - a 2.3x improvement over state-of-the-art smoothing defenses.
arXiv Detail & Related papers (2025-11-16T23:21:44Z) - Robust Fine-tuning of Zero-shot Models via Variance Reduction [56.360865951192324]
When fine-tuning zero-shot models, our desideratum is for the fine-tuned model to excel in both in-distribution (ID) and out-of-distribution (OOD)
We propose a sample-wise ensembling technique that can simultaneously attain the best ID and OOD accuracy without the trade-offs.
arXiv Detail & Related papers (2024-11-11T13:13:39Z) - Accurate and Reliable Predictions with Mutual-Transport Ensemble [46.368395985214875]
We propose a co-trained auxiliary model and adaptively regularizes the cross-entropy loss using Kullback-Leibler (KL)
We show that MTE can simultaneously enhance both accuracy and uncertainty calibration.
For example, on the CIFAR-100 dataset, our MTE method on ResNet34/50 achieved significant improvements compared to previous state-of-the-art method.
arXiv Detail & Related papers (2024-05-30T03:15:59Z) - MixedNUTS: Training-Free Accuracy-Robustness Balance via Nonlinearly Mixed Classifiers [41.56951365163419]
"MixedNUTS" is a training-free method where the output logits of a robust classifier are processed by nonlinear transformations with only three parameters.
MixedNUTS then converts the transformed logits into probabilities and mixes them as the overall output.
On CIFAR-10, CIFAR-100, and ImageNet datasets, experimental results with custom strong adaptive attacks demonstrate MixedNUTS's vastly improved accuracy and near-SOTA robustness.
arXiv Detail & Related papers (2024-02-03T21:12:36Z) - Pre-trained Model Guided Fine-Tuning for Zero-Shot Adversarial Robustness [52.9493817508055]
We propose Pre-trained Model Guided Adversarial Fine-Tuning (PMG-AFT) to enhance the model's zero-shot adversarial robustness.
Our approach consistently improves clean accuracy by an average of 8.72%.
arXiv Detail & Related papers (2024-01-09T04:33:03Z) - Test-Time Adaptation Induces Stronger Accuracy and Agreement-on-the-Line [65.14099135546594]
Recent test-time adaptation (TTA) methods drastically strengthen the ACL and AGL trends in models, even in shifts where models showed very weak correlations before.
Our results show that by combining TTA with AGL-based estimation methods, we can estimate the OOD performance of models with high precision for a broader set of distribution shifts.
arXiv Detail & Related papers (2023-10-07T23:21:25Z) - Conservative Prediction via Data-Driven Confidence Minimization [70.93946578046003]
In safety-critical applications of machine learning, it is often desirable for a model to be conservative.
We propose the Data-Driven Confidence Minimization framework, which minimizes confidence on an uncertainty dataset.
arXiv Detail & Related papers (2023-06-08T07:05:36Z) - Improving the Accuracy-Robustness Trade-Off of Classifiers via Adaptive Smoothing [9.637143119088426]
We show that a robust base classifier's confidence difference for correct and incorrect examples is the key to this improvement.
We adapt an adversarial input detector into a mixing network that adaptively adjusts the mixture of the two base models.
The proposed flexible method, termed "adaptive smoothing", can work in conjunction with existing or even future methods that improve clean accuracy, robustness, or adversary detection.
arXiv Detail & Related papers (2023-01-29T22:05:28Z) - Robustness and Accuracy Could Be Reconcilable by (Proper) Definition [109.62614226793833]
The trade-off between robustness and accuracy has been widely studied in the adversarial literature.
We find that it may stem from the improperly defined robust error, which imposes an inductive bias of local invariance.
By definition, SCORE facilitates the reconciliation between robustness and accuracy, while still handling the worst-case uncertainty.
arXiv Detail & Related papers (2022-02-21T10:36:09Z) - Robust fine-tuning of zero-shot models [79.38373024475646]
Existing fine-tuning approaches substantially improve accuracy in-distribution, but reduce out-of-distribution robustness.
We introduce a simple and effective method for improving robustness: ensembling the weights of the zero-shot and fine-tuned models.
Compared to standard fine-tuning, the resulting weight-space ensembles provide large accuracy improvements out-of-distribution, while matching or improving in-distribution accuracy.
arXiv Detail & Related papers (2021-09-04T17:11:28Z) - Adversarial Feature Stacking for Accurate and Robust Predictions [4.208059346198116]
Adversarial Feature Stacking (AFS) model can jointly take advantage of features with varied levels of robustness and accuracy.
We evaluate the AFS model on CIFAR-10 and CIFAR-100 datasets with strong adaptive attack methods.
arXiv Detail & Related papers (2021-03-24T12:01:24Z) - Adversarial Concurrent Training: Optimizing Robustness and Accuracy
Trade-off of Deep Neural Networks [13.041607703862724]
We propose Adversarial Concurrent Training (ACT) to train a robust model in conjunction with a natural model in a minimax game.
ACT achieves 68.20% standard accuracy and 44.29% robustness accuracy under a 100-iteration untargeted attack.
arXiv Detail & Related papers (2020-08-16T22:14:48Z) - Adversarial Robustness: From Self-Supervised Pre-Training to Fine-Tuning [134.15174177472807]
We introduce adversarial training into self-supervision, to provide general-purpose robust pre-trained models for the first time.
We conduct extensive experiments to demonstrate that the proposed framework achieves large performance margins.
arXiv Detail & Related papers (2020-03-28T18:28:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.