A Fast and Efficient Conditional Learning for Tunable Trade-Off between
Accuracy and Robustness
- URL: http://arxiv.org/abs/2204.00426v1
- Date: Mon, 28 Mar 2022 19:25:36 GMT
- Title: A Fast and Efficient Conditional Learning for Tunable Trade-Off between
Accuracy and Robustness
- Authors: Souvik Kundu, Sairam Sundaresan, Massoud Pedram, Peter A. Beerel
- Abstract summary: Existing models that achieve state-of-the-art (SOTA) performance on both clean and adversarially-perturbed images rely on convolution operations conditioned with feature-wise linear modulation (FiLM) layers.
We present a fast learnable once-for-all adversarial training (FLOAT) algorithm which, instead of the existing FiLM-based conditioning, presents a unique weight-conditioned learning that requires no additional layer.
In particular, we add scaled noise to the weight tensors, which enables a trade-off between clean and adversarial performance.
- Score: 11.35810118757863
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing models that achieve state-of-the-art (SOTA) performance on both
clean and adversarially-perturbed images rely on convolution operations
conditioned with feature-wise linear modulation (FiLM) layers. These layers
require many new parameters and are hyperparameter-sensitive. They
significantly increase training time, memory cost, and potential latency, which
can prove costly for resource-limited or real-time applications. In this paper,
we present a fast learnable once-for-all adversarial training (FLOAT)
algorithm which, instead of the existing FiLM-based conditioning, presents a
unique weight-conditioned learning that requires no additional layer, thereby
incurring no significant increase in parameter count, training time, or network
latency compared to standard adversarial training. In particular, we add
configurable scaled noise to the weight tensors, which enables a trade-off
between clean and adversarial performance. Extensive experiments show that
FLOAT can yield SOTA performance, improving both clean and perturbed-image
classification by up to ~6% and ~10%, respectively. Moreover, real hardware
measurements show that FLOAT can reduce training time by up to 1.43x and the
model parameter count by up to 1.47x under iso-hyperparameter settings compared
to the FiLM-based alternatives. Additionally, to further improve memory
efficiency, we introduce FLOAT sparse (FLOATS), a form of non-iterative model
pruning, and provide a detailed empirical analysis of the three-way
accuracy-robustness-complexity trade-off for this new class of pruned,
conditionally trained models.
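
To make the conditioning mechanism concrete, the following PyTorch snippet is a minimal sketch of the idea described above, not the paper's implementation: it assumes the scaled noise is a fixed, per-layer random tensor added to the convolution weights, with a hypothetical inference-time scalar alpha (0 for clean mode, 1 for robust mode) and an illustrative learnable per-layer scale sigma.

# Minimal sketch of FLOAT-style weight-noise conditioning (illustrative only).
# Assumptions not taken from the paper: a fixed per-layer noise tensor, a
# learnable scale `sigma`, and a user-chosen scalar `alpha` in [0, 1].
import torch
import torch.nn as nn
import torch.nn.functional as F

class NoiseConditionedConv2d(nn.Conv2d):
    """Convolution whose effective weights are W + alpha * sigma * N."""

    def __init__(self, in_ch, out_ch, kernel_size, **kwargs):
        super().__init__(in_ch, out_ch, kernel_size, **kwargs)
        self.register_buffer("noise", torch.randn_like(self.weight))  # fixed N
        self.sigma = nn.Parameter(torch.tensor(0.1))                  # noise scale

    def forward(self, x, alpha: float = 0.0):
        # alpha = 0.0 -> plain weights (favor clean accuracy)
        # alpha = 1.0 -> fully noise-conditioned weights (favor robustness)
        w = self.weight + alpha * self.sigma * self.noise
        return F.conv2d(x, w, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)

layer = NoiseConditionedConv2d(3, 16, 3, padding=1)
x = torch.randn(1, 3, 32, 32)
clean_out = layer(x, alpha=0.0)   # clean operating point
robust_out = layer(x, alpha=1.0)  # robust operating point

Because the same weight tensor serves every operating point, no extra FiLM layers or per-condition parameters are introduced; only alpha changes at test time.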
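
FLOATS adds non-iterative pruning on top of the conditionally trained model. The sketch below shows one plausible reading of "non-iterative", a single-pass global magnitude pruning step; the paper's actual mask-selection rule and training schedule are not given in this abstract and may differ.

# One-shot (non-iterative) magnitude pruning in the spirit of FLOATS.
# Assumption: a simple global magnitude criterion, applied once with no
# prune-retrain iterations; `sparsity` is the fraction of weights to remove.
import torch
import torch.nn as nn

def one_shot_magnitude_prune(model: nn.Module, sparsity: float) -> None:
    weights = [m.weight for m in model.modules()
               if isinstance(m, (nn.Conv2d, nn.Linear))]
    scores = torch.cat([w.detach().abs().flatten() for w in weights])
    threshold = torch.quantile(scores, sparsity)
    for w in weights:
        mask = (w.detach().abs() > threshold).to(w.dtype)
        w.data.mul_(mask)  # zero the smallest-magnitude weights in one pass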
Related papers
- SAFE: Slow and Fast Parameter-Efficient Tuning for Continual Learning with Pre-Trained Models [26.484208658326857]
Continual learning aims to incrementally acquire new concepts in data streams while resisting forgetting previous knowledge.
With the rise of powerful pre-trained models (PTMs), there is a growing interest in training incremental learning systems.
arXiv Detail & Related papers (2024-11-04T15:34:30Z)
- Training Over a Distribution of Hyperparameters for Enhanced Performance and Adaptability on Imbalanced Classification [3.06506506650274]
Loss Conditional Training (LCT) can be used to train reliable classifiers under severe class imbalance.
We show that LCT approximates the performance of several models and improves the overall performance of models on both CIFAR and real medical imaging applications.
arXiv Detail & Related papers (2024-10-04T16:47:11Z)
- SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation [52.6922833948127]
In this work, we investigate the importance of parameters in pre-trained diffusion models and identify parameters that are effectively unused.
We propose a novel model fine-tuning method to make full use of these ineffective parameters.
Our method enhances the generative capabilities of pre-trained models in downstream applications.
arXiv Detail & Related papers (2024-09-10T16:44:47Z)
- HFT: Half Fine-Tuning for Large Language Models [42.60438623804577]
For large language models (LLMs), one or more fine-tuning phases have become a necessary step to unlock various capabilities.
In this paper, we find that by regularly resetting partial parameters, LLMs can restore some of the original knowledge.
We introduce Half Fine-Tuning (HFT) for LLMs, as a substitute for full fine-tuning (FFT), to mitigate the forgetting issues.
arXiv Detail & Related papers (2024-04-29T07:07:58Z)
- BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping [64.54271680071373]
Diffusion models have demonstrated excellent potential for generating diverse images.
Knowledge distillation has been recently proposed as a remedy that can reduce the number of inference steps to one or a few.
We present a novel technique called BOOT, that overcomes limitations with an efficient data-free distillation algorithm.
arXiv Detail & Related papers (2023-06-08T20:30:55Z)
- Surrogate Lagrangian Relaxation: A Path To Retrain-free Deep Neural Network Pruning [9.33753001494221]
Network pruning is a widely used technique to reduce computation cost and model size for deep neural networks.
In this paper, we develop a systematic weight-pruning optimization approach based on Surrogate Lagrangian relaxation.
arXiv Detail & Related papers (2023-04-08T22:48:30Z)
- Scaling & Shifting Your Features: A New Baseline for Efficient Model Tuning [126.84770886628833]
Existing finetuning methods either tune all parameters of the pretrained model (full finetuning) or only tune the last linear layer (linear probing).
We propose a new parameter-efficient finetuning method termed SSF, representing that researchers only need to Scale and Shift the deep Features extracted by a pre-trained model to catch up with the performance of full finetuning (see the sketch after this list).
arXiv Detail & Related papers (2022-10-17T08:14:49Z)
- Online Convolutional Re-parameterization [51.97831675242173]
We present online convolutional re-parameterization (OREPA), a two-stage pipeline, aiming to reduce the huge training overhead by squeezing the complex training-time block into a single convolution.
Compared with the state-of-the-art re-param models, OREPA is able to save the training-time memory cost by about 70% and accelerate the training speed by around 2x.
We also conduct experiments on object detection and semantic segmentation and show consistent improvements on the downstream tasks.
arXiv Detail & Related papers (2022-04-02T09:50:19Z)
- Once-for-All Adversarial Training: In-Situ Tradeoff between Robustness and Accuracy for Free [115.81899803240758]
Adversarial training and its many variants substantially improve deep network robustness, yet at the cost of compromising standard accuracy.
This paper asks how to quickly calibrate a trained model in-situ, to examine the achievable trade-offs between its standard and robust accuracies.
Our proposed framework, Once-for-all Adversarial Training (OAT), is built on an innovative model-conditional training framework.
arXiv Detail & Related papers (2020-10-22T16:06:34Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
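
As referenced in the SSF entry above, a scale-and-shift module can be written in a few lines. This is a minimal, hypothetical sketch assuming per-channel parameters applied to channel-first feature maps; the exact placement and initialization used by SSF follow that paper, not this snippet.

# Minimal sketch of an SSF-style scale-and-shift module (illustrative only).
import torch
import torch.nn as nn

class ScaleShift(nn.Module):
    def __init__(self, num_features: int):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(num_features))   # per-channel scale
        self.beta = nn.Parameter(torch.zeros(num_features))   # per-channel shift

    def forward(self, x):
        # Broadcast over every dimension after the channel dimension.
        shape = (1, -1) + (1,) * (x.dim() - 2)
        return x * self.gamma.view(shape) + self.beta.view(shape)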
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.