Related papers: Uncovering the Hidden Cost of Model Compression

Uncovering the Hidden Cost of Model Compression

URL: http://arxiv.org/abs/2308.14969v3
Date: Fri, 15 Mar 2024 21:04:31 GMT
Title: Uncovering the Hidden Cost of Model Compression
Authors: Diganta Misra, Muawiz Chaudhary, Agam Goyal, Bharat Runwal, Pin Yu Chen,
Abstract summary: Visual Prompting has emerged as a pivotal method for transfer learning in computer vision. Model compression detrimentally impacts the performance of visual prompting-based transfer. However, negative effects on calibration are not present when models are compressed via quantization.
Score: 43.62624133952414
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In an age dominated by resource-intensive foundation models, the ability to efficiently adapt to downstream tasks is crucial. Visual Prompting (VP), drawing inspiration from the prompting techniques employed in Large Language Models (LLMs), has emerged as a pivotal method for transfer learning in the realm of computer vision. As the importance of efficiency continues to rise, research into model compression has become indispensable in alleviating the computational burdens associated with training and deploying over-parameterized neural networks. A primary objective in model compression is to develop sparse and/or quantized models capable of matching or even surpassing the performance of their over-parameterized, full-precision counterparts. Although previous studies have explored the effects of model compression on transfer learning, its impact on visual prompting-based transfer remains unclear. This study aims to bridge this gap, shedding light on the fact that model compression detrimentally impacts the performance of visual prompting-based transfer, particularly evident in scenarios with low data volume. Furthermore, our findings underscore the adverse influence of sparsity on the calibration of downstream visual-prompted models. However, intriguingly, we also illustrate that such negative effects on calibration are not present when models are compressed via quantization. This empirical investigation underscores the need for a nuanced understanding beyond mere accuracy in sparse and quantized settings, thereby paving the way for further exploration in Visual Prompting techniques tailored for sparse and quantized models.

Related papers

On Information Geometry and Iterative Optimization in Model Compression: Operator Factorization [5.952537659103525]
We argue that many successful model compression approaches can be understood as implicitly approximating information divergences for this projection.<n>We prove convergence of iterative singular value thresholding for training neural networks subject to a soft rank constraint.
arXiv Detail & Related papers (2025-07-12T23:39:14Z)
Efficient Point Cloud Classification via Offline Distillation Framework and Negative-Weight Self-Distillation Technique [46.266960248570086]
We introduce an innovative offline recording strategy that avoids the simultaneous loading of both teacher and student models. This approach feeds a multitude of augmented samples into the teacher model, recording both the data augmentation parameters and the corresponding logit outputs. Experimental results demonstrate that the proposed distillation strategy enables the student model to achieve performance comparable to state-of-the-art models.
arXiv Detail & Related papers (2024-09-03T16:12:12Z)
Generalized Nested Latent Variable Models for Lossy Coding applied to Wind Turbine Scenarios [14.48369551534582]
A learning-based approach seeks to minimize the compromise between compression rate and reconstructed image quality. A successful technique consists in introducing a deep hyperprior that operates within a 2-level nested latent variable model. This paper extends this concept by designing a generalized L-level nested generative model with a Markov chain structure.
arXiv Detail & Related papers (2024-06-10T11:00:26Z)
Low-rank finetuning for LLMs: A fairness perspective [54.13240282850982]
Low-rank approximation techniques have become the de facto standard for fine-tuning Large Language Models. This paper investigates the effectiveness of these methods in capturing the shift of fine-tuning datasets from the initial pre-trained data distribution. We show that low-rank fine-tuning inadvertently preserves undesirable biases and toxic behaviors.
arXiv Detail & Related papers (2024-05-28T20:43:53Z)
DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception [78.26734070960886]
Current perceptive models heavily depend on resource-intensive datasets. We introduce perception-aware loss (P.A. loss) through segmentation, improving both quality and controllability. Our method customizes data augmentation by extracting and utilizing perception-aware attribute (P.A. Attr) during generation.
arXiv Detail & Related papers (2024-03-20T04:58:03Z)
Model Compression Techniques in Biometrics Applications: A Survey [5.452293986561535]
Deep learning algorithms have extensively empowered humanity's task automatization capacity. The huge improvement in the performance of these models is highly correlated with their increasing level of complexity. This led to the development of compression techniques that drastically reduce the computational and memory costs of deep learning models without significant performance degradation.
arXiv Detail & Related papers (2024-01-18T17:06:21Z)
A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime. We quantify how the test error for overparameterized models achieving effectively zero training error depends on the quality of the implicit regularization imposed by e.g. the combination of model, parameter-initialization scheme.
arXiv Detail & Related papers (2023-11-13T01:48:08Z)
What do Compressed Large Language Models Forget? Robustness Challenges in Model Compression [68.82486784654817]
We study two popular model compression techniques including knowledge distillation and pruning. We show that compressed models are significantly less robust than their PLM counterparts on adversarial test sets. We develop a regularization strategy for model compression based on sample uncertainty.
arXiv Detail & Related papers (2021-10-16T00:20:04Z)
Powerpropagation: A sparsity inducing weight reparameterisation [65.85142037667065]
We introduce Powerpropagation, a new weight- parameterisation for neural networks that leads to inherently sparse models. Models trained in this manner exhibit similar performance, but have a distribution with markedly higher density at zero, allowing more parameters to be pruned safely. Here, we combine Powerpropagation with a traditional weight-pruning technique as well as recent state-of-the-art sparse-to-sparse algorithms, showing superior performance on the ImageNet benchmark.
arXiv Detail & Related papers (2021-10-01T10:03:57Z)
Compressed Object Detection [15.893905488328283]
We extend pruning, a compression technique that discards unnecessary model connections, and weight sharing techniques for the task of object detection. We are able to compress a state-of-the-art object detection model by 30.0% without a loss in performance.
arXiv Detail & Related papers (2021-02-04T21:32:56Z)
The Dilemma Between Data Transformations and Adversarial Robustness for Time Series Application Systems [1.2056495277232115]
Adrial examples, or nearly indistinguishable inputs created by an attacker, significantly reduce machine learning accuracy. This work explores how data transformations may impact an adversary's ability to create effective adversarial samples on a recurrent neural network. A data transformation technique reduces the vulnerability to adversarial examples only if it approximates the dataset's intrinsic dimension.
arXiv Detail & Related papers (2020-06-18T22:43:37Z)

This list is automatically generated from the titles and abstracts of the papers in this site.