Seeing Delta Parameters as JPEG Images: Data-Free Delta Compression with Discrete Cosine Transform
- URL: http://arxiv.org/abs/2503.06676v1
- Date: Sun, 09 Mar 2025 16:03:48 GMT
- Title: Seeing Delta Parameters as JPEG Images: Data-Free Delta Compression with Discrete Cosine Transform
- Authors: Chenyu Huang, Peng Ye, Xiaohui Wang, Shenghe Zheng, Biqing Qi, Lei Bai, Wanli Ouyang, Tao Chen
- Abstract summary: We introduce Delta-DCT, the first data-free delta compression method inspired by classic JPEG image compression, leveraging the Discrete Cosine Transform (DCT). The proposed Delta-DCT does not require any training or data calibration, while achieving performance comparable to or even surpassing that of the original finetuned models under 1-bit equivalent delta compression ratios on different kinds of models, including: (1) recently released LLMs of different sizes from 7B to 13B, (2) relatively smaller language models including RoBERTa and T5, (3) variants of vision transformer models, and (4) multi-modal BEiT-3 models.
- Score: 51.29604910007176
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With transformer-based models and the pretrain-finetune paradigm becoming mainstream, the high storage and deployment costs of individual finetuned models on multiple tasks pose critical challenges. Delta compression attempts to lower these costs by reducing the redundancy of delta parameters (i.e., the difference between the finetuned and pre-trained model weights). However, existing methods usually require access to calibration data or additional training. To tackle these issues, we introduce Delta-DCT, the first data-free delta compression method inspired by classic JPEG image compression, leveraging the Discrete Cosine Transform (DCT). We first (a) group the delta parameters within a layer into patches. Then we (b) assess the importance of each patch and allocate a quantization bit-width to each patch accordingly. Afterwards, we (c) convert these patches to the DCT domain and quantize each patch with its allocated bit-width. The proposed Delta-DCT does not require any training or data calibration, while achieving performance comparable to or even surpassing that of the original finetuned models under 1-bit equivalent delta compression ratios on different kinds of models, including: (1) recently released LLMs of different sizes from 7B to 13B, (2) relatively smaller language models including RoBERTa and T5, (3) variants of vision transformer models, and (4) multi-modal BEiT-3 models.
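Based on steps (a)-(c) above, a minimal sketch of the pipeline for a single weight matrix might look as follows. The patch size, the Frobenius-norm importance score, and the two-level bit-allocation rule are illustrative assumptions rather than the paper's exact design.

```python
# Illustrative sketch of the three-step Delta-DCT pipeline described in the
# abstract. Patch size, the norm-based importance score, and the bit-allocation
# rule below are assumptions for illustration, not the paper's exact choices.
import numpy as np
from scipy.fft import dctn, idctn

def delta_dct_compress(finetuned_w: np.ndarray, pretrained_w: np.ndarray,
                       patch: int = 16, budget_bits: float = 1.0) -> np.ndarray:
    """Reconstruct weights after patch-wise DCT quantization of the delta."""
    delta = finetuned_w - pretrained_w
    h, w = delta.shape
    # (a) group delta parameters within the layer into non-overlapping patches
    # (leftover rows/cols that do not fill a full patch are ignored in this sketch)
    coords = [(i, j) for i in range(0, h - patch + 1, patch)
                     for j in range(0, w - patch + 1, patch)]
    patches = [delta[i:i + patch, j:j + patch] for i, j in coords]
    # (b) assess patch importance (Frobenius norm here) and allocate bit-widths:
    # the more important half gets budget+1 bits, the rest budget-1, so the
    # average stays at the "1-bit equivalent" budget
    scores = np.array([np.linalg.norm(p) for p in patches])
    order = np.argsort(-scores)
    bits = np.full(len(patches), budget_bits)
    half = len(order) // 2
    bits[order[:half]] += 1.0
    bits[order[half:]] -= 1.0
    # (c) transform each patch to the DCT domain, uniformly quantize its
    # coefficients with the allocated bit-width, then transform back
    recon = np.zeros_like(delta)
    for (i, j), p, b in zip(coords, patches, bits):
        if b < 1:                      # zero-bit patches are dropped entirely
            continue
        c = dctn(p, norm="ortho")
        lo, hi = c.min(), c.max()
        if hi == lo:                   # constant patch: nothing to quantize
            recon[i:i + patch, j:j + patch] = p
            continue
        levels = 2 ** int(b)
        step = (hi - lo) / (levels - 1)
        q = np.round((c - lo) / step)  # integer codes that would be stored
        c_hat = q * step + lo          # dequantized DCT coefficients
        recon[i:i + patch, j:j + patch] = idctn(c_hat, norm="ortho")
    return pretrained_w + recon        # approximation of the finetuned weights
```

In this toy allocation, half of the patches receive 2 bits and the other half 0 bits, which is one simple way to keep the average at a 1-bit-equivalent budget; the paper's actual importance metric and allocation scheme may differ.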
Related papers
- ImPart: Importance-Aware Delta-Sparsification for Improved Model Compression and Merging in LLMs [9.435738597849447]
ImPart is a novel importance-aware delta sparsification approach.
It adjusts sparsity ratios of different singular vectors based on their importance.
arXiv Detail & Related papers (2025-04-17T16:39:36Z)
- Multi-Scale Invertible Neural Network for Wide-Range Variable-Rate Learned Image Compression [90.59962443790593]
In this paper, we present a variable-rate image compression model based on an invertible transform to overcome existing limitations.
Specifically, we design a lightweight multi-scale invertible neural network, which maps the input image into multi-scale latent representations.
Experimental results demonstrate that the proposed method achieves state-of-the-art performance compared to existing variable-rate methods.
arXiv Detail & Related papers (2025-03-27T09:08:39Z)
- DeltaDQ: Ultra-High Delta Compression for Fine-Tuned LLMs via Group-wise Dropout and Separate Quantization [17.501956455837707]
Large language models achieve exceptional performance on various downstream tasks through supervised fine-tuning.
Current methods that compress the delta weight struggle to achieve ultra-high compression.
We propose a novel distribution-driven delta compression framework DeltaDQ to achieve ultra-high compression for the delta weight.
arXiv Detail & Related papers (2024-10-11T09:44:16Z)
- BitDelta: Your Fine-Tune May Only Be Worth One Bit [57.558376557639555]
Large Language Models (LLMs) are typically trained in two phases: pre-training on large internet-scale datasets, and fine-tuning for downstream tasks.
We introduce a simple method, BitDelta, which successfully quantizes this delta down to 1 bit without compromising performance.
By enabling the use of a single high-precision base model accompanied by multiple 1-bit deltas, BitDelta dramatically reduces GPU memory requirements by more than 10x (a sign-plus-scale sketch of this idea appears after this list).
arXiv Detail & Related papers (2024-02-15T18:50:06Z)
- OpenDelta: A Plug-and-play Library for Parameter-efficient Adaptation of Pre-trained Models [81.7855202178564]
We present OpenDelta, an open-source library that overcomes limitations by providing a plug-and-play implementation of various delta tuning methods.
Our novel techniques eliminate the need to modify the backbone PTMs' code, making OpenDelta compatible with different, even novel PTMs.
arXiv Detail & Related papers (2023-07-05T16:30:14Z)
- Backdoor Attacks Against Deep Image Compression via Adaptive Frequency Trigger [106.10954454667757]
We present a novel backdoor attack with multiple triggers against learned image compression models.
Motivated by the widely used discrete cosine transform (DCT) in existing compression systems and standards, we propose a frequency-based trigger injection model.
arXiv Detail & Related papers (2023-02-28T15:39:31Z)
- Knowledge Distillation in Vision Transformers: A Critical Review [6.508088032296086]
Vision Transformers (ViTs) have demonstrated impressive performance improvements over Convolutional Neural Networks (CNNs).
Model compression has recently attracted considerable research attention as a potential remedy.
This paper discusses various approaches based upon KD for effective compression of ViT models.
arXiv Detail & Related papers (2023-02-04T06:30:57Z)
- Delta Tuning: A Comprehensive Study of Parameter Efficient Methods for Pre-trained Language Models [90.24999406296867]
In contrast with standard fine-tuning, delta tuning only fine-tunes a small portion of the model parameters while keeping the rest untouched.
Recent studies have demonstrated that a series of delta tuning methods with distinct tuned parameter selection could achieve performance on a par with full-parameter fine-tuning.
arXiv Detail & Related papers (2022-03-14T07:56:32Z)
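For comparison, the core sign-plus-scale idea behind the BitDelta entry above can be sketched in a few lines; the mean-absolute-delta scale is an assumed initialization, and BitDelta's subsequent scale calibration is omitted here.

```python
# Illustrative sketch of BitDelta-style 1-bit delta quantization: keep one
# high-precision base model and store, per task, only the sign pattern of the
# delta plus a single per-matrix scale. The mean-absolute-delta scale is an
# assumed initialization; BitDelta's scale calibration step is not shown.
import numpy as np

def one_bit_delta(finetuned_w: np.ndarray, pretrained_w: np.ndarray):
    delta = finetuned_w - pretrained_w
    scale = np.abs(delta).mean()             # one scalar kept in high precision
    signs = np.sign(delta).astype(np.int8)   # ~1 bit of storage per parameter
    return scale, signs

def apply_one_bit_delta(pretrained_w: np.ndarray, scale: float, signs: np.ndarray):
    # The shared base model plus a tiny per-task payload reconstructs the task weights.
    return pretrained_w + scale * signs
```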