CompLeak: Deep Learning Model Compression Exacerbates Privacy Leakage
- URL: http://arxiv.org/abs/2507.16872v1
- Date: Tue, 22 Jul 2025 08:02:46 GMT
- Title: CompLeak: Deep Learning Model Compression Exacerbates Privacy Leakage
- Authors: Na Li, Yansong Gao, Hongsheng Hu, Boyu Kuang, Anmin Fu
- Abstract summary: We propose CompLeak, the first privacy risk evaluation framework examining three widely used compression configurations. CompLeak has three variants, depending on the number of compressed models available and whether the original model is accessible.
- Score: 14.170572219170158
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model compression is crucial for minimizing memory storage and accelerating inference in deep learning (DL) models, including recent foundation models such as large language models (LLMs). Users can access different compressed model versions according to their resources and budget. However, while existing compression operations primarily focus on optimizing the trade-off between resource efficiency and model performance, the privacy risks introduced by compression remain overlooked and insufficiently understood. In this work, through the lens of membership inference attack (MIA), we propose CompLeak, the first privacy risk evaluation framework examining three widely used compression configurations: pruning, quantization, and weight clustering, as supported by the commercial model compression frameworks of Google's TensorFlow-Lite (TF-Lite) and Facebook's PyTorch Mobile. CompLeak has three variants, depending on the number of compressed models available and whether the original model is accessible. CompLeakNR starts by adopting existing MIA methods to attack a single compressed model, and identifies that different compressed models influence members and non-members differently. When the original model and one compressed model are available, CompLeakSR leverages the compressed model as a reference to the original model and uncovers more privacy leakage by combining meta information (e.g., confidence vectors) from both models. When multiple compressed models are available, with or without access to the original model, CompLeakMR exploits the privacy leakage information from the multiple compressed versions to substantially amplify the overall privacy leakage. We conduct extensive experiments on seven diverse model architectures (from ResNet to the foundation models BERT and GPT-2) and six image and text benchmark datasets.
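For intuition, the following is a minimal, hypothetical sketch of a CompLeakSR-style reference signal: an attacker combines per-sample confidence vectors from the original model and one compressed version, then trains a simple attack classifier on shadow data with known membership. The synthetic logit generator, the feature construction, and the logistic-regression attack are illustrative assumptions, not the paper's exact method.

```python
# Hypothetical sketch of a CompLeakSR-style membership signal: combine
# confidence vectors from an original model and one compressed version.
# Synthetic logits stand in for real models; the feature set and attack
# classifier are assumptions, not the paper's exact construction.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def paired_logits(n, classes=10, member=False):
    """Logits for the same inputs from the original and compressed models."""
    base = rng.normal(size=(n, classes))
    if member:  # members tend to receive more confident predictions
        base[np.arange(n), rng.integers(classes, size=n)] += 3.0
    # Compression perturbs the logits; members and non-members shift differently.
    return base, base + rng.normal(scale=0.3, size=base.shape)

def features(logits_orig, logits_comp):
    p_o, p_c = softmax(logits_orig), softmax(logits_comp)
    # Meta information from both models plus their discrepancy.
    return np.hstack([p_o, p_c, p_o - p_c])

# Shadow data with known membership labels trains the attack classifier.
X = np.vstack([features(*paired_logits(500, member=True)),
               features(*paired_logits(500, member=False))])
y = np.concatenate([np.ones(500), np.zeros(500)])
attack = LogisticRegression(max_iter=1000).fit(X, y)
print("shadow attack accuracy:", attack.score(X, y))
```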
Related papers
- Dynamic Base Model Shift for Delta Compression [53.505380509713575]
Delta compression attempts to lower the costs by reducing the redundancy of delta parameters. Existing methods by default employ the pretrained model as the base model and compress the delta parameters for every task. We propose Dynamic Base Model Shift (DBMS), which dynamically adapts the base model to the target task before performing delta compression.
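For context, here is a minimal sketch of the delta-compression idea this work builds on (DBMS's dynamic base-model shifting itself is not reproduced): store only a sparsified difference between task-specific weights and a base model, then reconstruct. All names and the 10% keep ratio are assumptions.

```python
# Hypothetical delta-compression sketch: store indices and values of the
# largest-magnitude deltas only, then reconstruct from the base weights.
import numpy as np

rng = np.random.default_rng(0)
base = rng.normal(size=10_000)                        # pretrained base weights
task = base + rng.normal(scale=0.01, size=base.size)  # fine-tuned task weights

delta = task - base
k = int(0.1 * delta.size)                             # keep the top 10% of deltas
keep = np.argpartition(np.abs(delta), -k)[-k:]
stored = (keep, delta[keep])                          # indices + values is all we store

recon = base.copy()
recon[stored[0]] += stored[1]
print("max reconstruction error:", np.abs(recon - task).max())
```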
arXiv Detail & Related papers (2025-05-16T15:11:19Z)
- A Collaborative Ensemble Framework for CTR Prediction [73.59868761656317]
We propose a novel framework, Collaborative Ensemble Training Network (CETNet), to leverage multiple distinct models.
Unlike naive model scaling, our approach emphasizes diversity and collaboration through collaborative learning.
We validate our framework on three public datasets and a large-scale industrial dataset from Meta.
arXiv Detail & Related papers (2024-11-20T20:38:56Z)
- Application Specific Compression of Deep Learning Models [0.8875650122536799]
Large Deep Learning models are compressed and deployed for specific applications.
Our goal is to customize the model compression process to create a compressed model that will perform better for the target application.
We have experimented with the BERT family of models for three applications: Extractive QA, Natural Language Inference, and Paraphrase Identification.
arXiv Detail & Related papers (2024-09-09T06:55:38Z)
- MoDeGPT: Modular Decomposition for Large Language Model Compression [59.361006801465344]
This paper introduces Modular Decomposition (MoDeGPT), a novel structured compression framework. MoDeGPT partitions the Transformer block into modules comprised of matrix pairs and reduces the hidden dimensions. Our experiments show that MoDeGPT, without backward propagation, matches or surpasses previous structured compression methods.
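As one simplified instance of compressing a matrix pair, and only a sketch under strong assumptions rather than MoDeGPT's actual procedure: attention scores depend on the product of the query and key projections, so a truncated SVD of that product yields thinner projections that reproduce the scores up to truncation error.

```python
# Sketch: attention scores depend only on the product Wq @ Wk.T, so a rank-r
# SVD of that product yields thinner query/key projections. Illustrative only;
# MoDeGPT handles more module types with other decompositions.
import numpy as np

rng = np.random.default_rng(0)
d, r = 64, 16
Wq, Wk = rng.normal(size=(d, d)), rng.normal(size=(d, d))

U, s, Vt = np.linalg.svd(Wq @ Wk.T)
Wq_r = U[:, :r] * s[:r]        # d x r reduced query projection
Wk_r = Vt[:r].T                # d x r reduced key projection

X = rng.normal(size=(8, d))    # a batch of token embeddings
S_full = X @ Wq @ Wk.T @ X.T
S_low = X @ Wq_r @ Wk_r.T @ X.T
print("relative score error:", np.linalg.norm(S_full - S_low) / np.linalg.norm(S_full))
```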
arXiv Detail & Related papers (2024-08-19T01:30:14Z)
- xGen-MM (BLIP-3): A Family of Open Large Multimodal Models [152.08958880412936]
BLIP-3 is an open framework for developing Large Multimodal Models. We release 4B and 14B models, including both the pre-trained base model and the instruction fine-tuned ones. Our models demonstrate competitive performance among open-source LMMs with similar model sizes.
arXiv Detail & Related papers (2024-08-16T17:57:01Z)
- Unified Low-rank Compression Framework for Click-through Rate Prediction [15.813889566241539]
We propose a unified low-rank decomposition framework for compressing CTR prediction models.
The compressed models produced by our framework can achieve better performance than the original models.
Our framework can be applied to embedding tables and layers in various CTR prediction models.
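A minimal sketch of the low-rank idea, assuming truncated SVD as the decomposition (the framework's exact method may differ): replace an n x d embedding table with the product of two thin factors.

```python
# Hypothetical sketch of low-rank compression for an embedding table: a
# truncated SVD replaces an n x d table with an n x r and an r x d factor.
import numpy as np

rng = np.random.default_rng(0)
E = rng.normal(size=(100_000, 64))    # embedding table: 6.4M parameters

r = 16                                # target rank
U, s, Vt = np.linalg.svd(E, full_matrices=False)
A = U[:, :r] * s[:r]                  # n x r factor
B = Vt[:r]                            # r x d factor

compressed_params = A.size + B.size   # ~1.6M parameters in total
print("compression ratio:", E.size / compressed_params)
print("relative approx error:", np.linalg.norm(E - A @ B) / np.linalg.norm(E))
```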
arXiv Detail & Related papers (2024-05-28T13:06:32Z)
- Lossless and Near-Lossless Compression for Foundation Models [11.307357041746865]
We investigate the source of model compressibility, introduce compression variants tailored for models, and categorize models into compressibility groups.
We estimate that these methods could save over an ExaByte per month of network traffic downloaded from a large model hub like HuggingFace.
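One hypothetical illustration of why trained weights compress losslessly: float32 exponent bytes are highly skewed, so grouping bytes by position before a generic entropy coder improves the ratio. The snippet below uses zlib for simplicity; the paper's tailored variants are more sophisticated.

```python
# Hypothetical byte-grouping demonstration: compress raw float32 weights with
# zlib directly, then again after grouping bytes by position within each float.
import zlib
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(scale=0.02, size=250_000).astype(np.float32).tobytes()

plain = len(zlib.compress(w, 9))
b = np.frombuffer(w, dtype=np.uint8).reshape(-1, 4)
grouped = b.T.copy().tobytes()        # all byte-0s, then byte-1s, ...
regrouped = len(zlib.compress(grouped, 9))
print(f"raw: {len(w)}  plain zlib: {plain}  byte-grouped zlib: {regrouped}")
```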
arXiv Detail & Related papers (2024-04-05T16:52:55Z)
- A Survey on Transformer Compression [84.18094368700379]
Transformer plays a vital role in the realms of natural language processing (NLP) and computer vision (CV).
Model compression methods reduce the memory and computational cost of Transformer.
This survey provides a comprehensive review of recent compression methods, with a specific focus on their application to Transformer-based models.
arXiv Detail & Related papers (2024-02-05T12:16:28Z)
- CrAM: A Compression-Aware Minimizer [103.29159003723815]
We propose a new compression-aware minimizer dubbed CrAM that modifies the optimization step in a principled way.
CrAM produces dense models that can be more accurate than standard SGD/Adam-based baselines, yet remain stable under weight pruning.
CrAM can produce sparse models which perform well for transfer learning, and it also works for semi-structured 2:4 pruning patterns supported by GPU hardware.
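A simplified, assumption-laden sketch of a compression-aware update in this spirit: evaluate the loss gradient at a pruned projection of the weights but apply it to the dense weights, so the dense model stays accurate under one-shot pruning. This is not CrAM's exact update rule.

```python
# Sketch of a compression-aware training step: the gradient is computed at the
# magnitude-pruned weights and applied to the dense weights (linear regression
# toy problem). Not CrAM's actual optimizer.
import numpy as np

rng = np.random.default_rng(0)
X, y = rng.normal(size=(256, 20)), rng.normal(size=256)
w = rng.normal(size=20)

def prune(w, sparsity=0.5):
    out = w.copy()
    out[np.argsort(np.abs(w))[: int(sparsity * w.size)]] = 0.0
    return out

lr = 0.01
for _ in range(200):
    wc = prune(w)                           # compression applied inside the step
    grad = 2 * X.T @ (X @ wc - y) / len(y)  # gradient at the compressed point
    w -= lr * grad                          # update the dense weights
print("dense loss :", np.mean((X @ w - y) ** 2))
print("pruned loss:", np.mean((X @ prune(w) - y) ** 2))
```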
arXiv Detail & Related papers (2022-07-28T16:13:28Z)
- What do Compressed Large Language Models Forget? Robustness Challenges in Model Compression [68.82486784654817]
We study two popular model compression techniques including knowledge distillation and pruning.
We show that compressed models are significantly less robust than their pre-trained language model (PLM) counterparts on adversarial test sets.
We develop a regularization strategy for model compression based on sample uncertainty.
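As a hypothetical sketch of such a strategy (the paper's concrete formulation may differ): weight each sample's distillation term by the teacher's predictive entropy, so more uncertain samples contribute more to the compression loss.

```python
# Illustrative uncertainty-weighted distillation loss; the entropy weighting
# and soft cross-entropy are assumptions, not the cited paper's exact recipe.
import numpy as np

def entropy(p, eps=1e-12):
    return -(p * np.log(p + eps)).sum(axis=1)

def uncertainty_weighted_distill_loss(student_p, teacher_p, eps=1e-12):
    w = entropy(teacher_p)                                     # per-sample uncertainty
    ce = -(teacher_p * np.log(student_p + eps)).sum(axis=1)    # soft cross-entropy
    return (w * ce).mean()

rng = np.random.default_rng(0)
teacher = rng.dirichlet(np.ones(5), size=8)   # full-model probabilities
student = rng.dirichlet(np.ones(5), size=8)   # compressed-model probabilities
print(uncertainty_weighted_distill_loss(student, teacher))
```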
arXiv Detail & Related papers (2021-10-16T00:20:04Z)
- Lossless Compression with Latent Variable Models [4.289574109162585]
We use latent variable models with a coding method we call 'bits back with asymmetric numeral systems' (BB-ANS).
The method involves interleaving encode and decode steps, and achieves an optimal rate when compressing batches of data.
We describe 'Craystack', a modular software framework which we have developed for rapid prototyping of compression using deep generative models.
arXiv Detail & Related papers (2021-04-21T14:03:05Z)
- A flexible, extensible software framework for model compression based on the LC algorithm [10.787390511207683]
We propose a software framework that allows a user to compress a neural network or other machine learning model with minimal effort.
The library is written in Python and PyTorch and available on GitHub.
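For orientation, here is a minimal sketch of the LC (learning-compression) alternation on a linear model, with magnitude pruning standing in for the compression step; the library generalizes this to neural networks and many compression types, and nothing below reflects its actual API.

```python
# LC alternation sketch: the L step fits the data with a quadratic penalty
# pulling weights toward their current compressed version; the C step
# re-compresses (here, magnitude pruning). Toy linear-regression instance.
import numpy as np

rng = np.random.default_rng(0)
X, y = rng.normal(size=(200, 30)), rng.normal(size=200)
theta = np.zeros(30)   # compressed weights
mu = 1.0               # penalty strength

def c_step(w, sparsity=0.8):   # keep only the largest-magnitude weights
    out = w.copy()
    out[np.argsort(np.abs(w))[: int(sparsity * w.size)]] = 0.0
    return out

for _ in range(20):
    # L step: closed-form solve of min ||Xw - y||^2 + mu * ||w - theta||^2
    w = np.linalg.solve(X.T @ X + mu * np.eye(30), X.T @ y + mu * theta)
    theta = c_step(w)          # C step
print("compressed loss:", np.mean((X @ theta - y) ** 2),
      " nonzeros:", np.count_nonzero(theta))
```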
arXiv Detail & Related papers (2020-05-15T21:14:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.