Safeguarding Data in Multimodal AI: A Differentially Private Approach to
CLIP Training
- URL: http://arxiv.org/abs/2306.08173v2
- Date: Fri, 1 Mar 2024 04:24:04 GMT
- Title: Safeguarding Data in Multimodal AI: A Differentially Private Approach to
CLIP Training
- Authors: Alyssa Huang, Peihan Liu, Ryumei Nakada, Linjun Zhang, Wanrong Zhang
- Abstract summary: We introduce a differentially private adaptation of the Contrastive Language-Image Pretraining (CLIP) model.
Our proposed method, Dp-CLIP, is rigorously evaluated on benchmark datasets.
- Score: 15.928338716118697
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The surge in multimodal AI's success has sparked concerns over data privacy
in vision-and-language tasks. While CLIP has revolutionized multimodal learning
through joint training on images and text, its potential to unintentionally
disclose sensitive information necessitates the integration of
privacy-preserving mechanisms. We introduce a differentially private adaptation
of the Contrastive Language-Image Pretraining (CLIP) model that effectively
addresses privacy concerns while retaining accuracy. Our proposed method,
Dp-CLIP, is rigorously evaluated on benchmark datasets encompassing diverse
vision-and-language tasks such as image classification and visual question
answering. We demonstrate that our approach retains performance on par with the
standard non-private CLIP model. Furthermore, we analyze our proposed algorithm
under linear representation settings. We derive the convergence rate of our
algorithm and show a trade-off between utility and privacy when gradients are
clipped per-batch and the loss function does not satisfy smoothness conditions
assumed in the literature for the analysis of DP-SGD.
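The abstract mentions that gradients are clipped per-batch. As a rough illustration only, the following Python sketch shows what one noisy update with batch-level clipping could look like. It is not the authors' implementation: the function names, the Gaussian noise model, and the `noise_std` parameter are assumptions made for this example, and the noise scale shown here does not by itself imply any particular privacy guarantee.

```python
import numpy as np


def clip_by_norm(grad, clip_norm):
    """Rescale grad so that its L2 norm is at most clip_norm."""
    norm = np.linalg.norm(grad)
    # The small constant avoids division by zero for an all-zero gradient.
    scale = min(1.0, clip_norm / (norm + 1e-12))
    return grad * scale


def noisy_clipped_step(params, batch_grad, lr, clip_norm, noise_std, rng):
    """One gradient step with batch-level clipping and Gaussian noise.

    Hypothetical helper: batch_grad is the gradient of the loss over the
    whole mini-batch, clipped once (per batch) rather than per example.
    noise_std is an assumed placeholder; calibrating it to a target
    privacy level is outside the scope of this sketch.
    """
    clipped = clip_by_norm(batch_grad, clip_norm)
    noise = rng.normal(0.0, noise_std, size=params.shape)
    return params - lr * (clipped + noise)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = np.zeros(4)
    batch_grad = np.array([3.0, 4.0, 0.0, 0.0])  # L2 norm is 5
    updated = noisy_clipped_step(
        params, batch_grad, lr=0.1, clip_norm=1.0, noise_std=0.5, rng=rng
    )
    print(updated)
```

Clipping the aggregated batch gradient, rather than each per-example gradient, is a natural fit for a contrastive loss, which couples the examples within a batch. This is the setting the abstract's utility-privacy trade-off refers to.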
Related papers
- ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference [32.852004564832455]
We re-investigate the architecture of CLIP, and identify residual connections as the primary source of noise that degrades segmentation quality.
We propose ClearCLIP, a novel approach that decomposes CLIP's representations to enhance open-vocabulary semantic segmentation.
arXiv Detail & Related papers (2024-07-17T09:52:20Z)
- Enhancing Few-shot CLIP with Semantic-Aware Fine-Tuning [61.902254546858465]
Methods based on Contrastive Language-Image Pre-training have exhibited promising performance in few-shot adaptation tasks.
We propose fine-tuning the parameters of the attention pooling layer during the training process to encourage the model to focus on task-specific semantics.
arXiv Detail & Related papers (2023-11-08T05:18:57Z)
- Initialization Matters: Privacy-Utility Analysis of Overparameterized
Neural Networks [72.51255282371805]
We prove a privacy bound for the KL divergence between model distributions on worst-case neighboring datasets.
We find that this KL privacy bound is largely determined by the expected squared gradient norm relative to model parameters during training.
arXiv Detail & Related papers (2023-10-31T16:13:22Z)
- Understanding Transferable Representation Learning and Zero-shot Transfer in CLIP [84.90129481336659]
We study transferable representation learning underlying CLIP and demonstrate how features from different modalities get aligned.
Inspired by our analysis, we propose a new CLIP-type approach, which achieves better performance than CLIP and other state-of-the-art methods on benchmark datasets.
arXiv Detail & Related papers (2023-10-02T06:41:30Z)
- Privacy-Preserving In-Context Learning with Differentially Private
Few-Shot Generation [37.55812121348268]
In-context learning (ICL) with large language models (LLMs) on private datasets poses privacy risks.
We propose a novel algorithm that generates synthetic few-shot demonstrations from the private dataset with formal differential privacy guarantees.
arXiv Detail & Related papers (2023-09-21T03:59:00Z)
- Independent Distribution Regularization for Private Graph Embedding [55.24441467292359]
Graph embeddings are susceptible to attribute inference attacks, which allow attackers to infer private node attributes from the learned graph embeddings.
To address these concerns, privacy-preserving graph embedding methods have emerged.
We propose a novel approach called Private Variational Graph AutoEncoders (PVGAE) with the aid of independent distribution penalty as a regularization term.
arXiv Detail & Related papers (2023-08-16T13:32:43Z)
- Non-Contrastive Learning Meets Language-Image Pre-Training [145.6671909437841]
We study the validity of non-contrastive language-image pre-training (nCLIP).
We introduce xCLIP, a multi-tasking framework combining CLIP and nCLIP, and show that nCLIP aids CLIP in enhancing feature semantics.
arXiv Detail & Related papers (2022-10-17T17:57:46Z)
- Robust Cross-Modal Representation Learning with Progressive
Self-Distillation [7.676408770854477]
The learning objective of CLIP's vision-language approach does not effectively account for the noisy many-to-many correspondences found in web-harvested image captioning datasets.
We introduce a novel training framework based on cross-modal contrastive learning that uses progressive self-distillation and soft image-text alignments to more efficiently learn robust representations from noisy data.
arXiv Detail & Related papers (2022-04-10T03:28:18Z)
- Continual Learning with Differential Privacy [19.186539487598385]
We introduce a notion of continual adjacent databases to bound the sensitivity of any data record participating in the training process of continual learning.
We develop a new DP-preserving algorithm for CL with a data sampling strategy to quantify the privacy risk of training data.
Our algorithm provides formal guarantees of privacy for data records across tasks in CL.
arXiv Detail & Related papers (2021-10-11T12:39:55Z)
- Differentially private federated deep learning for multi-site medical
image segmentation [56.30543374146002]
Collaborative machine learning techniques such as federated learning (FL) enable the training of models on effectively larger datasets without data transfer.
Recent initiatives have demonstrated that segmentation models trained with FL can achieve performance similar to locally trained models.
However, FL is not a fully privacy-preserving technique and privacy-centred attacks can disclose confidential patient data.
arXiv Detail & Related papers (2021-07-06T12:57:32Z)
- SPEED: Secure, PrivatE, and Efficient Deep learning [2.283665431721732]
We introduce a deep learning framework able to deal with strong privacy constraints.
Based on collaborative learning, differential privacy and homomorphic encryption, the proposed approach advances the state of the art.
arXiv Detail & Related papers (2020-06-16T19:31:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.