An Efficient Membership Inference Attack for the Diffusion Model by
Proximal Initialization
- URL: http://arxiv.org/abs/2305.18355v2
- Date: Mon, 9 Oct 2023 06:26:35 GMT
- Title: An Efficient Membership Inference Attack for the Diffusion Model by
Proximal Initialization
- Authors: Fei Kong, Jinhao Duan, RuiPeng Ma, Hengtao Shen, Xiaofeng Zhu,
Xiaoshuang Shi, Kaidi Xu
- Abstract summary: In this paper, we propose an efficient query-based membership inference attack (MIA), namely Proximal Initialization Attack (PIA).
Experimental results indicate that the proposed method can achieve competitive performance with only two queries on both discrete-time and continuous-time diffusion models.
To the best of our knowledge, this work is the first to study the robustness of diffusion models to MIA in the text-to-speech task.
- Score: 58.88327181933151
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, diffusion models have achieved remarkable success in generation
tasks, including image and audio generation. However, like other generative
models, diffusion models are prone to privacy issues. In this paper, we propose
an efficient query-based membership inference attack (MIA), namely Proximal
Initialization Attack (PIA), which utilizes the ground-truth trajectory obtained
by $\epsilon$ initialized at $t=0$ and the predicted point to infer membership.
Experimental results indicate that the proposed method can achieve competitive
performance with only two queries on both discrete-time and continuous-time
diffusion models. Moreover, previous works on the privacy of diffusion models
have focused on vision tasks without considering audio tasks. Therefore, we
also explore the robustness of diffusion models to MIA in the text-to-speech
(TTS) task, which is an audio generation task. To the best of our knowledge,
this work is the first to study the robustness of diffusion models to MIA in
the TTS task. Experimental results indicate that models with mel-spectrogram
(image-like) output are vulnerable to MIA, while models with audio output are
relatively robust to MIA. Code is available at
https://github.com/kong13661/PIA.
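As a rough illustration of the two-query idea described above, here is a minimal sketch of a PIA-style membership score, assuming a standard DDPM noise-prediction interface; the names `eps_model` and `alpha_bar_t` are illustrative, and the paper's exact distance measure and thresholding may differ:

```python
import torch

@torch.no_grad()
def pia_score(eps_model, x0, t, alpha_bar_t):
    batch = x0.shape[0]
    # Query 1: the model's noise prediction at t = 0 serves as a proxy for
    # the "ground-truth" epsilon on this sample's trajectory.
    eps0 = eps_model(x0, torch.zeros(batch, dtype=torch.long, device=x0.device))

    # Diffuse x0 to step t with that epsilon (standard DDPM forward form:
    # x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps).
    xt = alpha_bar_t ** 0.5 * x0 + (1.0 - alpha_bar_t) ** 0.5 * eps0

    # Query 2: the model's noise prediction at step t.
    eps_t = eps_model(xt, torch.full((batch,), t, dtype=torch.long, device=x0.device))

    # Training members tend to show a smaller gap between the two
    # predictions, so a higher score suggests membership.
    return -(eps_t - eps0).flatten(1).norm(p=2, dim=1)
```

Thresholding this score, calibrated on data known to be outside the training set, would then yield the member/non-member decision.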
Related papers
- Diffusion-based Unsupervised Audio-visual Speech Enhancement [26.937216751657697]
This paper proposes a new unsupervised audiovisual speech enhancement (AVSE) approach.
It combines a diffusion-based audio-visual speech generative model with a non-negative matrix factorization (NMF) noise model.
Experimental results confirm that the proposed AVSE approach not only outperforms its audio-only counterpart but also generalizes better than a recent supervised generative AVSE method.
arXiv Detail & Related papers (2024-10-04T12:22:54Z)
- Advancing the Robustness of Large Language Models through Self-Denoised Smoothing [50.54276872204319]
Large language models (LLMs) have achieved significant success, but their vulnerability to adversarial perturbations has raised considerable concerns.
We propose to leverage the multitasking nature of LLMs to first denoise the noisy inputs and then to make predictions based on these denoised versions.
Unlike previous denoised smoothing techniques in computer vision, which require training a separate denoising model, our method offers significantly better efficiency and flexibility.
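A hedged sketch of this denoise-then-predict procedure follows; the prompts, masking scheme, and `llm` callable are assumptions for illustration, not the paper's implementation:

```python
import random

def mask_words(text: str, rate: float = 0.3) -> str:
    # Randomly replace a fraction of words with a mask token.
    words = text.split()
    return " ".join("<mask>" if random.random() < rate else w for w in words)

def smoothed_predict(llm, text: str, n_samples: int = 5) -> str:
    votes: dict[str, int] = {}
    for _ in range(n_samples):
        noisy = mask_words(text)
        # Step 1: the same LLM denoises its own perturbed input...
        denoised = llm(f"Restore the masked words:\n{noisy}")
        # Step 2: ...then predicts on the denoised version.
        label = llm(f"Classify the sentiment (positive/negative):\n{denoised}")
        votes[label] = votes.get(label, 0) + 1
    # Aggregate predictions over the noisy samples by majority vote.
    return max(votes, key=votes.get)
```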
arXiv Detail & Related papers (2024-04-18T15:47:00Z)
- Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation [59.184980778643464]
Fine-tuning diffusion models remains an underexplored frontier in generative artificial intelligence (GenAI).
In this paper, we introduce an innovative technique called self-play fine-tuning for diffusion models (SPIN-Diffusion).
Our approach offers an alternative to conventional supervised fine-tuning and RL strategies, significantly improving both model performance and alignment.
arXiv Detail & Related papers (2024-02-15T18:59:18Z)
- Harnessing Diffusion Models for Visual Perception with Meta Prompts [68.78938846041767]
We propose a simple yet effective scheme to harness a diffusion model for visual perception tasks.
We introduce learnable embeddings (meta prompts) to the pre-trained diffusion models to extract proper features for perception.
Our approach achieves new performance records in depth estimation on NYU Depth V2 and KITTI, and in semantic segmentation on Cityscapes.
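A speculative sketch of the meta-prompt idea: learnable embeddings conditioning a frozen diffusion backbone, with a lightweight task head on top. The `extract_features` interface is a hypothetical stand-in, not the authors' API:

```python
import torch
import torch.nn as nn

class MetaPromptPerception(nn.Module):
    def __init__(self, backbone, n_prompts=64, dim=768, n_classes=19):
        super().__init__()
        self.backbone = backbone.eval()  # pre-trained diffusion model, frozen
        for p in self.backbone.parameters():
            p.requires_grad_(False)
        # Learnable "meta prompts" stand in for text conditioning.
        self.meta_prompts = nn.Parameter(torch.randn(n_prompts, dim) * 0.02)
        self.head = nn.Conv2d(dim, n_classes, kernel_size=1)  # e.g. segmentation

    def forward(self, image):
        prompts = self.meta_prompts.unsqueeze(0).expand(image.size(0), -1, -1)
        # Assumed interface: the backbone returns a (B, dim, H, W) feature map
        # given an image and a conditioning sequence.
        feats = self.backbone.extract_features(image, cond=prompts)
        return self.head(feats)
```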
arXiv Detail & Related papers (2023-12-22T14:40:55Z)
- Adversarial Training of Denoising Diffusion Model Using Dual Discriminators for High-Fidelity Multi-Speaker TTS [0.0]
Diffusion models can generate high-quality data through a probabilistic approach, but they suffer from slow generation speed because they require a large number of time steps.
We propose a speech synthesis model with two discriminators: a diffusion discriminator for learning the distribution of the reverse process and a spectrogram discriminator for learning the distribution of the generated data.
arXiv Detail & Related papers (2023-08-03T07:22:04Z)
- CamoDiffusion: Camouflaged Object Detection via Conditional Diffusion Models [72.93652777646233]
Camouflaged Object Detection (COD) is a challenging task in computer vision due to the high similarity between camouflaged objects and their surroundings.
We propose a new paradigm that treats COD as a conditional mask-generation task leveraging diffusion models.
Our method, dubbed CamoDiffusion, employs the denoising process of diffusion models to iteratively reduce the noise of the mask.
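An illustrative reverse-process loop for this mask-denoising view of COD; `denoiser` is a hypothetical image-conditioned network, and the step count and schedule are placeholders:

```python
import torch

@torch.no_grad()
def sample_mask(denoiser, image, steps=10):
    # Start from a pure-noise mask (one channel per image in the batch).
    mask = torch.randn(image.shape[0], 1, *image.shape[2:], device=image.device)
    for t in reversed(range(steps)):
        # Each reverse step predicts a less noisy mask, conditioned on the image.
        mask = denoiser(mask, image, t)
    return mask.sigmoid()  # squash to [0, 1] as the final object mask
```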
arXiv Detail & Related papers (2023-05-29T07:49:44Z)
- Are Diffusion Models Vision-And-Language Reasoners? [30.579483430697803]
We transform diffusion-based models for any image-text matching (ITM) task using a novel method called DiffusionITM.
We introduce the Generative-Discriminative Evaluation Benchmark (GDBench), comprising 7 complex vision-and-language tasks, bias evaluation, and detailed analysis.
We find that Stable Diffusion + DiffusionITM is competitive on many tasks and outperforms CLIP on compositional tasks like CLEVR and Winoground.
arXiv Detail & Related papers (2023-05-25T18:02:22Z)
- Diffusion Models as Masked Autoencoders [52.442717717898056]
We revisit generatively pre-training visual representations in light of recent interest in denoising diffusion models.
While directly pre-training with diffusion models does not produce strong representations, we condition diffusion models on masked input and formulate diffusion models as masked autoencoders (DiffMAE).
We perform a comprehensive study on the pros and cons of design choices and build connections between diffusion models and masked autoencoders.
arXiv Detail & Related papers (2023-04-06T17:59:56Z)
- Membership Inference Attacks against Diffusion Models [0.0]
Diffusion models have attracted attention in recent years as innovative generative models.
We investigate whether a diffusion model is resistant to a membership inference attack.
arXiv Detail & Related papers (2023-02-07T05:20:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.