Improving Adversarial Robustness by Contrastive Guided Diffusion Process
- URL: http://arxiv.org/abs/2210.09643v2
- Date: Tue, 4 Jul 2023 12:08:26 GMT
- Title: Improving Adversarial Robustness by Contrastive Guided Diffusion Process
- Authors: Yidong Ouyang, Liyan Xie, Guang Cheng
- Abstract summary: We propose Contrastive-Guided Diffusion Process (Contrastive-DP) to guide the diffusion model in data generation.
We show that enhancing the distinguishability among the generated data is critical for improving adversarial robustness.
- Score: 19.972628281993487
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Synthetic data generation has become an emerging tool to help improve the
adversarial robustness in classification tasks since robust learning requires a
significantly larger amount of training samples compared with standard
classification tasks. Among various deep generative models, the diffusion model
has been shown to produce high-quality synthetic images and has achieved good
performance in improving the adversarial robustness. However, diffusion-type
methods are typically slow in data generation as compared with other generative
models. Although different acceleration techniques have been proposed recently,
it is also of great importance to study how to improve the sample efficiency of
generated data for the downstream task. In this paper, we first analyze the
optimality condition of synthetic distribution for achieving non-trivial robust
accuracy. We show that enhancing the distinguishability among the generated
data is critical for improving adversarial robustness. Thus, we propose the
Contrastive-Guided Diffusion Process (Contrastive-DP), which adopts the
contrastive loss to guide the diffusion model in data generation. We verify our
theoretical results using simulations and demonstrate the good performance of
Contrastive-DP on image datasets.
Related papers
- Synthetic Image Learning: Preserving Performance and Preventing Membership Inference Attacks [5.0243930429558885]
This paper introduces Knowledge Recycling (KR), a pipeline designed to optimise the generation and use of synthetic data for training downstream classifiers.
At the heart of this pipeline is Generative Knowledge Distillation (GKD), the proposed technique that significantly improves the quality and usefulness of the information.
The results show a significant reduction in the performance gap between models trained on real and synthetic data, with models based on synthetic data outperforming those trained on real data in some cases.
arXiv Detail & Related papers (2024-07-22T10:31:07Z) - TSynD: Targeted Synthetic Data Generation for Enhanced Medical Image Classification [0.011037620731410175]
This work aims to guide the generative model to synthesize data with high uncertainty.
We alter the feature space of the autoencoder through an optimization process.
We improve the robustness against test time data augmentations and adversarial attacks on several classifications tasks.
arXiv Detail & Related papers (2024-06-25T11:38:46Z) - Unveiling the Flaws: Exploring Imperfections in Synthetic Data and Mitigation Strategies for Large Language Models [89.88010750772413]
Synthetic data has been proposed as a solution to address the issue of high-quality data scarcity in the training of large language models (LLMs)
Our work delves into these specific flaws associated with question-answer (Q-A) pairs, a prevalent type of synthetic data, and presents a method based on unlearning techniques to mitigate these flaws.
Our work has yielded key insights into the effective use of synthetic data, aiming to promote more robust and efficient LLM training.
arXiv Detail & Related papers (2024-06-18T08:38:59Z) - DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception [78.26734070960886]
Current perceptive models heavily depend on resource-intensive datasets.
We introduce perception-aware loss (P.A. loss) through segmentation, improving both quality and controllability.
Our method customizes data augmentation by extracting and utilizing perception-aware attribute (P.A. Attr) during generation.
arXiv Detail & Related papers (2024-03-20T04:58:03Z) - Derm-T2IM: Harnessing Synthetic Skin Lesion Data via Stable Diffusion
Models for Enhanced Skin Disease Classification using ViT and CNN [1.0499611180329804]
We aim to incorporate enhanced data transformation techniques by extending the recent success of few-shot learning.
We investigate the impact of incorporating newly generated synthetic data into the training pipeline of state-of-art machine learning models.
arXiv Detail & Related papers (2024-01-10T13:46:03Z) - The Missing U for Efficient Diffusion Models [3.712196074875643]
Diffusion Probabilistic Models yield record-breaking performance in tasks such as image synthesis, video generation, and molecule design.
Despite their capabilities, their efficiency, especially in the reverse process, remains a challenge due to slow convergence rates and high computational costs.
We introduce an approach that leverages continuous dynamical systems to design a novel denoising network for diffusion models.
arXiv Detail & Related papers (2023-10-31T00:12:14Z) - How Much is Enough? A Study on Diffusion Times in Score-based Generative
Models [76.76860707897413]
Current best practice advocates for a large T to ensure that the forward dynamics brings the diffusion sufficiently close to a known and simple noise distribution.
We show how an auxiliary model can be used to bridge the gap between the ideal and the simulated forward dynamics, followed by a standard reverse diffusion process.
arXiv Detail & Related papers (2022-06-10T15:09:46Z) - Deceive D: Adaptive Pseudo Augmentation for GAN Training with Limited
Data [125.7135706352493]
Generative adversarial networks (GANs) typically require ample data for training in order to synthesize high-fidelity images.
Recent studies have shown that training GANs with limited data remains formidable due to discriminator overfitting.
This paper introduces a novel strategy called Adaptive Pseudo Augmentation (APA) to encourage healthy competition between the generator and the discriminator.
arXiv Detail & Related papers (2021-11-12T18:13:45Z) - Negative Data Augmentation [127.28042046152954]
We show that negative data augmentation samples provide information on the support of the data distribution.
We introduce a new GAN training objective where we use NDA as an additional source of synthetic data for the discriminator.
Empirically, models trained with our method achieve improved conditional/unconditional image generation along with improved anomaly detection capabilities.
arXiv Detail & Related papers (2021-02-09T20:28:35Z) - High-Fidelity Synthesis with Disentangled Representation [60.19657080953252]
We propose an Information-Distillation Generative Adrial Network (ID-GAN) for disentanglement learning and high-fidelity synthesis.
Our method learns disentangled representation using VAE-based models, and distills the learned representation with an additional nuisance variable to the separate GAN-based generator for high-fidelity synthesis.
Despite the simplicity, we show that the proposed method is highly effective, achieving comparable image generation quality to the state-of-the-art methods using the disentangled representation.
arXiv Detail & Related papers (2020-01-13T14:39:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.