UnlearnCanvas: Stylized Image Dataset for Enhanced Machine Unlearning Evaluation in Diffusion Models
- URL: http://arxiv.org/abs/2402.11846v3
- Date: Fri, 14 Jun 2024 03:52:26 GMT
- Title: UnlearnCanvas: Stylized Image Dataset for Enhanced Machine Unlearning Evaluation in Diffusion Models
- Authors: Yihua Zhang, Chongyu Fan, Yimeng Zhang, Yuguang Yao, Jinghan Jia, Jiancheng Liu, Gaoyuan Zhang, Gaowen Liu, Ramana Rao Kompella, Xiaoming Liu, Sijia Liu,
- Abstract summary: diffusion models (DMs) have demonstrated unprecedented capabilities in text-to-image generation and are widely used in diverse applications.
They have also raised significant societal concerns, such as the generation of harmful content and copyright disputes.
Machine unlearning (MU) has emerged as a promising solution, capable of removing undesired generative capabilities from DMs.
- Score: 31.48739583108113
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The technological advancements in diffusion models (DMs) have demonstrated unprecedented capabilities in text-to-image generation and are widely used in diverse applications. However, they have also raised significant societal concerns, such as the generation of harmful content and copyright disputes. Machine unlearning (MU) has emerged as a promising solution, capable of removing undesired generative capabilities from DMs. However, existing MU evaluation systems present several key challenges that can result in incomplete and inaccurate assessments. To address these issues, we propose UnlearnCanvas, a comprehensive high-resolution stylized image dataset that facilitates the evaluation of the unlearning of artistic styles and associated objects. This dataset enables the establishment of a standardized, automated evaluation framework with 7 quantitative metrics assessing various aspects of the unlearning performance for DMs. Through extensive experiments, we benchmark 9 state-of-the-art MU methods for DMs, revealing novel insights into their strengths, weaknesses, and underlying mechanisms. Additionally, we explore challenging unlearning scenarios for DMs to evaluate worst-case performance against adversarial prompts, the unlearning of finer-scale concepts, and sequential unlearning. We hope that this study can pave the way for developing more effective, accurate, and robust DM unlearning methods, ensuring safer and more ethical applications of DMs in the future. The dataset, benchmark, and codes are publicly available at https://unlearn-canvas.netlify.app/.
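The abstract describes an automated evaluation framework with seven quantitative metrics. As a loose illustration of what one such metric might look like, the sketch below computes an "unlearning accuracy": the fraction of post-unlearning generations a style classifier no longer assigns to the erased style. The classifier outputs and style names here are toy stand-ins, not the paper's actual pipeline or metric definitions.

```python
# Minimal sketch of one possible unlearning metric: the fraction of
# post-unlearning generations NOT classified as the erased style.
# Predicted labels and style names are illustrative stand-ins.

def unlearning_accuracy(predicted_styles, erased_style):
    """Fraction of generations whose predicted style differs from the erased one."""
    if not predicted_styles:
        raise ValueError("need at least one prediction")
    misses = sum(1 for s in predicted_styles if s != erased_style)
    return misses / len(predicted_styles)

# Example: after unlearning "VanGogh", a style classifier labels 20 generations.
preds = ["Sketch"] * 17 + ["VanGogh"] * 3
print(unlearning_accuracy(preds, "VanGogh"))  # 0.85
```

A full framework would pair this with complementary metrics (e.g. retainability on untouched styles and generation quality), since a model that forgets everything trivially maximizes this one number.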
Related papers
- Advances in Diffusion Models for Image Data Augmentation: A Review of Methods, Models, Evaluation Metrics and Future Research Directions [6.2719115566879236]
Diffusion Models (DMs) have emerged as a powerful tool for image data augmentation.
DMs generate realistic and diverse images by learning the underlying data distribution.
Current challenges and future research directions in the field are discussed.
arXiv Detail & Related papers (2024-07-04T18:06:48Z) - ADer: A Comprehensive Benchmark for Multi-class Visual Anomaly Detection [52.228708947607636]
This paper proposes a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new anomaly detection methods.
The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
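Anomaly-detection benchmarks like the one summarized above typically report image-level AUROC among their metrics. As a hedged illustration (the scores and labels below are toy values, not outputs of any benchmarked method), AUROC can be computed directly from the Mann-Whitney U statistic:

```python
# Sketch of image-level AUROC, a standard anomaly-detection metric:
# the probability that a random anomalous image scores higher than a
# random normal image, with ties counted as half.

def auroc(scores, labels):
    """AUROC = P(anomaly score > normal score); labels are 1=anomalous, 0=normal."""
    pos = [s for s, y in zip(scores, labels) if y == 1]  # anomalous images
    neg = [s for s, y in zip(scores, labels) if y == 0]  # normal images
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auroc([0.9, 0.8, 0.4, 0.1], [1, 1, 0, 0]))  # 1.0: perfect separation
```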
arXiv Detail & Related papers (2024-06-05T13:40:07Z) - Slight Corruption in Pre-training Data Makes Better Diffusion Models [71.90034201302397]
Diffusion models (DMs) have shown remarkable capabilities in generating high-quality images, audios, and videos.
DMs benefit significantly from extensive pre-training on large-scale datasets.
However, pre-training datasets often contain corrupted pairs where conditions do not accurately describe the data.
This paper presents the first comprehensive study on the impact of such corruption in pre-training data of DMs.
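The "corrupted pair" setting described above, where conditions fail to describe their data, can be simulated simply. The sketch below replaces a fraction of captions with randomly drawn ones; the data and corruption scheme are simplified stand-ins for the paper's actual setup.

```python
# Illustrative sketch of condition corruption: with probability p, an image's
# caption is replaced by one drawn at random from the caption pool, so the
# condition may no longer describe the data it is paired with.
import random

def corrupt_captions(pairs, p, rng):
    """Return (image, caption) pairs where roughly a fraction p are mismatched."""
    captions = [c for _, c in pairs]
    out = []
    for img, cap in pairs:
        if rng.random() < p:
            cap = rng.choice(captions)  # may no longer describe img
        out.append((img, cap))
    return out

rng = random.Random(0)
clean = [("img0", "cat"), ("img1", "dog"), ("img2", "car")]
print(corrupt_captions(clean, 0.0, rng) == clean)  # True: p=0 leaves data intact
```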
arXiv Detail & Related papers (2024-05-30T21:35:48Z) - Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Models [42.734578139757886]
Diffusion models (DMs) have achieved remarkable success in text-to-image generation, but they also pose safety risks.
The techniques of machine unlearning, also known as concept erasing, have been developed to address these risks.
This work aims to enhance the robustness of concept erasing by integrating the principle of adversarial training (AT) into machine unlearning.
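The adversarial-training principle mentioned above is a min-max loop: an inner step finds the prompt that best re-elicits the erased concept, and an outer step unlearns against it. The sketch below shows only that control flow with toy stand-ins; the loss function, candidate set, and unlearning step are illustrative, not the paper's method.

```python
# Hedged sketch of AT-style concept erasure: inner maximization over a small
# candidate set of adversarial prompts, outer unlearning against the winner.

def worst_case_prompt(concept_loss, candidates):
    """Inner maximization: pick the prompt that most strongly evokes the concept."""
    return max(candidates, key=concept_loss)

def at_unlearn_step(concept_loss, unlearn_fn, base_prompt, paraphrases):
    """Outer minimization: apply one unlearning update against the worst case."""
    adv = worst_case_prompt(concept_loss, [base_prompt] + paraphrases)
    return unlearn_fn(adv)

# Toy stand-in loss: how often the prompt names the concept to be erased.
loss = lambda prompt: prompt.count("van gogh")
adv = worst_case_prompt(loss, ["a painting", "van gogh starry night",
                               "van gogh by van gogh"])
print(adv)  # "van gogh by van gogh"
```

In a real system the inner step would be a gradient-based search in prompt-embedding space rather than a discrete candidate scan, but the min-max structure is the same.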
arXiv Detail & Related papers (2024-05-24T05:47:23Z) - Multi-Modal Prompt Learning on Blind Image Quality Assessment [65.0676908930946]
Image Quality Assessment (IQA) models benefit significantly from semantic information, which allows them to treat different types of objects distinctly.
Traditional methods, hindered by a lack of sufficiently annotated data, have employed the CLIP image-text pretraining model as their backbone to gain semantic awareness.
Recent approaches have attempted to address this limitation using prompt technology, but these solutions have shortcomings.
This paper introduces an innovative multi-modal prompt-based methodology for IQA.
arXiv Detail & Related papers (2024-04-23T11:45:32Z) - To Generate or Not? Safety-Driven Unlearned Diffusion Models Are Still Easy To Generate Unsafe Images ... For Now [22.75295925610285]
Diffusion models (DMs) have revolutionized the generation of realistic and complex images.
DMs also introduce potential safety hazards, such as producing harmful content and infringing data copyrights.
Despite the development of safety-driven unlearning techniques, doubts about their efficacy persist.
arXiv Detail & Related papers (2023-10-18T10:36:34Z) - MinT: Boosting Generalization in Mathematical Reasoning via Multi-View Fine-Tuning [53.90744622542961]
Reasoning in mathematical domains remains a significant challenge for small language models (LMs).
We introduce a new method that exploits existing mathematical problem datasets with diverse annotation styles.
Experimental results show that our strategy enables a LLaMA-7B model to outperform prior approaches.
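The multi-view idea above (exploiting datasets with diverse annotation styles) amounts to rendering the same problem under several annotation formats and treating each rendering as a separate training example. The view names and templates below are hypothetical, for illustration only:

```python
# Minimal sketch of multi-view expansion: one (problem, answer) pair becomes
# one training example per annotation style. Views/templates are made up.

def expand_views(problem, answer, views):
    """Render a problem under each annotation view as a separate example."""
    return [
        {"input": f"[{view}] {problem}", "target": template.format(answer=answer)}
        for view, template in views.items()
    ]

views = {
    "cot": "Let's reason step by step. The answer is {answer}.",
    "equation": "answer = {answer}",
}
examples = expand_views("What is 3 + 4?", "7", views)
print(len(examples))  # 2: one example per view
```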
arXiv Detail & Related papers (2023-07-16T05:41:53Z) - Rank Flow Embedding for Unsupervised and Semi-Supervised Manifold Learning [9.171175292808144]
We propose a novel manifold learning algorithm named Rank Flow Embedding (RFE) for unsupervised and semi-supervised scenarios.
RFE computes context-sensitive embeddings, which are refined following a rank-based processing flow.
The generated embeddings can be exploited for more effective unsupervised retrieval or semi-supervised classification.
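The rank-based processing mentioned above rests on a simple primitive: replacing raw distances with rank positions, so neighborhoods are compared by ordering rather than magnitude. The sketch below shows only that primitive as a generic illustration; it is not the RFE algorithm itself.

```python
# Generic sketch of rank-based processing: each item's raw distance is
# replaced by its 0-based rank under ascending distance, making downstream
# comparisons scale-invariant.

def rank_positions(distances):
    """Map each item to its 0-based rank under ascending distance."""
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    ranks = [0] * len(distances)
    for pos, i in enumerate(order):
        ranks[i] = pos
    return ranks

print(rank_positions([0.9, 0.1, 0.5]))  # [2, 0, 1]
```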
arXiv Detail & Related papers (2023-04-24T21:02:12Z) - Domain Generalization for Mammographic Image Analysis with Contrastive Learning [62.25104935889111]
Training an effective deep learning model requires large datasets with diverse styles and qualities.
A novel contrastive learning scheme is developed to equip deep learning models with better style generalization capability.
The proposed method has been evaluated extensively and rigorously with mammograms from various vendor style domains and several public datasets.
arXiv Detail & Related papers (2023-04-20T11:40:21Z) - Harnessing the Power of Text-image Contrastive Models for Automatic Detection of Online Misinformation [50.46219766161111]
We develop a self-learning model that explores contrastive learning in the domain of misinformation identification.
Our model shows superior performance in detecting non-matched image-text pairs when training data is insufficient.
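The mismatch signal a text-image contrastive model provides can be sketched as a similarity threshold: an image-text pair whose embeddings have low cosine similarity is flagged as possibly mismatched, and therefore suspect. The embeddings and threshold below are toy values, not the paper's model.

```python
# Hedged sketch of contrastive mismatch detection: below-threshold cosine
# similarity between image and text embeddings flags a suspect pair.
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def is_mismatched(img_emb, txt_emb, threshold=0.5):
    """Flag a pair whose embeddings are insufficiently aligned."""
    return cosine(img_emb, txt_emb) < threshold

print(is_mismatched([1.0, 0.0], [0.0, 1.0]))  # True: orthogonal embeddings
```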
arXiv Detail & Related papers (2023-04-19T02:53:59Z) - An Interpretable and Uncertainty Aware Multi-Task Framework for Multi-Aspect Sentiment Analysis [15.755185152760083]
Document-level Multi-aspect Sentiment Classification (DMSC) is a challenging and pressing problem.
We propose a deliberate self-attention-based deep neural network model, namely FEDAR, for the DMSC problem.
FEDAR can achieve competitive performance while also being able to interpret the predictions made.
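Self-attention supports interpretability because its weights form a normalized distribution over tokens, so the highest-weight tokens can be read off as the model's evidence for a prediction. The sketch below illustrates only that mechanism; the scores are toy values, not FEDAR's learned attention.

```python
# Illustrative sketch of attention-based interpretability: softmax turns raw
# token scores into a distribution, and the top-weighted token is the one the
# (toy) model treats as its strongest evidence.
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

tokens = ["the", "food", "was", "amazing"]
weights = softmax([0.1, 1.0, 0.1, 3.0])
top = tokens[max(range(len(tokens)), key=lambda i: weights[i])]
print(top)  # "amazing": the token the toy attention highlights
```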
arXiv Detail & Related papers (2020-09-18T22:32:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.