Related papers: HoneyImage: Verifiable, Harmless, and Stealthy Dataset Ownership Verification for Image Models

URL: http://arxiv.org/abs/2508.00892v1
Date: Sun, 27 Jul 2025 08:44:47 GMT
Title: HoneyImage: Verifiable, Harmless, and Stealthy Dataset Ownership Verification for Image Models
Authors: Zhihao Zhu, Jiale Han, Yi Yang
Abstract summary: HoneyImage is a novel method for dataset ownership verification in image recognition models. HoneyImage selectively modifies a small number of hard samples to embed imperceptible yet verifiable traces. Experiments show that HoneyImage consistently achieves strong verification accuracy with minimal impact on downstream performance.
Score: 20.15391412550277
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Image-based AI models are increasingly deployed across a wide range of domains, including healthcare, security, and consumer applications. However, many image datasets carry sensitive or proprietary content, raising critical concerns about unauthorized data usage. Data owners therefore need reliable mechanisms to verify whether their proprietary data has been misused to train third-party models. Existing solutions, such as backdoor watermarking and membership inference, face inherent trade-offs between verification effectiveness and preservation of data integrity. In this work, we propose HoneyImage, a novel method for dataset ownership verification in image recognition models. HoneyImage selectively modifies a small number of hard samples to embed imperceptible yet verifiable traces, enabling reliable ownership verification while maintaining dataset integrity. Extensive experiments across four benchmark datasets and multiple model architectures show that HoneyImage consistently achieves strong verification accuracy with minimal impact on downstream performance while remaining imperceptible. The proposed HoneyImage method could provide data owners with a practical mechanism to protect ownership of valuable image datasets, encouraging safe sharing and unlocking the full transformative potential of data-driven AI.

Related papers

Leave No TRACE: Black-box Detection of Copyrighted Dataset Usage in Large Language Models via Watermarking [51.74368870268278]
We propose TRACE, a framework for fully black-box detection of copyrighted dataset usage in large language models. TRACE rewrites datasets with distortion-free watermarks guided by a private key. Across diverse datasets and model families, TRACE consistently achieves significant detections.
arXiv Detail & Related papers (2025-10-03T12:53:02Z)
CertDW: Towards Certified Dataset Ownership Verification via Conformal Prediction [48.82467166657901]
We propose the first certified dataset watermark (i.e., CertDW) and CertDW-based certified dataset ownership verification method. Inspired by conformal prediction, we introduce two statistical measures, including principal probability (PP) and watermark robustness (WR). We prove there exists a provable lower bound between PP and WR, enabling ownership verification when a suspicious model's WR value significantly exceeds the PP values of benign models trained on watermark-free datasets.
arXiv Detail & Related papers (2025-06-16T07:17:23Z)
RAID: A Dataset for Testing the Adversarial Robustness of AI-Generated Image Detectors [57.81012948133832]
We present RAID (Robust evaluation of AI-generated image Detectors), a dataset of 72k diverse and highly transferable adversarial examples. Our methodology generates adversarial images that transfer with a high success rate to unseen detectors. Our findings indicate that current state-of-the-art AI-generated image detectors can be easily deceived by adversarial examples.
arXiv Detail & Related papers (2025-06-04T14:16:00Z)
Where's the liability in the Generative Era? Recovery-based Black-Box Detection of AI-Generated Content [53.93606081932928]
We introduce a novel black-box detection framework that requires only API access. We measure the likelihood that the image was generated by the model itself. For black-box models that do not support masked image inputs, we incorporate a cost-efficient surrogate model trained to align with the target model distribution.
arXiv Detail & Related papers (2025-05-02T05:11:35Z)
Harnessing Frequency Spectrum Insights for Image Copyright Protection Against Diffusion Models [26.821064889438777]
We present novel evidence that diffusion-generated images faithfully preserve the statistical properties of their training data. We introduce CoprGuard, a robust frequency domain watermarking framework to safeguard against unauthorized image usage.
arXiv Detail & Related papers (2025-03-14T04:27:50Z)
Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models [68.90917438865078]
Deepfake techniques for facial synthesis and editing pose serious risks for generative models. In this paper, we investigate how detection performance varies across model backbones, types, and datasets. We introduce Contrastive Blur, which enhances performance on facial images, and MINDER, which addresses noise type bias, balancing performance across domains.
arXiv Detail & Related papers (2024-11-28T13:04:45Z)
Towards Reliable Verification of Unauthorized Data Usage in Personalized Text-to-Image Diffusion Models [23.09033991200197]
New personalization techniques have been proposed to customize the pre-trained base models for crafting images with specific themes or styles. Such a lightweight solution poses a new concern regarding whether the personalized models are trained from unauthorized data. We introduce SIREN, a novel methodology to proactively trace unauthorized data usage in black-box personalized text-to-image diffusion models.
arXiv Detail & Related papers (2024-10-14T12:29:23Z)
Detecting Dataset Abuse in Fine-Tuning Stable Diffusion Models for Text-to-Image Synthesis [3.8809673918404246]
We present a dataset watermarking framework designed to detect unauthorized usage and trace data leaks.
arXiv Detail & Related papers (2024-09-27T16:34:48Z)
EnTruth: Enhancing the Traceability of Unauthorized Dataset Usage in Text-to-image Diffusion Models with Minimal and Robust Alterations [73.94175015918059]
We introduce a novel approach, EnTruth, which Enhances Traceability of unauthorized dataset usage. By strategically incorporating template memorization, EnTruth can trigger specific behavior in unauthorized models as evidence of infringement. Our method is the first to investigate the positive application of memorization and use it for copyright protection, which turns a curse into a blessing.
arXiv Detail & Related papers (2024-06-20T02:02:44Z)
Masked Images Are Counterfactual Samples for Robust Fine-tuning [77.82348472169335]
Fine-tuning deep learning models can lead to a trade-off between in-distribution (ID) performance and out-of-distribution (OOD) robustness. We propose a novel fine-tuning method, which uses masked images as counterfactual samples that help improve the robustness of the fine-tuning model.
arXiv Detail & Related papers (2023-03-06T11:51:28Z)
ConfounderGAN: Protecting Image Data Privacy with Causal Confounder [85.6757153033139]
We propose ConfounderGAN, a generative adversarial network (GAN) that can make personal image data unlearnable to protect the data privacy of its owners. Experiments are conducted in six image classification datasets, consisting of three natural object datasets and three medical datasets.
arXiv Detail & Related papers (2022-12-04T08:49:14Z)
Black-box Dataset Ownership Verification via Backdoor Watermarking [67.69308278379957]
We formulate the protection of released datasets as verifying whether they are adopted for training a (suspicious) third-party model. We propose to embed external patterns via backdoor watermarking for the ownership verification to protect them. Specifically, we exploit poison-only backdoor attacks (e.g., BadNets) for dataset watermarking and design a hypothesis-test-guided method for dataset verification.
arXiv Detail & Related papers (2022-08-04T05:32:20Z)
Generative Modeling Helps Weak Supervision (and Vice Versa) [87.62271390571837]
We propose a model fusing weak supervision and generative adversarial networks. It captures discrete variables in the data alongside the weak supervision derived label estimate. It is the first approach to enable data augmentation through weakly supervised synthetic images and pseudolabels.
arXiv Detail & Related papers (2022-03-22T20:24:21Z)

This list is automatically generated from the titles and abstracts of the papers on this site.