VICTOR: Dataset Copyright Auditing in Video Recognition Systems
- URL: http://arxiv.org/abs/2512.14439v1
- Date: Tue, 16 Dec 2025 14:26:01 GMT
- Title: VICTOR: Dataset Copyright Auditing in Video Recognition Systems
- Authors: Quan Yuan, Zhikun Zhang, Linkang Du, Min Chen, Mingyang Sun, Yunjun Gao, Shibo He, Jiming Chen
- Abstract summary: We propose VICTOR, the first dataset copyright auditing approach for video recognition systems. VICTOR amplifies the impact of published modified samples on the prediction behavior of the target models. We show that VICTOR is robust in the presence of several perturbation mechanisms applied to the training videos or the target models.
- Score: 47.270150440169324
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video recognition systems are increasingly being deployed in daily life, such as content recommendation and security monitoring. To enhance video recognition development, many institutions have released high-quality public datasets with open-source licenses for training advanced models. At the same time, these datasets are also susceptible to misuse and infringement. Dataset copyright auditing is an effective solution to identify such unauthorized use. However, existing dataset copyright solutions primarily focus on the image domain; the complex nature of video data leaves dataset copyright auditing in the video domain unexplored. Specifically, video data introduces an additional temporal dimension, which poses significant challenges to the effectiveness and stealthiness of existing methods. In this paper, we propose VICTOR, the first dataset copyright auditing approach for video recognition systems. We develop a general and stealthy sample modification strategy that enhances the output discrepancy of the target model. By modifying only a small proportion of samples (e.g., 1%), VICTOR amplifies the impact of published modified samples on the prediction behavior of the target models. Then, the difference in the model's behavior for published modified and unpublished original samples can serve as a key basis for dataset auditing. Extensive experiments on multiple models and datasets highlight the superiority of VICTOR. Finally, we show that VICTOR is robust in the presence of several perturbation mechanisms to the training videos or the target models.
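The auditing idea sketched in the abstract, i.e. testing whether a suspect model behaves systematically differently on published modified samples than on held-back unpublished originals, can be illustrated with a minimal sketch. All names, and the normal-approximation z-test, are illustrative assumptions, not VICTOR's actual statistic:

```python
import statistics

def audit_dataset_usage(conf_published_modified, conf_unpublished_original,
                        z_threshold=1.645):
    """Hypothetical auditing decision in the spirit of the abstract.

    Takes the target model's confidence scores on paired samples: the
    published (modified) copies and the unpublished originals held back
    by the dataset owner. If the model was trained on the published
    dataset, it should be systematically more confident on the modified
    published samples.
    """
    gaps = [p - u for p, u in zip(conf_published_modified,
                                  conf_unpublished_original)]
    mean_gap = statistics.mean(gaps)
    # Standard error of the mean gap; a one-sided z-test at ~5% asks
    # whether the behavioral gap is statistically significant.
    se = statistics.stdev(gaps) / len(gaps) ** 0.5
    return (mean_gap / se) > z_threshold  # True -> dataset likely used
```

A model that never saw the published samples should show no systematic gap, so the test fails to reject and the audit returns False.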
Related papers
- Temporal Unlearnable Examples: Preventing Personal Video Data from Unauthorized Exploitation by Object Tracking [90.81846867441993]
This paper presents the first investigation on preventing personal video data from unauthorized exploitation by deep trackers. We propose a novel generative framework for generating Temporal Unlearnable Examples (TUEs). Our approach achieves state-of-the-art performance in video data-privacy protection, with strong transferability across VOT models, datasets, and temporal matching tasks.
arXiv Detail & Related papers (2025-07-10T07:11:33Z)
- Seeing What Matters: Generalizable AI-generated Video Detection with Forensic-Oriented Augmentation [31.737159092430108]
We study different generative architectures, searching for and identifying discriminative features that are unbiased, robust to impairments, and shared across models. We introduce a novel data augmentation strategy based on the wavelet decomposition and replace specific frequency-related bands to drive the model to exploit more relevant forensic cues. Our method achieves a significant accuracy improvement over state-of-the-art detectors and obtains excellent results even on very recent generative models.
arXiv Detail & Related papers (2025-06-20T07:36:59Z)
- AnyAnomaly: Zero-Shot Customizable Video Anomaly Detection with LVLM [2.5988879420706095]
Video anomaly detection (VAD) is crucial for video analysis and surveillance in computer vision. Existing VAD models rely on learned normal patterns, which makes them difficult to apply to diverse environments. This study proposes a customizable video anomaly detection (C-VAD) technique and the AnyAnomaly model.
arXiv Detail & Related papers (2025-03-06T14:52:34Z)
- Can Large Vision-Language Models Detect Images Copyright Infringement from GenAI? [22.898606027486593]
We focus on evaluating the copyright detection abilities of state-of-the-art LVLMs using a varied set of image samples. We construct a benchmark dataset comprising positive samples that violate the copyright protection of well-known IP figures, as well as negative samples that resemble these figures but do not raise copyright concerns. Our experimental results reveal that LVLMs are prone to overfitting, leading to the misclassification of some negative samples as IP-infringement cases.
arXiv Detail & Related papers (2025-02-23T15:41:12Z)
- Tracking the Copyright of Large Vision-Language Models through Parameter Learning Adversarial Images [9.351260848685229]
Large vision-language models (LVLMs) have demonstrated remarkable image understanding and dialogue capabilities. Their widespread availability raises concerns about unauthorized usage and copyright infringement. We propose a novel method called Parameter Learning Attack (PLA) for tracking the copyright of LVLMs without modifying the original model.
arXiv Detail & Related papers (2025-02-23T14:49:34Z)
- CAP: Detecting Unauthorized Data Usage in Generative Models via Prompt Generation [1.6141139250981018]
Copyright Audit via Prompts generation (CAP) is a framework for automatically testing whether an ML model has been trained with unauthorized data.
Specifically, we devise an approach to generate suitable keys inducing the model to reveal copyrighted contents.
To prove its effectiveness, we conducted an extensive evaluation campaign on measurements collected in four IoT scenarios.
arXiv Detail & Related papers (2024-10-08T08:49:41Z)
- VANE-Bench: Video Anomaly Evaluation Benchmark for Conversational LMMs [64.60035916955837]
VANE-Bench is a benchmark designed to assess the proficiency of Video-LMMs in detecting anomalies and inconsistencies in videos. Our dataset comprises an array of videos synthetically generated using existing state-of-the-art text-to-video generation models. We evaluate nine existing Video-LMMs, both open- and closed-source, on this benchmarking task and find that most of the models encounter difficulties in effectively identifying the subtle anomalies.
arXiv Detail & Related papers (2024-06-14T17:59:01Z)
- Generative Models are Self-Watermarked: Declaring Model Authentication through Re-Generation [17.88043926057354]
Verifying data ownership poses formidable challenges, particularly in cases of unauthorized reuse of generated data.
Our work is dedicated to detecting data reuse from even an individual sample.
We propose an explainable verification procedure that attributes data ownership through re-generation, and further amplifies these fingerprints in the generative models through iterative data re-generation.
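The re-generation idea can be illustrated with a toy sketch. The helper names and the contraction-style "generator" below are hypothetical stand-ins, not the paper's actual models: the intuition is that samples produced by a generative model sit near fixed points of that model, so re-generating them moves them very little, while foreign samples move noticeably.

```python
def regeneration_distance(x, regenerate, rounds=3):
    """Total movement of a sample under iterated re-generation.
    Samples originating from `regenerate`'s own model should move little,
    which serves as an implicit fingerprint for ownership attribution."""
    total = 0.0
    for _ in range(rounds):
        y = regenerate(x)
        total += abs(y - x)  # toy 1-D distance; any metric would do
        x = y
    return total

# Toy 1-D "generator": contracts every input toward its fixed point 0.5.
toy_regen = lambda x: 0.5 + 0.2 * (x - 0.5)

own_sample = 0.5      # already at the generator's fixed point: moves not at all
foreign_sample = 0.9  # not produced by this generator: moves substantially
```

Under this toy model, `regeneration_distance(own_sample, toy_regen)` is far smaller than for the foreign sample, which is the discrepancy the verification procedure exploits.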
arXiv Detail & Related papers (2024-02-23T10:48:21Z)
- A Dataset and Benchmark for Copyright Infringement Unlearning from Text-to-Image Diffusion Models [52.49582606341111]
Copyright law grants creators the exclusive rights to reproduce, distribute, and monetize their creative works.
Recent progress in text-to-image generation has introduced formidable challenges to copyright enforcement.
We introduce a novel pipeline that harmonizes CLIP, ChatGPT, and diffusion models to curate a dataset.
arXiv Detail & Related papers (2024-01-04T11:14:01Z)
- Model Stealing Attack against Graph Classification with Authenticity, Uncertainty and Diversity [80.16488817177182]
Graph neural networks (GNNs) are vulnerable to model stealing attacks, nefarious endeavors geared towards duplicating the target model via query access.
We introduce three model stealing attacks to adapt to different actual scenarios.
arXiv Detail & Related papers (2023-12-18T05:42:31Z)
- AVTENet: A Human-Cognition-Inspired Audio-Visual Transformer-Based Ensemble Network for Video Deepfake Detection [49.81915942821647]
This study introduces the audio-visual transformer-based ensemble network (AVTENet) to detect deepfake videos. For evaluation, we use the recently released benchmark multimodal audio-video FakeAVCeleb dataset. For a detailed analysis, we evaluate AVTENet, its variants, and several existing methods on multiple test sets of the FakeAVCeleb dataset.
arXiv Detail & Related papers (2023-10-19T19:01:26Z)
- Confidence Attention and Generalization Enhanced Distillation for Continuous Video Domain Adaptation [62.458968086881555]
Continuous Video Domain Adaptation (CVDA) is a scenario where a source model is required to adapt to a series of individually available changing target domains.
We propose a Confidence-Attentive network with geneRalization enhanced self-knowledge disTillation (CART) to address the challenge in CVDA.
arXiv Detail & Related papers (2023-03-18T16:40:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.