VizDefender: Unmasking Visualization Tampering through Proactive Localization and Intent Inference
- URL: http://arxiv.org/abs/2512.18853v1
- Date: Sun, 21 Dec 2025 18:44:03 GMT
- Title: VizDefender: Unmasking Visualization Tampering through Proactive Localization and Intent Inference
- Authors: Sicheng Song, Yanjie Zhang, Zixin Chen, Huamin Qu, Changbo Wang, Chenhui Li
- Abstract summary: VizDefender is a framework for tampering detection and analysis. The framework integrates two core components: 1) a semi-fragile watermark module that protects the visualization by embedding a location map into images, and 2) an intent analysis module that leverages Multimodal Large Language Models (MLLMs) to interpret manipulation.
- Score: 53.31458914370742
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The integrity of data visualizations is increasingly threatened by image editing techniques that enable subtle yet deceptive tampering. Through a formative study, we define this challenge and categorize tampering techniques into two primary types: data manipulation and visual encoding manipulation. To address this, we present VizDefender, a framework for tampering detection and analysis. The framework integrates two core components: 1) a semi-fragile watermark module that protects the visualization by embedding a location map to images, which allows for the precise localization of tampered regions while preserving visual quality, and 2) an intent analysis module that leverages Multimodal Large Language Models (MLLMs) to interpret manipulation, inferring the attacker's intent and misleading effects. Extensive evaluations and user studies demonstrate the effectiveness of our methods.
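The localization idea behind the watermark module can be illustrated with a much simpler scheme. The sketch below is not VizDefender's method: it swaps in a classic fragile block-wise LSB watermark, whereas the abstract describes a semi-fragile watermark, which this sketch does not attempt to reproduce. The names `BLOCK`, `embed`, `locate_tampering`, and the demo key are hypothetical, and the image is assumed to be a grayscale grid whose dimensions are multiples of `BLOCK`.

```python
import hashlib

BLOCK = 4  # hypothetical block size in pixels (assumption, not from the paper)


def _block_bits(img, by, bx, key):
    """Derive one check bit per pixel from the 7 upper bits of a block."""
    msb = bytes(img[y][x] >> 1
                for y in range(by, by + BLOCK)
                for x in range(bx, bx + BLOCK))
    # Bind the hash to the block position so blocks cannot be swapped.
    pos = (by // BLOCK).to_bytes(4, "big") + (bx // BLOCK).to_bytes(4, "big")
    digest = hashlib.sha256(key + pos + msb).digest()
    return [(digest[i // 8] >> (i % 8)) & 1 for i in range(BLOCK * BLOCK)]


def embed(img, key=b"demo-key"):
    """Return a copy of img with check bits written into each pixel's LSB."""
    out = [row[:] for row in img]
    for by in range(0, len(img), BLOCK):
        for bx in range(0, len(img[0]), BLOCK):
            bits = iter(_block_bits(img, by, bx, key))
            for y in range(by, by + BLOCK):
                for x in range(bx, bx + BLOCK):
                    out[y][x] = (img[y][x] & ~1) | next(bits)
    return out


def locate_tampering(img, key=b"demo-key"):
    """Return (block_row, block_col) for every block whose check bits fail."""
    flagged = []
    for by in range(0, len(img), BLOCK):
        for bx in range(0, len(img[0]), BLOCK):
            expected = _block_bits(img, by, bx, key)
            actual = [img[y][x] & 1
                      for y in range(by, by + BLOCK)
                      for x in range(bx, bx + BLOCK)]
            if expected != actual:
                flagged.append((by // BLOCK, bx // BLOCK))
    return flagged


if __name__ == "__main__":
    # Synthetic 8x8 grayscale "chart" used only for demonstration.
    chart = [[(x * 17 + y * 31) % 256 for x in range(8)] for y in range(8)]
    protected = embed(chart)
    protected[5][6] ^= 1  # simulate a small edit inside block (1, 1)
    print(locate_tampering(protected))
```

Each pixel changes by at most one intensity level, so the mark stays visually negligible, and the list of failing blocks plays the role of a coarse location map. A fragile mark like this one also flags benign operations such as recompression, which is the limitation a semi-fragile design is meant to address.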
Related papers
- Beyond Artificial Misalignment: Detecting and Grounding Semantic-Coordinated Multimodal Manipulations [56.816929931908824]
We pioneer the detection of semantically-coordinated manipulations in multimodal data.
We propose a Retrieval-Augmented Manipulation Detection and Grounding (RamDG) framework.
Our framework significantly outperforms existing methods, achieving 2.06% higher detection accuracy on SAMM compared to state-of-the-art approaches.
arXiv Detail & Related papers (2025-09-16T04:18:48Z)
- REVEAL -- Reasoning and Evaluation of Visual Evidence through Aligned Language [0.1388281922732496]
We frame this problem of forgery detection as a prompt-driven visual reasoning task, leveraging the semantic alignment capabilities of large vision-language models.
We propose two approaches - (1) Holistic Scene-level Evaluation that relies on the physics, semantics, perspective, and realism of the image as a whole and (2) Region-wise anomaly detection that splits the image into multiple regions and analyzes each of them.
arXiv Detail & Related papers (2025-08-18T00:42:02Z)
- OFFSET: Segmentation-based Focus Shift Revision for Composed Image Retrieval [59.377821673653436]
Composed Image Retrieval (CIR) is capable of expressing users' intricate retrieval requirements flexibly.
CIR remains in its nascent stages due to two limitations: 1) inhomogeneity between dominant and noisy portions in visual data is ignored, leading to query feature degradation.
This work presents a focus mapping-based feature extractor, which consists of two modules: dominant portion segmentation and dual focus mapping.
arXiv Detail & Related papers (2025-07-08T03:27:46Z)
- FakeShield: Explainable Image Forgery Detection and Localization via Multi-modal Large Language Models [16.737419222106308]
FakeShield is a framework capable of evaluating image authenticity, generating tampered region masks, and providing a judgment basis based on pixel-level and image-level tampering clues.
In experiments, FakeShield effectively detects and localizes various tampering techniques, offering an explainable and superior solution compared to previous IFDL methods.
arXiv Detail & Related papers (2024-10-03T17:59:34Z)
- Dual-Image Enhanced CLIP for Zero-Shot Anomaly Detection [58.228940066769596]
We introduce a Dual-Image Enhanced CLIP approach, leveraging a joint vision-language scoring system.
Our methods process pairs of images, utilizing each as a visual reference for the other, thereby enriching the inference process with visual context.
Our approach significantly exploits the potential of vision-language joint anomaly detection and demonstrates comparable performance with current SOTA methods across various datasets.
arXiv Detail & Related papers (2024-05-08T03:13:20Z)
- Exploring Saliency Bias in Manipulation Detection [2.156234249946792]
The social-media-fuelled explosion of fake news and misinformation supported by tampered images has led to growth in the development of models and datasets for image manipulation detection.
Existing detection methods mostly treat media objects in isolation, without considering the impact of specific manipulations on viewer perception.
We propose a framework to analyze the trends of visual and semantic saliency in popular image manipulation datasets and their impact on detection.
arXiv Detail & Related papers (2024-02-12T00:08:51Z)
- Towards General Visual-Linguistic Face Forgery Detection [95.73987327101143]
Deepfakes are realistic face manipulations that can pose serious threats to security, privacy, and trust.
Existing methods mostly treat this task as binary classification, which uses digital labels or mask signals to train the detection model.
We propose a novel paradigm named Visual-Linguistic Face Forgery Detection (VLFFD), which uses fine-grained sentence-level prompts as the annotation.
arXiv Detail & Related papers (2023-07-31T10:22:33Z)
- Towards Effective Image Manipulation Detection with Proposal Contrastive Learning [61.5469708038966]
We propose Proposal Contrastive Learning (PCL) for effective image manipulation detection.
Our PCL consists of a two-stream architecture by extracting two types of global features from RGB and noise views respectively.
Our PCL can be easily adapted to unlabeled data in practice, which can reduce manual labeling costs and promote more generalizable features.
arXiv Detail & Related papers (2022-10-16T13:30:13Z)
- Object Class Aware Video Anomaly Detection through Image Translation [1.2944868613449219]
This paper proposes a novel two-stream object-aware VAD method that learns the normal appearance and motion patterns through image translation tasks.
The results show that, in a significant improvement over previous methods, detections by our method are completely explainable and anomalies are localized accurately in the frames.
arXiv Detail & Related papers (2022-05-03T18:04:27Z)
- Detect and Locate: A Face Anti-Manipulation Approach with Semantic and Noise-level Supervision [67.73180660609844]
We propose a conceptually simple but effective method to efficiently detect forged faces in an image.
The proposed scheme relies on a segmentation map that delivers meaningful high-level semantic information clues about the image.
The proposed model achieves state-of-the-art detection accuracy and remarkable localization performance.
arXiv Detail & Related papers (2021-07-13T02:59:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.