ReVision: A Post-Hoc, Vision-Based Technique for Replacing Unacceptable Concepts in Image Generation Pipeline
- URL: http://arxiv.org/abs/2602.19149v2
- Date: Mon, 02 Mar 2026 07:13:22 GMT
- Title: ReVision: A Post-Hoc, Vision-Based Technique for Replacing Unacceptable Concepts in Image Generation Pipeline
- Authors: Gurjot Singh, Prabhjot Singh, Aashima Sharma, Maninder Singh, Ryan Ko
- Abstract summary: ReVision is a training-free, prompt-based, post-hoc safety framework for image-generation pipelines. It selectively edits unsafe concepts without altering the underlying generator. It uses the Gemini-2.5-Flash model as a generic policy-violating concept detector.
- Score: 0.695942427153803
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Image-generative models are widely deployed across industries. Recent studies show that they can be exploited to produce policy-violating content. Existing mitigation strategies primarily operate at the pre- or mid-generation stages through techniques such as prompt filtering and safety-aware training/fine-tuning. Prior work shows that these approaches can be bypassed and often degrade generative quality. In this work, we propose ReVision, a training-free, prompt-based, post-hoc safety framework for image-generation pipelines. ReVision acts as a last line of defense by analyzing generated images and selectively editing unsafe concepts without altering the underlying generator. It uses the Gemini-2.5-Flash model as a generic policy-violating concept detector, avoiding reliance on multiple category-specific detectors, and performs localized semantic editing to replace unsafe content. Prior post-hoc editing methods often rely on imprecise spatial localization, which undermines usability and limits deployability, particularly in multi-concept scenes. To address this limitation, ReVision introduces a VLM-assisted spatial gating mechanism that enforces instance-consistent localization, enabling precise edits while preserving scene integrity. We evaluate ReVision on a 245-image benchmark covering both single- and multi-concept scenarios. Results show that ReVision (i) improves CLIP-based alignment toward safe prompts by +$0.121$ on average; (ii) significantly improves multi-concept background fidelity (LPIPS $0.166 \rightarrow 0.058$); (iii) achieves near-complete suppression on category-specific detectors (e.g., NudeNet $70.51 \rightarrow 0$); and (iv) reduces policy-violating content recognizability in a human moderation study from $95.99\%$ to $10.16\%$.
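The detect-then-edit loop described in the abstract is straightforward to express in code. Below is a minimal, hypothetical sketch of a ReVision-style post-hoc pass; the `Violation` schema, the box-based mask format, and the callables wrapping the VLM detector and the localized editor are all illustrative assumptions, not the authors' implementation (the paper's detector is Gemini-2.5-Flash behind an API).

```python
# Minimal sketch of a ReVision-style post-hoc safety loop (not the authors' code).
from dataclasses import dataclass
from typing import Callable
from PIL import Image

@dataclass
class Violation:
    concept: str                       # e.g. "weapon" (policy-violating concept)
    box: tuple[int, int, int, int]     # region from VLM-assisted spatial gating
    replacement: str                   # safe concept to substitute

def revise(image: Image.Image,
           detect: Callable[[Image.Image], list[Violation]],
           edit: Callable[[Image.Image, tuple, str], Image.Image]) -> Image.Image:
    """Post-hoc, training-free safety pass: the generator is never touched.
    `detect` wraps the VLM policy detector; `edit` is a localized semantic
    editor (e.g. mask-restricted inpainting) applied only inside each box."""
    for v in detect(image):
        image = edit(image, v.box, v.replacement)
    return image
```

Instance-consistent localization (the paper's spatial gating) matters because `edit` should touch only the flagged instance; everything outside the gated region is preserved, which is what the background-fidelity result measures.

The abstract's two automatic metrics also have standard open-source counterparts. The sketch below assumes CLIP via HuggingFace `transformers` (the `openai/clip-vit-base-patch32` checkpoint is my placeholder choice) and perceptual distance via the `lpips` package; the paper's exact model variants and preprocessing may differ.

```python
# Sketch of the two automatic metrics, under assumed open-source implementations.
import torch
import lpips
from transformers import CLIPModel, CLIPProcessor

clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
lpips_fn = lpips.LPIPS(net="alex")  # perceptual distance, lower = more similar

def clip_alignment(image, safe_prompt: str) -> float:
    """Cosine similarity between an (edited) image and the safe prompt;
    the paper reports an average gain of +0.121 on this kind of score."""
    inputs = proc(text=[safe_prompt], images=image,
                  return_tensors="pt", padding=True)
    with torch.no_grad():
        out = clip(**inputs)
    img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
    txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
    return float((img * txt).sum())

def background_fidelity(original: torch.Tensor, edited: torch.Tensor) -> float:
    """LPIPS between original and edited images (NCHW tensors in [-1, 1]);
    the paper reports 0.166 -> 0.058 on multi-concept backgrounds."""
    with torch.no_grad():
        return float(lpips_fn(original, edited))
```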
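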
Related papers
- Towards Policy-Adaptive Image Guardrail: Benchmark and Method [21.041111216560545]
Vision-language models (VLMs) offer a more adaptable and generalizable foundation for dynamic safety guardrails. Existing VLM-based safeguarding methods are typically trained and evaluated under a single fixed safety policy. We introduce SafeGuard-VL, a reinforcement learning-based method with verifiable rewards for robust unsafe-image guardrails.
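As a rough illustration of what a "verifiable reward" means in this RL setting: the reward is computed by direct comparison against ground-truth policy labels rather than by a learned reward model. The toy function below is an assumption about the setup, not SafeGuard-VL's actual reward.

```python
# Toy verifiable reward for guardrail RL: checkable against ground truth,
# so no learned reward model is required. The label format is an assumption.
def verifiable_reward(predicted_category: str, gold_category: str) -> float:
    """1.0 for an exact policy-category match, 0.0 otherwise."""
    return 1.0 if predicted_category.strip().lower() == gold_category.strip().lower() else 0.0
```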
arXiv Detail & Related papers (2026-03-01T18:59:21Z)
- SafeRedir: Prompt Embedding Redirection for Robust Unlearning in Image Generation Models [67.84174763413178]
We introduce SafeRedir, a lightweight inference-time framework for robust unlearning via prompt embedding redirection. We show that SafeRedir achieves effective unlearning, high semantic and perceptual preservation, robust image quality, and enhanced resistance to adversarial attacks.
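In spirit, prompt embedding redirection can be as simple as steering a flagged prompt's embedding toward a safe anchor before it conditions the generator. The linear redirection rule and all names below are my assumptions for illustration, not SafeRedir's method.

```python
# Hedged sketch of inference-time prompt-embedding redirection.
import torch

def redirect_embedding(prompt_emb: torch.Tensor,
                       safe_anchor: torch.Tensor,
                       unsafe_score: float,
                       threshold: float = 0.5,
                       strength: float = 0.8) -> torch.Tensor:
    """Benign prompts (score below threshold) pass through untouched, which
    is how quality is preserved; flagged prompts are pulled toward a safe
    anchor embedding before conditioning the diffusion model."""
    if unsafe_score < threshold:
        return prompt_emb
    return (1.0 - strength) * prompt_emb + strength * safe_anchor
```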
arXiv Detail & Related papers (2026-01-13T15:01:38Z)
- ConceptGuard: Proactive Safety in Text-and-Image-to-Video Generation through Multimodal Risk Detection [27.47621607462884]
ConceptGuard is a framework for proactively detecting and mitigating unsafe semantics in multimodal video generation. A contrastive detection module identifies latent safety risks by projecting fused image-text inputs into a structured concept space. A semantic suppression mechanism steers the generative process away from unsafe concepts by intervening in the prompt's multimodal conditioning.
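A minimal sketch of contrastive detection against a bank of unsafe concept vectors, assuming cosine similarity in the projected concept space; the fusion and projection details are not specified in the summary and are treated here as given.

```python
# Illustrative concept-space risk detection (an assumption, not the paper's code).
import torch
import torch.nn.functional as F

def detect_risk(fused_emb: torch.Tensor,        # (D,) fused image-text vector
                concept_bank: torch.Tensor,     # (K, D) unsafe concept vectors
                threshold: float = 0.3):
    """Flag the best-matching unsafe concept if its cosine similarity in the
    concept space exceeds the threshold; return None otherwise."""
    sims = F.cosine_similarity(fused_emb.unsqueeze(0), concept_bank, dim=-1)
    score, idx = sims.max(dim=0)
    return (int(idx), float(score)) if score > threshold else None
```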
arXiv Detail & Related papers (2025-11-24T05:27:05Z)
- SafeR-CLIP: Mitigating NSFW Content in Vision-Language Models While Preserving Pre-Trained Knowledge [51.634837361795434]
SaFeR-CLIP reconciles safety and performance, recovering up to 8.0% in zero-shot accuracy over prior methods. We also contribute NSFW-Caps, a new benchmark of 1,000 highly-aligned pairs for testing safety under distributional shift.
arXiv Detail & Related papers (2025-11-20T19:00:15Z)
- SafeVision: Efficient Image Guardrail with Robust Policy Adherence and Explainability [49.074914896839466]
We introduce SafeVision, a novel image guardrail that integrates human-like reasoning to enhance adaptability and transparency. Our approach incorporates an effective data collection and generation framework, a policy-following training pipeline, and a customized loss function. We show that SafeVision achieves state-of-the-art performance on different benchmarks.
arXiv Detail & Related papers (2025-10-28T00:35:59Z)
- SafeCtrl: Region-Based Safety Control for Text-to-Image Diffusion via Detect-Then-Suppress [48.20360860166279]
We introduce SafeCtrl, a lightweight, non-intrusive plugin that first precisely localizes unsafe content. Instead of performing a hard A-to-B substitution, SafeCtrl then suppresses the harmful semantics, allowing the generative process to naturally and coherently resolve into a safe, context-aware alternative.
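The "suppress rather than substitute" idea can be illustrated by removing the component of the conditioning embedding that points along an unsafe concept direction, leaving the remaining semantics intact. This orthogonal-projection rule is an illustrative assumption, not SafeCtrl's actual mechanism.

```python
# Soft suppression sketch: project out the unsafe direction instead of
# hard-swapping concept A for concept B.
import torch

def suppress_concept(cond_emb: torch.Tensor, unsafe_dir: torch.Tensor,
                     strength: float = 1.0) -> torch.Tensor:
    """Attenuate the component of the conditioning embedding along the
    unsafe direction so the generative process can resolve the region
    into a context-aware safe alternative."""
    u = unsafe_dir / unsafe_dir.norm()
    proj = (cond_emb @ u).unsqueeze(-1) * u   # component along unsafe axis
    return cond_emb - strength * proj
```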
arXiv Detail & Related papers (2025-08-16T04:28:52Z)
- Safety Without Semantic Disruptions: Editing-free Safe Image Generation via Context-preserving Dual Latent Reconstruction [88.18235230849554]
Training multimodal generative models on large, uncurated datasets can expose users to harmful, unsafe, controversial, or culturally inappropriate outputs. We leverage safe embeddings and a modified diffusion process with weighted tunable summation in the latent space to generate safer images. We identify trade-offs between safety and censorship, which presents a necessary perspective in the development of ethical AI models.
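The "weighted tunable summation in the latent space" reads as a convex blend between the prompt's latent and a safe reference latent; the sketch below encodes that reading, with the weight exposing the safety-vs-censorship trade-off the authors highlight. The names and the exact rule are assumptions.

```python
# Assumed reading of weighted tunable latent summation (not the paper's code).
import torch

def blend_latents(orig_latent: torch.Tensor,
                  safe_latent: torch.Tensor,
                  w: float = 0.7) -> torch.Tensor:
    """Convex combination in latent space: w = 1 reconstructs the fully safe
    latent, w = 0 leaves the original untouched, and intermediate values
    trade safety against over-censorship."""
    return w * safe_latent + (1.0 - w) * orig_latent
```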
arXiv Detail & Related papers (2024-11-21T09:47:13Z)
- Cross-Consistent Deep Unfolding Network for Adaptive All-In-One Video Restoration [78.14941737723501]
We propose a Cross-consistent Deep Unfolding Network (CDUN) for all-in-one video restoration (VR).
By orchestrating two cascading procedures, CDUN achieves adaptive processing for diverse degradations.
In addition, we introduce a window-based inter-frame fusion strategy to utilize information from more adjacent frames.
arXiv Detail & Related papers (2023-09-04T14:18:00Z)