Beyond Detection: Multi-Scale Hidden-Code for Natural Image Deepfake Recovery and Factual Retrieval
- URL: http://arxiv.org/abs/2602.22759v1
- Date: Thu, 26 Feb 2026 08:47:48 GMT
- Title: Beyond Detection: Multi-Scale Hidden-Code for Natural Image Deepfake Recovery and Factual Retrieval
- Authors: Yuan-Chih Chen, Chun-Shien Lu
- Abstract summary: We propose a unified hidden-code recovery framework that enables both retrieval and restoration from post-hoc and in-generation watermarking paradigms. Our method encodes semantic and perceptual information into a compact hidden-code representation, refined through multi-scale vector quantization, and enhances contextual reasoning via conditional Transformer modules.
- Score: 10.94034043296029
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in image authenticity have primarily focused on deepfake detection and localization, leaving recovery of tampered contents for factual retrieval relatively underexplored. We propose a unified hidden-code recovery framework that enables both retrieval and restoration from post-hoc and in-generation watermarking paradigms. Our method encodes semantic and perceptual information into a compact hidden-code representation, refined through multi-scale vector quantization, and enhances contextual reasoning via conditional Transformer modules. To enable systematic evaluation for natural images, we construct ImageNet-S, a benchmark that provides paired image-label factual retrieval tasks. Extensive experiments on ImageNet-S demonstrate that our method exhibits promising retrieval and reconstruction performance while remaining fully compatible with diverse watermarking pipelines. This framework establishes a foundation for general-purpose image recovery beyond detection and localization.
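The abstract mentions refining the hidden-code through multi-scale vector quantization but gives no details. The following is only a generic, minimal sketch of coarse-to-fine residual quantization, not the authors' implementation. It assumes a 1-D signal, a scalar codebook shared across scales, average-pool downsampling, and nearest-neighbour upsampling; the names `multiscale_vq`, `downsample`, `upsample`, and `nearest` are hypothetical.

```python
from typing import List, Tuple


def downsample(x: List[float], size: int) -> List[float]:
    """Average-pool x into `size` equal blocks (len(x) must be divisible by size)."""
    block = len(x) // size
    return [sum(x[i * block:(i + 1) * block]) / block for i in range(size)]


def upsample(x: List[float], length: int) -> List[float]:
    """Nearest-neighbour upsample x to `length` entries (length divisible by len(x))."""
    block = length // len(x)
    return [v for v in x for _ in range(block)]


def nearest(codebook: List[float], v: float) -> int:
    """Index of the codebook entry closest to v (first entry wins on ties)."""
    return min(range(len(codebook)), key=lambda k: abs(codebook[k] - v))


def multiscale_vq(
    x: List[float], codebook: List[float], scales: List[int]
) -> Tuple[List[List[int]], List[float]]:
    """Quantize x coarse-to-fine; each scale encodes the residual left by earlier scales.

    Assumption: a scalar codebook shared by all scales. Real systems typically
    quantize feature vectors with a learned codebook.
    """
    residual = list(x)
    recon = [0.0] * len(x)
    codes: List[List[int]] = []
    for size in scales:
        pooled = downsample(residual, size)              # view the residual at this scale
        idx = [nearest(codebook, v) for v in pooled]     # discrete code indices
        approx = upsample([codebook[k] for k in idx], len(x))
        residual = [r - a for r, a in zip(residual, approx)]
        recon = [c + a for c, a in zip(recon, approx)]
        codes.append(idx)
    return codes, recon


if __name__ == "__main__":
    codes, recon = multiscale_vq([1.0, 1.0, 3.0, 3.0], [-1.0, 0.0, 1.0, 2.0], [1, 2, 4])
    print(codes)  # one list of code indices per scale
    print(recon)  # sum of the per-scale approximations
```

In this toy run the coarsest scale captures the mean of the signal and the second scale captures the remaining step, so the finest scale has nothing left to encode. The per-scale index lists together form the compact discrete code.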
Related papers
- Learning to Restore Multi-Degraded Images via Ingredient Decoupling and Task-Aware Path Adaptation [51.10017611491389]
Real-world images often suffer from multiple coexisting degradations, such as rain, noise, and haze in a single image. We propose an adaptive multi-degradation image restoration network that reconstructs images by leveraging decoupled representations of degradation ingredients. The resulting tightly integrated architecture, termed IMDNet, is extensively validated through experiments.
arXiv Detail & Related papers (2025-11-07T01:50:36Z) - Semantic-Aware Reconstruction Error for Detecting AI-Generated Images [22.83053631078616]
We propose a novel representation, namely Semantic-Aware Reconstruction Error (SARE), that measures the semantic difference between an image and its caption-guided reconstruction. SARE provides a robust and discriminative feature for detecting fake images across diverse generative models. We also introduce a fusion module that integrates SARE into the backbone detector via a cross-attention mechanism.
arXiv Detail & Related papers (2025-08-13T04:37:36Z) - Breaking the Frame: Visual Place Recognition by Overlap Prediction [53.17564423756082]
We propose a novel visual place recognition approach based on overlap prediction, called VOP. VOP processes co-visible image sections by obtaining patch-level embeddings using a Vision Transformer backbone. Our approach uses a voting mechanism to assess overlap scores for potential database images.
arXiv Detail & Related papers (2024-06-23T20:00:20Z) - DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection [55.48770333927732]
We propose a Diffusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection.
It consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network with a connection to the stable diffusion's denoising network, and a feature-space pre-trained feature extractor.
Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-12-11T18:38:28Z) - Prompt-based Ingredient-Oriented All-in-One Image Restoration [0.0]
We propose a novel data ingredient-oriented approach to tackle multiple image degradation tasks.
Specifically, we utilize an encoder to capture features and introduce prompts with degradation-specific information to guide the decoder.
Our method performs competitively with the state of the art.
arXiv Detail & Related papers (2023-09-06T15:05:04Z) - WMFormer++: Nested Transformer for Visible Watermark Removal via Implicit Joint Learning [68.00975867932331]
Existing watermark removal methods mainly rely on UNet with task-specific decoder branches.
We introduce an implicit joint learning paradigm to holistically integrate information from both branches.
The results demonstrate our approach's remarkable superiority, surpassing existing state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2023-08-20T07:56:34Z) - Label-Free Event-based Object Recognition via Joint Learning with Image Reconstruction from Events [42.71383489578851]
We study label-free event-based object recognition where category labels and paired images are not available.
Our method first reconstructs images from events and performs object recognition through Contrastive Language-Image Pre-training (CLIP).
Since the category information is essential in reconstructing images, we propose category-guided attraction loss and category-agnostic repulsion loss.
arXiv Detail & Related papers (2023-08-18T08:28:17Z) - All-in-one Multi-degradation Image Restoration Network via Hierarchical Degradation Representation [47.00239809958627]
We propose a novel All-in-one Multi-degradation Image Restoration Network (AMIRNet).
AMIRNet learns a degradation representation for unknown degraded images by progressively constructing a tree structure through clustering.
This tree-structured representation explicitly reflects the consistency and discrepancy of various distortions, providing a specific clue for image restoration.
arXiv Detail & Related papers (2023-08-06T04:51:41Z) - Exploring Resolution and Degradation Clues as Self-supervised Signal for Low Quality Object Detection [77.3530907443279]
We propose a novel self-supervised framework to detect objects in degraded low-resolution images.
Our method achieves superior performance compared with existing methods under varied degradation conditions.
arXiv Detail & Related papers (2022-08-05T09:36:13Z) - Learning Enriched Features for Fast Image Restoration and Enhancement [166.17296369600774]
This paper presents a holistic goal of maintaining spatially-precise high-resolution representations through the entire network.
We learn an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
Our approach achieves state-of-the-art results for a variety of image processing tasks, including defocus deblurring, image denoising, super-resolution, and image enhancement.
arXiv Detail & Related papers (2022-04-19T17:59:45Z) - RestoreDet: Degradation Equivariant Representation for Object Detection in Low Resolution Images [81.91416537019835]
We propose a novel framework, RestoreDet, to detect objects in degraded low-resolution images.
Our framework, based on CenterNet, achieves superior performance compared with existing methods under varied degradation conditions.
arXiv Detail & Related papers (2022-01-07T03:40:23Z) - Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
Convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration tasks.
We present a novel architecture with the collective goal of maintaining spatially-precise high-resolution representations through the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z)