Morphology-optimized Multi-Scale Fusion: Combining Local Artifacts and Mesoscopic Semantics for Deepfake Detection and Localization
- URL: http://arxiv.org/abs/2509.13776v1
- Date: Wed, 17 Sep 2025 07:46:07 GMT
- Title: Morphology-optimized Multi-Scale Fusion: Combining Local Artifacts and Mesoscopic Semantics for Deepfake Detection and Localization
- Authors: Chao Shuai, Gaojian Wang, Kun Pan, Tong Wu, Fanli Jin, Haohan Tan, Mengxiang Li, Zhenguang Liu, Feng Lin, Kui Ren
- Abstract summary: A common strategy is to incorporate forged region annotations during model training alongside manipulated images. We propose a novel approach that independently predicts manipulated regions using both local and global perspectives.
- Score: 30.871239863769404
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While the pursuit of higher accuracy in deepfake detection remains a central goal, there is an increasing demand for precise localization of manipulated regions. Despite the remarkable progress made in classification-based detection, accurately localizing forged areas remains a significant challenge. A common strategy is to incorporate forged region annotations during model training alongside manipulated images. However, such approaches often neglect the complementary nature of local detail and global semantic context, resulting in suboptimal localization performance. Moreover, an often-overlooked aspect is the fusion strategy between local and global predictions. Naively combining the outputs from both branches can amplify noise and errors, thereby undermining the effectiveness of the localization. To address these issues, we propose a novel approach that independently predicts manipulated regions using both local and global perspectives. We employ morphological operations to fuse the outputs, effectively suppressing noise while enhancing spatial coherence. Extensive experiments reveal the effectiveness of each module in improving the accuracy and robustness of forgery localization.
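The fusion step described in the abstract, combining independently predicted local and global masks with morphological operations, can be sketched in pure NumPy. This is an illustrative sketch only: the union rule, the 3x3 square structuring element, and the opening-then-closing order are assumptions, not the paper's documented operators.

```python
import numpy as np

def dilate(mask: np.ndarray, k: int = 3) -> np.ndarray:
    """Binary dilation with a k x k square structuring element."""
    pad = k // 2
    padded = np.pad(mask, pad)  # outside the image counts as background
    out = np.zeros_like(mask)
    h, w = mask.shape
    for di in range(k):
        for dj in range(k):
            out |= padded[di:di + h, dj:dj + w]
    return out

def erode(mask: np.ndarray, k: int = 3) -> np.ndarray:
    """Binary erosion with a k x k square structuring element."""
    pad = k // 2
    padded = np.pad(mask, pad)
    out = np.ones_like(mask)
    h, w = mask.shape
    for di in range(k):
        for dj in range(k):
            out &= padded[di:di + h, dj:dj + w]
    return out

def morph_fuse(local_mask: np.ndarray, global_mask: np.ndarray, k: int = 3) -> np.ndarray:
    """Fuse two binary forgery masks: union the predictions, then apply
    morphological opening (erode -> dilate) to suppress isolated noise and
    closing (dilate -> erode) to restore spatial coherence."""
    fused = local_mask | global_mask
    opened = dilate(erode(fused, k), k)   # opening removes small speckle
    closed = erode(dilate(opened, k), k)  # closing fills small gaps
    return closed
```

With this sketch, an isolated one-pixel false positive present in only one branch is removed by the opening, while a coherent forged region predicted by either branch survives.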
Related papers
- UAGLNet: Uncertainty-Aggregated Global-Local Fusion Network with Cooperative CNN-Transformer for Building Extraction [83.48950950780554]
Building extraction from remote sensing images is a challenging task due to the complex structural variations of buildings. Existing methods employ convolutional or self-attention blocks to capture multi-scale features in segmentation models. We present an Uncertainty-Aggregated Global-Local Fusion Network (UAGLNet) to exploit high-quality global-local visual semantics.
arXiv Detail & Related papers (2025-12-15T02:59:16Z) - Crane: Context-Guided Prompt Learning and Attention Refinement for Zero-Shot Anomaly Detection [50.343419243749054]
Anomaly detection is critical in fields such as medical diagnostics and industrial defect detection. CLIP's coarse-grained image-text alignment limits localization and detection performance for fine-grained anomalies. Crane improves the state-of-the-art in ZSAD by 2% to 28% at both image and pixel levels, while remaining competitive in inference speed.
arXiv Detail & Related papers (2025-04-15T10:42:25Z) - Dual Frequency Branch Framework with Reconstructed Sliding Windows Attention for AI-Generated Image Detection [12.523297358258345]
Generative Adversarial Networks (GANs) and diffusion models have enabled the creation of highly realistic synthetic images. As a result, detecting AI-generated images has emerged as a critical challenge.
arXiv Detail & Related papers (2025-01-25T15:53:57Z) - Generalize or Detect? Towards Robust Semantic Segmentation Under Multiple Distribution Shifts [56.57141696245328]
In open-world scenarios, where both novel classes and domains may exist, an ideal segmentation model should detect anomaly classes for safety.
Existing methods often struggle to distinguish between domain-level and semantic-level distribution shifts.
arXiv Detail & Related papers (2024-11-06T11:03:02Z) - SEAL: Simultaneous Exploration and Localization in Multi-Robot Systems [0.0]
This paper proposes a novel simultaneous exploration and localization approach.
It uses information fusion for maximum exploration while performing communication graph optimization for relative localization.
SEAL outperformed cutting-edge methods on exploration and localization performance in extensive ROS-Gazebo simulations.
arXiv Detail & Related papers (2023-06-22T01:27:55Z) - ROIFormer: Semantic-Aware Region of Interest Transformer for Efficient Self-Supervised Monocular Depth Estimation [6.923035780685481]
We propose an efficient local adaptive attention method for geometric aware representation enhancement.
We leverage geometric cues from semantic information to learn local adaptive bounding boxes to guide unsupervised feature aggregation.
Our proposed method establishes a new state-of-the-art in self-supervised monocular depth estimation task.
arXiv Detail & Related papers (2022-12-12T06:38:35Z) - Adaptive Local-Component-aware Graph Convolutional Network for One-shot Skeleton-based Action Recognition [54.23513799338309]
We present an Adaptive Local-Component-aware Graph Convolutional Network for skeleton-based action recognition.
Our method provides a stronger representation than the global embedding and helps our model reach state-of-the-art.
arXiv Detail & Related papers (2022-09-21T02:33:07Z) - Delving into Sequential Patches for Deepfake Detection [64.19468088546743]
Recent advances in face forgery techniques produce nearly untraceable deepfake videos, which could be leveraged with malicious intentions.
Previous studies have identified the importance of local low-level cues and temporal information in the pursuit of generalizing well across deepfake methods.
We propose the Local- & Temporal-aware Transformer-based Deepfake Detection framework, which adopts a local-to-global learning protocol.
arXiv Detail & Related papers (2022-07-06T16:46:30Z) - Extending regionalization algorithms to explore spatial process heterogeneity [5.158953116443068]
We propose two new algorithms for spatial regime delineation, two-stage K-Models and Regional-K-Models.
Results indicate that all three algorithms achieve superior or comparable performance to existing approaches.
arXiv Detail & Related papers (2022-06-19T15:09:23Z) - Region-Based Semantic Factorization in GANs [67.90498535507106]
We present a highly efficient algorithm to factorize the latent semantics learned by Generative Adversarial Networks (GANs) concerning an arbitrary image region.
Through an appropriately defined generalized Rayleigh quotient, we solve such a problem without any annotations or training.
Experimental results on various state-of-the-art GAN models demonstrate the effectiveness of our approach.
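A generalized Rayleigh quotient, as mentioned in the blurb above, is in general maximized by solving a generalized eigenvalue problem. The following minimal sketch illustrates that standard technique only; it is not the paper's exact formulation, and the symmetric positive-definite matrices `A` and `B` are hypothetical inputs.

```python
import numpy as np

def top_rayleigh_direction(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """Maximize x^T A x / x^T B x (A symmetric, B symmetric positive-definite)
    via the whitened symmetric eigenproblem:
        B^{-1/2} A B^{-1/2} y = lambda * y,  with x = B^{-1/2} y."""
    w, V = np.linalg.eigh(B)
    B_inv_sqrt = V @ np.diag(1.0 / np.sqrt(w)) @ V.T
    M = B_inv_sqrt @ A @ B_inv_sqrt
    _, evecs = np.linalg.eigh(M)          # eigenvalues in ascending order
    x = B_inv_sqrt @ evecs[:, -1]         # eigenvector of the largest eigenvalue
    return x / np.linalg.norm(x)
```

For `B` equal to the identity this reduces to the ordinary top eigenvector of `A`, which is why no annotations or training are needed: the optimum is obtained in closed form.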
arXiv Detail & Related papers (2022-02-19T17:46:02Z) - Cross-domain Object Detection through Coarse-to-Fine Feature Adaptation [62.29076080124199]
This paper proposes a novel coarse-to-fine feature adaptation approach to cross-domain object detection.
At the coarse-grained stage, foreground regions are extracted by adopting the attention mechanism, and aligned according to their marginal distributions.
At the fine-grained stage, we conduct conditional distribution alignment of foregrounds by minimizing the distance of global prototypes with the same category but from different domains.
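The fine-grained stage described above, minimizing the distance between per-class "global prototypes" from different domains, can be illustrated with a toy sketch. The function and variable names here are hypothetical, and a real implementation would compute this as a differentiable loss over learned features rather than raw NumPy arrays.

```python
import numpy as np

def prototype_alignment_loss(feats_src: np.ndarray, labels_src: np.ndarray,
                             feats_tgt: np.ndarray, labels_tgt: np.ndarray,
                             n_classes: int) -> float:
    """Mean squared distance between per-class mean features ("global
    prototypes") of a source and a target domain; minimizing it aligns the
    conditional feature distributions class by class."""
    loss = 0.0
    for c in range(n_classes):
        proto_src = feats_src[labels_src == c].mean(axis=0)
        proto_tgt = feats_tgt[labels_tgt == c].mean(axis=0)
        loss += float(np.sum((proto_src - proto_tgt) ** 2))
    return loss / n_classes
```

The loss is zero exactly when every class prototype coincides across the two domains.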
arXiv Detail & Related papers (2020-03-23T13:40:06Z) - Deep Fusion of Local and Non-Local Features for Precision Landslide Recognition [17.896249114628336]
This paper proposes an effective approach that fuses local and non-local features to overcome the problem of insufficient contextual information.
Built upon the U-Net architecture that is widely adopted in the remote sensing community, we utilize two additional modules.
Experimental evaluations revealed that the proposed method outperformed state-of-the-art general-purpose semantic segmentation approaches.
arXiv Detail & Related papers (2020-02-20T03:18:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.