Reduced Spatial Dependency for More General Video-level Deepfake Detection
- URL: http://arxiv.org/abs/2503.03270v1
- Date: Wed, 05 Mar 2025 08:51:55 GMT
- Title: Reduced Spatial Dependency for More General Video-level Deepfake Detection
- Authors: Beilin Chu, Xuan Xu, Yufei Zhang, Weike You, Linna Zhou
- Abstract summary: We propose a novel method called Spatial Dependency Reduction (SDR), which integrates common temporal consistency features from multiple spatially-perturbed clusters. Extensive benchmarks and ablation studies demonstrate the effectiveness and rationale of our approach.
- Score: 9.51656628987442
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As a prominent form of AI-generated content, Deepfake has raised significant safety concerns. Although it has been demonstrated that temporal consistency cues offer better generalization capability, existing methods based on CNNs inevitably introduce spatial bias, which hinders the extraction of intrinsic temporal features. To address this issue, we propose a novel method called Spatial Dependency Reduction (SDR), which integrates common temporal consistency features from multiple spatially-perturbed clusters to reduce the model's dependency on spatial information. Specifically, we design multiple Spatial Perturbation Branches (SPBs) to construct spatially-perturbed feature clusters. Subsequently, we draw on the theory of mutual information and propose a Task-Relevant Feature Integration (TRFI) module to capture temporal features residing in a similar latent space across these clusters. Finally, the integrated feature is fed into a temporal transformer to capture long-range dependencies. Extensive benchmarks and ablation studies demonstrate the effectiveness and rationale of our approach.
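The three-stage pipeline the abstract describes (spatially-perturbed branches → feature integration → temporal modeling) can be sketched as a toy NumPy example. This is a minimal illustration only, not the paper's implementation: the patch-masking perturbation, the averaging used as a stand-in for TRFI, and the frame-to-frame variance used as a stand-in for the temporal transformer are all hypothetical simplifications, and every function name here is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def spatial_perturbation_branch(clip, rng):
    """One hypothetical SPB: mask a random spatial patch in every frame
    (a stand-in for the paper's spatial perturbations), then pool each
    frame to a scalar, yielding a per-frame temporal feature sequence."""
    T, H, W = clip.shape
    y = rng.integers(0, H - 4)
    x = rng.integers(0, W - 4)
    perturbed = clip.copy()
    perturbed[:, y:y + 4, x:x + 4] = 0.0  # zero out a 4x4 patch
    return perturbed.reshape(T, -1).mean(axis=1)  # shape (T,)

def integrate_features(cluster_feats):
    """Stand-in for TRFI: average the spatially-perturbed clusters,
    keeping the temporal structure they have in common."""
    return np.mean(cluster_feats, axis=0)

def temporal_consistency_score(feat_seq):
    """Toy temporal cue: variance of frame-to-frame differences.
    In the actual method, a temporal transformer models these
    long-range dependencies instead."""
    return float(np.var(np.diff(feat_seq)))

clip = rng.random((8, 16, 16))  # (frames, height, width)
clusters = [spatial_perturbation_branch(clip, rng) for _ in range(4)]
integrated = integrate_features(clusters)
score = temporal_consistency_score(integrated)
```

Because each branch masks a different patch, the clusters differ spatially while sharing the clip's temporal dynamics; averaging them suppresses the spatial perturbations and retains the common temporal signal, which is the intuition behind reducing spatial dependency.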
Related papers
- Cross Space and Time: A Spatio-Temporal Unitized Model for Traffic Flow Forecasting [16.782154479264126]
Predicting spatio-temporal traffic flow presents challenges due to complex interactions between spatial and temporal factors.
Existing approaches address these dimensions in isolation, neglecting their critical interdependencies.
In this paper, we introduce the Spatio-Temporal Unitized Cell (ASTUC), a unified framework designed to capture both spatial and temporal dependencies.
arXiv Detail & Related papers (2024-11-14T07:34:31Z) - Surgformer: Surgical Transformer with Hierarchical Temporal Attention for Surgical Phase Recognition [7.682613953680041]
We propose the Surgical Transformer (Surgformer) to address the issues of spatial-temporal modeling and redundancy in an end-to-end manner.
We show that our proposed Surgformer performs favorably against the state-of-the-art methods.
arXiv Detail & Related papers (2024-08-07T16:16:31Z) - SFANet: Spatial-Frequency Attention Network for Weather Forecasting [54.470205739015434]
Weather forecasting plays a critical role in various sectors, driving decision-making and risk management.
Traditional methods often struggle to capture the complex dynamics of meteorological systems.
We propose a novel framework designed to address these challenges and enhance the accuracy of weather prediction.
arXiv Detail & Related papers (2024-05-29T08:00:15Z) - Spatio-Temporal Branching for Motion Prediction using Motion Increments [55.68088298632865]
Human motion prediction (HMP) has emerged as a popular research topic due to its diverse applications.
Traditional methods rely on hand-crafted features and machine learning techniques.
We propose a novel spatio-temporal branching network using incremental information for HMP.
arXiv Detail & Related papers (2023-08-02T12:04:28Z) - Spatio-temporal Diffusion Point Processes [23.74522530140201]
A spatio-temporal point process (STPP) is a collection of events accompanied with time and space.
The failure to model the joint distribution leads to limited capacities in characterizing the spatio-temporal interactions given past events.
We propose a novel parameterization framework, which learns complex spatial-temporal joint distributions.
Our framework outperforms the state-of-the-art baselines remarkably, with an average improvement over 50%.
arXiv Detail & Related papers (2023-05-21T08:53:00Z) - Local-Global Temporal Difference Learning for Satellite Video Super-Resolution [53.03380679343968]
We propose to exploit the well-defined temporal difference for efficient and effective temporal compensation.
To fully utilize the local and global temporal information within frames, we systematically modeled the short-term and long-term temporal discrepancies.
Rigorous objective and subjective evaluations conducted across five mainstream video satellites demonstrate that our method performs favorably against state-of-the-art approaches.
arXiv Detail & Related papers (2023-04-10T07:04:40Z) - Spatial-Temporal Graph Convolutional Gated Recurrent Network for Traffic
Forecasting [3.9761027576939414]
We propose a novel framework for traffic forecasting, named Spatial-Temporal Graph Convolutional Gated Recurrent Network (STGCGRN).
We design an attention module to capture long-term dependency by mining periodic information in traffic data.
Experiments on four datasets demonstrate the superior performance of our model.
arXiv Detail & Related papers (2022-10-06T08:02:20Z) - LSTA-Net: Long short-term Spatio-Temporal Aggregation Network for
Skeleton-based Action Recognition [14.078419675904446]
LSTA-Net: a novel long short-term spatio-temporal aggregation network.
Long/short-term temporal information is not well explored in existing works.
Experiments were conducted on three public benchmark datasets.
arXiv Detail & Related papers (2021-11-01T10:53:35Z) - Supporting Optimal Phase Space Reconstructions Using Neural Network
Architecture for Time Series Modeling [68.8204255655161]
We propose an artificial neural network with a mechanism to implicitly learn the properties of the phase space.
Our approach is either as competitive as or better than most state-of-the-art strategies.
arXiv Detail & Related papers (2020-06-19T21:04:47Z) - Co-Saliency Spatio-Temporal Interaction Network for Person
Re-Identification in Videos [85.6430597108455]
We propose a novel Co-Saliency Spatio-Temporal Interaction Network (CSTNet) for person re-identification in videos.
It captures the common salient foreground regions among video frames and explores the spatial-temporal long-range context interdependency from such regions.
Multiple spatial-temporal interaction modules within CSTNet are proposed, which exploit the spatial and temporal long-range context interdependencies on such features and spatial-temporal information correlation.
arXiv Detail & Related papers (2020-04-10T10:23:58Z) - A Spatial-Temporal Attentive Network with Spatial Continuity for
Trajectory Prediction [74.00750936752418]
We propose a novel model named spatial-temporal attentive network with spatial continuity (STAN-SC).
First, spatial-temporal attention mechanism is presented to explore the most useful and important information.
Second, we conduct a joint feature sequence based on the sequence and instant state information to make the generative trajectories keep spatial continuity.
arXiv Detail & Related papers (2020-03-13T04:35:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.