FOCAL: A Forgery Localization Framework based on Video Coding
Self-Consistency
- URL: http://arxiv.org/abs/2008.10454v2
- Date: Fri, 4 Sep 2020 07:55:11 GMT
- Title: FOCAL: A Forgery Localization Framework based on Video Coding
Self-Consistency
- Authors: Sebastiano Verde, Paolo Bestagini, Simone Milani, Giancarlo Calvagno
and Stefano Tubaro
- Abstract summary: This paper presents a video forgery localization framework that verifies the self-consistency of coding traces between and within video frames.
The overall framework was validated in two typical forgery scenarios: temporal and spatial splicing.
Experimental results show an improvement over the state of the art on temporal splicing localization, as well as promising performance in the newly tackled case of spatial splicing.
- Score: 26.834506269499094
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Forgery operations on video contents are nowadays within the reach of anyone,
thanks to the availability of powerful and user-friendly editing software.
Integrity verification and authentication of videos represent a major interest
in both journalism (e.g., fake news debunking) and legal environments dealing
with digital evidence (e.g., a court of law). While several strategies and
different forensic traces have been proposed in recent years, the latest
solutions aim to increase accuracy by combining multiple detectors and features.
This paper presents a video forgery localization framework that verifies the
self-consistency of coding traces between and within video frames, by fusing
the information derived from a set of independent feature descriptors. The
feature extraction step is carried out by means of an explainable convolutional
neural network architecture, specifically designed to look for and classify
coding artifacts. The overall framework was validated in two typical forgery
scenarios: temporal and spatial splicing. Experimental results show an
improvement over the state of the art on temporal splicing localization, as
well as promising performance in the newly tackled case of spatial splicing,
on both synthetic and real-world videos.
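The self-consistency idea can be illustrated with a minimal sketch: extract a coding-trace descriptor per frame with a small CNN and flag frames whose descriptors disagree with a reference segment. This is not the authors' implementation; the CodingTraceCNN architecture, the feature dimension, and the cosine-based inconsistency score are assumptions introduced here purely for illustration.

# Minimal sketch of coding self-consistency for temporal splicing localization
# (illustrative only; not the FOCAL architecture).
import torch
import torch.nn as nn
import torch.nn.functional as F

class CodingTraceCNN(nn.Module):
    """Toy per-frame descriptor of coding artifacts (hypothetical architecture)."""
    def __init__(self, feat_dim: int = 64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.proj = nn.Linear(32, feat_dim)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (N, 1, H, W) luminance patches -> (N, feat_dim) unit-norm descriptors
        x = self.features(frames).flatten(1)
        return F.normalize(self.proj(x), dim=1)

def temporal_splice_scores(frames: torch.Tensor, model: nn.Module,
                           ref_len: int = 8) -> torch.Tensor:
    """Score each frame by its inconsistency with the first `ref_len` frames.
    Higher score = coding traces deviate more from the reference segment."""
    with torch.no_grad():
        feats = model(frames)                            # (N, D)
        ref = feats[:ref_len].mean(dim=0, keepdim=True)  # reference coding signature
        sim = F.cosine_similarity(feats, ref)            # (N,)
    return 1.0 - sim                                     # per-frame inconsistency

if __name__ == "__main__":
    model = CodingTraceCNN()
    clean = torch.rand(16, 1, 64, 64)        # frames sharing one (synthetic) encoding
    spliced = torch.rand(8, 1, 64, 64) * 0.3 # frames with different statistics
    video = torch.cat([clean, spliced], dim=0)
    # With a descriptor trained on coding artifacts, the spliced frames
    # would receive noticeably higher inconsistency scores.
    print(temporal_splice_scores(video, model))

The sketch uses a single reference segment for simplicity; the paper instead fuses consistency information from several independent feature descriptors, both between and within frames.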
Related papers
- TC-Bench: Benchmarking Temporal Compositionality in Text-to-Video and Image-to-Video Generation [97.96178992465511]
We argue that generated videos should incorporate the emergence of new concepts and their relation transitions over time, as in real-world videos.
To assess the Temporal Compositionality of video generation models, we propose TC-Bench, a benchmark of meticulously crafted text prompts, corresponding ground truth videos, and robust evaluation metrics.
arXiv Detail & Related papers (2024-06-12T21:41:32Z) - UVL2: A Unified Framework for Video Tampering Localization [0.0]
Malicious video tampering can lead to public misunderstanding, property losses, and legal disputes.
This paper proposes an effective video tampering localization network that significantly improves detection performance on video inpainting and splicing.
arXiv Detail & Related papers (2023-09-28T03:13:09Z) - UMMAFormer: A Universal Multimodal-adaptive Transformer Framework for
Temporal Forgery Localization [16.963092523737593]
We propose a novel framework for temporal forgery localization (TFL) that predicts forgery segments with multimodal adaptation.
Our approach achieves state-of-the-art performance on benchmark datasets, including Lav-DF, TVIL, and Psynd.
arXiv Detail & Related papers (2023-08-28T08:20:30Z) - Transform-Equivariant Consistency Learning for Temporal Sentence
Grounding [66.10949751429781]
We introduce a novel Equivariant Consistency Regulation Learning framework to learn more discriminative representations for each video.
Our motivation is that the temporal boundary of the query-guided activity should be consistently predicted.
In particular, we devise a self-supervised consistency loss module to enhance the completeness and smoothness of the augmented video.
arXiv Detail & Related papers (2023-05-06T19:29:28Z) - Video-SwinUNet: Spatio-temporal Deep Learning Framework for VFSS
Instance Segmentation [10.789826145990016]
This paper presents a deep learning framework for medical video segmentation.
Our framework explicitly extracts features from neighbouring frames across the temporal dimension.
It incorporates them with a temporal feature blender, which then tokenises the high-level temporal feature to form a strong global feature encoded via a Swin Transformer.
arXiv Detail & Related papers (2023-02-22T12:09:39Z) - NSNet: Non-saliency Suppression Sampler for Efficient Video Recognition [89.84188594758588]
A novel Non-saliency Suppression Network (NSNet) is proposed to suppress the responses of non-salient frames.
NSNet achieves the state-of-the-art accuracy-efficiency trade-off and presents a significantly faster (2.4-4.3x) practical inference speed than state-of-the-art methods.
arXiv Detail & Related papers (2022-07-21T09:41:22Z) - Joint Inductive and Transductive Learning for Video Object Segmentation [107.32760625159301]
Semi-supervised object segmentation is a task of segmenting the target object in a video sequence given only a mask in the first frame.
Most previous best-performing methods adopt matching-based transductive reasoning or online inductive learning.
We propose to integrate transductive and inductive learning into a unified framework to exploit the complementarity between them for accurate and robust video object segmentation.
arXiv Detail & Related papers (2021-08-08T16:25:48Z) - CCVS: Context-aware Controllable Video Synthesis [95.22008742695772]
This work introduces a self-supervised learning approach to the synthesis of new video clips from old ones.
It conditions the synthesis process on contextual information for temporal continuity and ancillary information for fine control.
arXiv Detail & Related papers (2021-07-16T17:57:44Z) - Coherent Loss: A Generic Framework for Stable Video Segmentation [103.78087255807482]
We investigate how a jittering artifact degrades the visual quality of video segmentation results.
We propose a Coherent Loss with a generic framework to enhance the performance of a neural network against jittering artifacts.
arXiv Detail & Related papers (2020-10-25T10:48:28Z) - Multiple Instance-Based Video Anomaly Detection using Deep Temporal
Encoding-Decoding [5.255783459833821]
We propose a weakly supervised deep temporal encoding-decoding solution for anomaly detection in surveillance videos.
The proposed approach uses both abnormal and normal video clips during the training phase.
The results show that the proposed method performs similar to or better than the state-of-the-art solutions for anomaly detection in video surveillance applications.
arXiv Detail & Related papers (2020-07-03T08:22:42Z) - Near-duplicate video detection featuring coupled temporal and perceptual
visual structures and logical inference based matching [0.0]
We propose an architecture for near-duplicate video detection based on index and query signature structures that integrate temporal and perceptual visual features.
For matching, we propose to instantiate a retrieval model based on logical inference through the coupling of an N-gram sliding window process and theoretically-sound lattice-based structures.
arXiv Detail & Related papers (2020-05-15T04:45:52Z)