A Dual-level Detection Method for Video Copy Detection
- URL: http://arxiv.org/abs/2305.12361v1
- Date: Sun, 21 May 2023 06:19:08 GMT
- Title: A Dual-level Detection Method for Video Copy Detection
- Authors: Tianyi Wang, Feipeng Ma, Zhenhua Liu, Fengyun Rao
- Abstract summary: Meta AI held the Video Similarity Challenge at CVPR 2023 to push the technology forward.
We propose a dual-level detection method with Video Editing Detection (VED) and Frame Scenes Detection (FSD) to tackle the core challenges of Video Copy Detection.
- Score: 13.517933749704866
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the development of multimedia technology, Video Copy Detection has
become a crucial problem for social media platforms. Meta AI held the Video
Similarity Challenge at CVPR 2023 to push the technology forward. In this paper,
we share our winning solutions on both tracks to help progress in this area. For
the Descriptor Track, we propose a dual-level detection method with Video Editing
Detection (VED) and Frame Scenes Detection (FSD) to tackle the core challenges
of Video Copy Detection. Experimental results demonstrate the effectiveness and
efficiency of our proposed method. Code is available at
https://github.com/FeipengMa6/VSC22-Submission.
Related papers
- EDVD-LLaMA: Explainable Deepfake Video Detection via Multimodal Large Language Model Reasoning [58.42596067220998]
Deepfake video technology has not only facilitated artistic creation but also made it easier to spread misinformation. Traditional deepfake video detection methods face issues such as a lack of transparency in their principles and insufficient capability to cope with forgery techniques. This paper proposes the explainable deepfake video detection (EDVD) task and designs the EDVD-LLaMA multimodal reasoning framework.
arXiv Detail & Related papers (2025-10-18T10:34:05Z) - Counteracting temporal attacks in Video Copy Detection [1.0742675209112622]
The META AI Challenge on video copy detection provided a benchmark for evaluating state-of-the-art methods.
Our analysis reveals significant limitations in the VED component, particularly in its ability to handle exact copies.
We propose an improved frame selection strategy based on local maxima of interframe differences.
arXiv Detail & Related papers (2025-01-19T21:16:39Z) - COVE: Unleashing the Diffusion Feature Correspondence for Consistent Video Editing [57.76170824395532]
Video editing is an emerging task, in which most current methods adopt the pre-trained text-to-image (T2I) diffusion model to edit the source video.
We propose COrrespondence-guided Video Editing (COVE) to achieve high-quality and consistent video editing.
COVE can be seamlessly integrated into the pre-trained T2I diffusion model without the need for extra training or optimization.
arXiv Detail & Related papers (2024-06-13T06:27:13Z) - AVTENet: Audio-Visual Transformer-based Ensemble Network Exploiting
Multiple Experts for Video Deepfake Detection [53.448283629898214]
The recent proliferation of hyper-realistic deepfake videos has drawn attention to the threat of audio and visual forgeries.
Most previous work on detecting AI-generated fake videos utilizes only the visual modality or only the audio modality.
We propose an Audio-Visual Transformer-based Ensemble Network (AVTENet) framework that considers both acoustic manipulation and visual manipulation.
arXiv Detail & Related papers (2023-10-19T19:01:26Z) - UVL2: A Unified Framework for Video Tampering Localization [0.0]
Malicious video tampering can lead to public misunderstanding, property losses, and legal disputes.
This paper proposes an effective video tampering localization network that significantly improves the detection performance of video inpainting and splicing.
arXiv Detail & Related papers (2023-09-28T03:13:09Z) - Causal Video Summarizer for Video Exploration [74.27487067877047]
Causal Video Summarizer (CVS) is proposed to capture the interactive information between the video and query.
Based on the evaluation of the existing multi-modal video summarization dataset, experimental results show that the proposed approach is effective.
arXiv Detail & Related papers (2023-07-04T22:52:16Z) - 3rd Place Solution to Meta AI Video Similarity Challenge [1.1470070927586016]
This paper presents our 3rd place solution in the Meta AI Video Similarity Challenge (VSC2022)
Our approach builds upon existing image copy detection techniques and incorporates several strategies to exploit the properties of video data.
arXiv Detail & Related papers (2023-04-24T10:00:09Z) - Feature-compatible Progressive Learning for Video Copy Detection [30.358206867280426]
Video Copy Detection (VCD) has been developed to identify instances of unauthorized or duplicated video content.
This paper presents our second place solutions to the Meta AI Video Similarity Challenge (VSC22), CVPR 2023.
arXiv Detail & Related papers (2023-04-20T13:39:47Z) - Weakly Supervised Two-Stage Training Scheme for Deep Video Fight
Detection Model [0.0]
Fight detection in videos is an emerging deep learning application with today's prevalence of surveillance systems and streaming media.
Previous work has largely relied on action recognition techniques to tackle this problem.
We design the fight detection model as a composition of an action-aware feature extractor and an anomaly score generator.
arXiv Detail & Related papers (2022-09-23T08:29:16Z) - Combining Contrastive and Supervised Learning for Video Super-Resolution
Detection [0.0]
We propose a new upscaled-resolution-detection method based on learning of visual representations using contrastive and cross-entropy losses.
Our method effectively detects upscaling even in compressed videos and outperforms the state-of-the-art alternatives.
arXiv Detail & Related papers (2022-05-20T18:58:13Z) - Cross-category Video Highlight Detection via Set-based Learning [55.49267044910344]
We propose a Dual-Learner-based Video Highlight Detection (DL-VHD) framework.
It learns to distinguish target-category videos and to capture the characteristics of highlight moments from the source video category.
It outperforms five typical Unsupervised Domain Adaptation (UDA) algorithms on various cross-category highlight detection tasks.
arXiv Detail & Related papers (2021-08-26T13:06:47Z) - Efficient video integrity analysis through container characterization [77.45740041478743]
We introduce a container-based method to identify the software used to perform a video manipulation.
The proposed method is both efficient and effective and can also provide a simple explanation for its decisions.
It achieves an accuracy of 97.6% in distinguishing pristine from tampered videos and classifying the editing software.
arXiv Detail & Related papers (2021-01-26T14:13:39Z) - Single Shot Video Object Detector [215.06904478667337]
Single Shot Video Object Detector (SSVD) is a new architecture that novelly integrates feature aggregation into a one-stage detector for object detection in videos.
For $448 \times 448$ input, SSVD achieves 79.2% mAP on the ImageNet VID dataset.
arXiv Detail & Related papers (2020-07-07T15:36:26Z) - Emotions Don't Lie: An Audio-Visual Deepfake Detection Method Using
Affective Cues [75.1731999380562]
We present a learning-based method for detecting real and fake deepfake multimedia content.
We extract and analyze the similarity between the two audio and visual modalities from within the same video.
We compare our approach with several SOTA deepfake detection methods and report per-video AUC of 84.4% on the DFDC dataset and 96.6% on the DF-TIMIT dataset.
arXiv Detail & Related papers (2020-03-14T22:07:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information (including all listed content) and is not responsible for any consequences.