Combining Contrastive and Supervised Learning for Video Super-Resolution Detection
- URL: http://arxiv.org/abs/2205.10406v1
- Date: Fri, 20 May 2022 18:58:13 GMT
- Title: Combining Contrastive and Supervised Learning for Video Super-Resolution Detection
- Authors: Viacheslav Meshchaninov, Ivan Molodetskikh, Dmitriy Vatolin
- Abstract summary: We propose a new upscaled-resolution-detection method based on learning of visual representations using contrastive and cross-entropy losses.
Our method effectively detects upscaling even in compressed videos and outperforms the state-of-the-art alternatives.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Upscaled video detection is a helpful tool in multimedia forensics, but it is
a challenging task that involves various upscaling and compression algorithms.
There are many resolution-enhancement methods, including interpolation and
deep-learning-based super-resolution, and they leave unique traces. In this
work, we propose a new upscaled-resolution-detection method that learns visual
representations using contrastive and cross-entropy losses. To explain how the
method detects upscaled videos, we systematically review the major components
of our framework; in particular, we show that most data-augmentation
approaches hinder the method's learning. Through
extensive experiments on various datasets, we demonstrate that our method
effectively detects upscaling even in compressed videos and outperforms the
state-of-the-art alternatives. The code and models are publicly available at
https://github.com/msu-video-group/SRDM
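The core idea of the abstract, training a representation with a contrastive loss alongside a supervised cross-entropy loss, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation (see their repository for that); the function names, the InfoNCE-style contrastive term, and the `weight` parameter are illustrative assumptions about how the two losses might be combined.

```python
import math

def softmax(xs):
    # numerically stable softmax over a list of logits
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(logits, label):
    # negative log-probability of the true class
    return -math.log(softmax(logits)[label])

def cosine(u, v):
    # cosine similarity between two embedding vectors
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(anchor, positive, negatives, tau=0.1):
    # InfoNCE contrastive loss: the positive pair should be more
    # similar than every anchor-negative pair; tau is the temperature
    sims = [cosine(anchor, positive)] + [cosine(anchor, n) for n in negatives]
    logits = [s / tau for s in sims]
    return cross_entropy(logits, 0)  # positive sits at index 0

def combined_loss(cls_logits, cls_label, anchor, positive, negatives, weight=1.0):
    # supervised cross-entropy (e.g. upscaled vs. native resolution)
    # plus a weighted contrastive representation-learning term
    return cross_entropy(cls_logits, cls_label) + weight * info_nce(anchor, positive, negatives)
```

In practice both terms would operate on batched network outputs; the scalar version above only shows how the supervised and contrastive objectives add up into one training signal.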
Related papers
- AVTENet: Audio-Visual Transformer-based Ensemble Network Exploiting Multiple Experts for Video Deepfake Detection
The recent proliferation of hyper-realistic deepfake videos has drawn attention to the threat of audio and visual forgeries.
Most previous work on detecting AI-generated fake videos utilizes only the visual or the audio modality.
We propose an Audio-Visual Transformer-based Ensemble Network (AVTENet) framework that considers both acoustic manipulation and visual manipulation.
arXiv Detail & Related papers (2023-10-19T19:01:26Z)
- UVL2: A Unified Framework for Video Tampering Localization
Malicious video tampering can lead to public misunderstanding, property losses, and legal disputes.
This paper proposes an effective video tampering localization network that significantly improves the detection performance of video inpainting and splicing.
arXiv Detail & Related papers (2023-09-28T03:13:09Z)
- Video Segmentation Learning Using Cascade Residual Convolutional Neural Network
We propose a novel deep learning video segmentation approach that incorporates residual information into the foreground detection learning process.
Experiments conducted on the Change Detection 2014 dataset and on the private PetrobrasROUTES dataset from Petrobras support the effectiveness of the proposed approach.
arXiv Detail & Related papers (2022-12-20T16:56:54Z)
- Weakly Supervised Two-Stage Training Scheme for Deep Video Fight Detection Model
Fight detection in videos is an emerging deep learning application with today's prevalence of surveillance systems and streaming media.
Previous work has largely relied on action recognition techniques to tackle this problem.
We design the fight detection model as a composition of an action-aware feature extractor and an anomaly score generator.
arXiv Detail & Related papers (2022-09-23T08:29:16Z)
- Deep Video Prior for Video Consistency and Propagation
We present a novel and general approach for blind video temporal consistency.
Our method is only trained on a pair of original and processed videos directly instead of a large dataset.
We show that temporal consistency can be achieved by training a convolutional neural network on a video with Deep Video Prior.
arXiv Detail & Related papers (2022-01-27T16:38:52Z)
- Video Salient Object Detection via Contrastive Features and Attention Modules
We propose a network with attention modules to learn contrastive features for video salient object detection.
A co-attention formulation is utilized to combine the low-level and high-level features.
We show that the proposed method requires less computation, and performs favorably against the state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-03T17:40:32Z)
- Ada-VSR: Adaptive Video Super-Resolution with Meta-Learning
Ada-VSR uses external as well as internal information, through meta-transfer learning and internal learning, respectively.
Model trained using our approach can quickly adapt to a specific video condition with only a few gradient updates, which reduces the inference time significantly.
arXiv Detail & Related papers (2021-08-05T19:59:26Z)
- Video Super-Resolution with Long-Term Self-Exemplars
We propose a video super-resolution method with long-term cross-scale aggregation.
Our model also includes a multi-reference alignment module to fuse features derived from similar patches.
To evaluate our proposed method, we conduct extensive experiments on our collected CarCam dataset and the Open dataset.
arXiv Detail & Related papers (2021-06-24T06:07:13Z)
- Few-Shot Learning for Video Object Detection in a Transfer-Learning Scheme
We study the new problem of few-shot learning for video object detection.
We employ a transfer-learning framework to effectively train the video object detector on a large number of base-class objects and a few video clips of novel-class objects.
arXiv Detail & Related papers (2021-03-26T20:37:55Z)
- Semi-Supervised Action Recognition with Temporal Contrastive Learning
We learn a two-pathway temporal contrastive model using unlabeled videos at two different speeds.
We considerably outperform video extensions of sophisticated state-of-the-art semi-supervised image recognition methods.
arXiv Detail & Related papers (2021-02-04T17:28:35Z)
- Video Anomaly Detection Using Pre-Trained Deep Convolutional Neural Nets and Context Mining
We show how to use pre-trained convolutional neural net models to perform feature extraction and context mining.
We derive contextual properties from the high-level features to further improve the performance of our video anomaly detection method.
arXiv Detail & Related papers (2020-10-06T00:26:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.