Enhanced Few-shot Learning for Intrusion Detection in Railway Video
Surveillance
- URL: http://arxiv.org/abs/2011.04254v1
- Date: Mon, 9 Nov 2020 08:59:15 GMT
- Title: Enhanced Few-shot Learning for Intrusion Detection in Railway Video
Surveillance
- Authors: Xiao Gong, Xi Chen, Wei Chen
- Abstract summary: An enhanced model-agnostic meta-learner is trained using both the original video frames and segmented masks of track area extracted from the video.
Numerical results show that the enhanced meta-learner successfully adapts to an unseen scene with only a few newly collected video frame samples.
- Score: 16.220077781635748
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video surveillance is gaining increasing popularity to assist in railway
intrusion detection in recent years. However, efficient and accurate intrusion
detection remains a challenging issue due to: (a) limited sample number: only
a small sample size (or portion) of intrusive video frames is available; (b) low
inter-scene dissimilarity: various railway track area scenes are captured by
cameras installed in different landforms; (c) high intra-scene similarity: the
video frames captured by an individual camera share the same background. In this
paper, an efficient few-shot learning solution is developed to address the
above issues. In particular, an enhanced model-agnostic meta-learner is trained
using both the original video frames and segmented masks of track area
extracted from the video. Moreover, theoretical analysis and engineering
solutions are provided to cope with the highly similar video frames in the
meta-model training phase. The proposed method is tested on a realistic railway
video dataset. Numerical results show that the enhanced meta-learner
successfully adapts to unseen scenes with only a few newly collected video frame
samples, and its intrusion detection accuracy outperforms that of the standard
randomly initialized supervised learning.
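The adaptation idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a plain logistic model in place of a deep network, uses the first-order approximation of model-agnostic meta-learning, and assumes that frames and track-area masks are combined by simple concatenation. All function names and hyperparameters here are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_grad(theta, x, y):
    """Binary cross-entropy of a logistic model and its gradient w.r.t. theta."""
    p = sigmoid(x @ theta)
    eps = 1e-12  # avoids log(0)
    loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    grad = x.T @ (p - y) / len(y)
    return loss, grad

def stack_frame_and_mask(frames, masks):
    """Assumption: flatten each frame and its track-area mask, then concatenate."""
    n = frames.shape[0]
    return np.concatenate([frames.reshape(n, -1), masks.reshape(n, -1)], axis=1)

def adapt(theta, x_support, y_support, inner_lr=0.1, steps=3):
    """Inner loop: a few gradient steps from the meta-initialization
    using the few labeled frames of one scene."""
    theta = theta.copy()
    for _ in range(steps):
        _, g = loss_and_grad(theta, x_support, y_support)
        theta -= inner_lr * g
    return theta

def fomaml_step(theta, tasks, inner_lr=0.1, outer_lr=0.1, steps=3):
    """Outer loop (first-order approximation): average the query-set
    gradients evaluated at each scene's adapted parameters."""
    meta_grad = np.zeros_like(theta)
    for x_s, y_s, x_q, y_q in tasks:
        adapted = adapt(theta, x_s, y_s, inner_lr, steps)
        _, g = loss_and_grad(adapted, x_q, y_q)
        meta_grad += g
    return theta - outer_lr * meta_grad / len(tasks)
```

In this sketch each camera scene plays the role of one task: `fomaml_step` would be repeated over many training scenes to obtain an initialization, and `adapt` would then be called once on the few labeled frames from a new scene.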
Related papers
- Practical Video Object Detection via Feature Selection and Aggregation [18.15061460125668]
Video object detection (VOD) must cope with high across-frame variation in object appearance and with diverse deterioration in some frames.
Most contemporary aggregation methods are tailored for two-stage detectors and suffer from high computational costs.
This study invents a very simple yet potent strategy of feature selection and aggregation, gaining significant accuracy at marginal computational expense.
arXiv Detail & Related papers (2024-07-29T02:12:11Z)
- AVTENet: Audio-Visual Transformer-based Ensemble Network Exploiting Multiple Experts for Video Deepfake Detection [53.448283629898214]
The recent proliferation of hyper-realistic deepfake videos has drawn attention to the threat of audio and visual forgeries.
Most previous work on detecting AI-generated fake videos uses only the visual modality or only the audio modality.
We propose an Audio-Visual Transformer-based Ensemble Network (AVTENet) framework that considers both acoustic manipulation and visual manipulation.
arXiv Detail & Related papers (2023-10-19T19:01:26Z)
- Learning Trajectory-Aware Transformer for Video Super-Resolution [50.49396123016185]
Video super-resolution aims to restore a sequence of high-resolution (HR) frames from their low-resolution (LR) counterparts.
Existing approaches usually align and aggregate video frames from limited adjacent frames.
We propose a novel Trajectory-aware Transformer for Video Super-Resolution (TTVSR).
arXiv Detail & Related papers (2022-04-08T03:37:39Z)
- Anomaly Crossing: A New Method for Video Anomaly Detection as Cross-domain Few-shot Learning [32.0713939637202]
Video anomaly detection aims to identify abnormal events that occurred in videos.
Most previous approaches learn only from normal videos using unsupervised or semi-supervised methods.
We propose a new learning paradigm by making full use of both normal and abnormal videos for video anomaly detection.
arXiv Detail & Related papers (2021-12-12T20:49:38Z)
- PAT: Pseudo-Adversarial Training For Detecting Adversarial Videos [20.949656274807904]
We propose a novel yet simple algorithm called Pseudo-Adversarial Training (PAT) to detect the adversarial frames in a video without requiring knowledge of the attack.
Experimental results on UCF-101 and 20BN-Jester datasets show that PAT can detect the adversarial video frames and videos with a high detection rate.
arXiv Detail & Related papers (2021-09-13T04:05:46Z)
- Few-Shot Learning for Video Object Detection in a Transfer-Learning Scheme [70.45901040613015]
We study the new problem of few-shot learning for video object detection.
We employ a transfer-learning framework to effectively train the video object detector on a large number of base-class objects and a few video clips of novel-class objects.
arXiv Detail & Related papers (2021-03-26T20:37:55Z)
- Robust Unsupervised Video Anomaly Detection by Multi-Path Frame Prediction [61.17654438176999]
We propose a novel and robust unsupervised video anomaly detection method by frame prediction with proper design.
Our proposed method obtains the frame-level AUROC score of 88.3% on the CUHK Avenue dataset.
arXiv Detail & Related papers (2020-11-05T11:34:12Z)
- Uncertainty-Aware Weakly Supervised Action Detection from Untrimmed Videos [82.02074241700728]
In this paper, we present a spatio-temporal action recognition model that is trained with only video-level labels.
Our method leverages per-frame person detectors that have been trained on large image datasets within a Multiple Instance Learning framework.
We show how we can apply our method in cases where the standard Multiple Instance Learning assumption, that each bag contains at least one instance with the specified label, is invalid.
arXiv Detail & Related papers (2020-07-21T10:45:05Z)
- TubeTK: Adopting Tubes to Track Multi-Object in a One-Step Training Model [51.14840210957289]
Multi-object tracking is a fundamental vision problem that has been studied for a long time.
Despite the success of Tracking by Detection (TBD), this two-step method is too complicated to train in an end-to-end manner.
We propose a concise end-to-end model, TubeTK, which needs only one-step training by introducing the "bounding-tube" to indicate the temporal-spatial locations of objects in a short video clip.
arXiv Detail & Related papers (2020-06-10T06:45:05Z)