Related papers: Motion Vector Extrapolation for Video Object Detection

URL: http://arxiv.org/abs/2104.08918v1
Date: Sun, 18 Apr 2021 17:26:37 GMT
Title: Motion Vector Extrapolation for Video Object Detection
Authors: Julian True and Naimul Khan
Abstract summary: MOVEX enables low latency video object detection on common CPU based systems. We show that our approach significantly reduces the baseline latency of any given object detector. Further latency reduction, up to 25x lower than the original latency, can be achieved with minimal accuracy loss.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Despite the continued successes of computationally efficient deep neural network architectures for video object detection, performance continually arrives at the great trilemma of speed versus accuracy versus computational resources (pick two). Current attempts to exploit temporal information in video data to overcome this trilemma are bottlenecked by the state-of-the-art in object detection models. We present Motion Vector Extrapolation (MOVEX), a technique which performs video object detection through the use of off-the-shelf object detectors alongside existing optical flow based motion estimation techniques in parallel. Through a set of experiments on the benchmark MOT20 dataset, we demonstrate that our approach significantly reduces the baseline latency of any given object detector without sacrificing any accuracy. Further latency reduction, up to 25x lower than the original latency, can be achieved with minimal accuracy loss. MOVEX enables low latency video object detection on common CPU based systems, thus allowing for high performance video object detection beyond the domain of GPU computing. The code is available at https://github.com/juliantrue/movex.
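Illustrative sketch (not taken from the paper or its repository): the abstract describes pairing a detector with motion estimation so that boxes can be carried forward between detector outputs. The minimal Python example below assumes one simple way to do that, shifting each box by the mean displacement of the motion vectors whose block centres fall inside it. The names extrapolate_box and propagate, and the (x, y, dx, dy) motion-vector layout, are hypothetical and chosen only for illustration; the sketch is sequential and does not model the parallel execution the abstract mentions.

```python
from typing import List, Tuple

# Box as (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]
# Hypothetical motion-vector record: block centre (x, y) and displacement (dx, dy).
MotionVector = Tuple[float, float, float, float]


def extrapolate_box(box: Box, vectors: List[MotionVector]) -> Box:
    """Shift a box by the mean displacement of the motion vectors inside it."""
    x1, y1, x2, y2 = box
    inside = [(dx, dy) for (x, y, dx, dy) in vectors
              if x1 <= x <= x2 and y1 <= y <= y2]
    if not inside:
        # No motion evidence for this box: keep its last known position.
        return box
    mean_dx = sum(dx for dx, _ in inside) / len(inside)
    mean_dy = sum(dy for _, dy in inside) / len(inside)
    return (x1 + mean_dx, y1 + mean_dy, x2 + mean_dx, y2 + mean_dy)


def propagate(boxes: List[Box],
              per_frame_vectors: List[List[MotionVector]]) -> List[List[Box]]:
    """Carry detector boxes forward over frames that only have motion vectors."""
    history = [list(boxes)]
    for vectors in per_frame_vectors:
        boxes = [extrapolate_box(b, vectors) for b in boxes]
        history.append(boxes)
    return history
```

In a setup like this, the expensive detector would run only occasionally, and propagate would fill in the frames between detections; how often to re-run the detector is the latency versus accuracy trade-off the abstract refers to.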

Related papers

ETAD: A Unified Framework for Efficient Temporal Action Detection [70.21104995731085]
Untrimmed video understanding such as temporal action detection (TAD) often suffers from the pain of huge demand for computing resources. We build a unified framework for efficient end-to-end temporal action detection (ETAD). ETAD achieves state-of-the-art performance on both THUMOS-14 and ActivityNet-1.3.
arXiv Detail & Related papers (2022-05-14T21:16:21Z)
SALISA: Saliency-based Input Sampling for Efficient Video Object Detection [58.22508131162269]
We propose SALISA, a novel non-uniform SALiency-based Input SAmpling technique for video object detection. We show that SALISA significantly improves the detection of small objects.
arXiv Detail & Related papers (2022-04-05T17:59:51Z)
Implicit Motion Handling for Video Camouflaged Object Detection [60.98467179649398]
We propose a new video camouflaged object detection (VCOD) framework. It can exploit both short-term and long-term temporal consistency to detect camouflaged objects from video frames.
arXiv Detail & Related papers (2022-03-14T17:55:41Z)
VideoPose: Estimating 6D object pose from videos [14.210010379733017]
We introduce a simple yet effective algorithm that uses convolutional neural networks to directly estimate object poses from videos. Our proposed network takes a pre-trained 2D object detector as input, and aggregates visual features through a recurrent neural network to make predictions at each frame. Experimental evaluation on the YCB-Video dataset shows that our approach is on par with the state-of-the-art algorithms.
arXiv Detail & Related papers (2021-11-20T20:57:45Z)
Video Salient Object Detection via Contrastive Features and Attention Modules [106.33219760012048]
We propose a network with attention modules to learn contrastive features for video salient object detection. A co-attention formulation is utilized to combine the low-level and high-level features. We show that the proposed method requires less computation, and performs favorably against the state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-03T17:40:32Z)
Parallel Detection for Efficient Video Analytics at the Edge [5.547133811014004]
Deep Neural Network (DNN) trained object detectors are widely deployed in mission-critical systems for real time video analytics at the edge. A common performance requirement in mission-critical edge services is the near real-time latency of online object detection on edge devices. This paper addresses these problems by exploiting multi-model multi-device detection parallelism for fast object detection in edge systems.
arXiv Detail & Related papers (2021-07-27T02:50:46Z)
Analysis of voxel-based 3D object detection methods efficiency for real-time embedded systems [93.73198973454944]
Two popular voxel-based 3D object detection methods are studied in this paper. Our experiments show that these methods mostly fail to detect distant small objects due to the sparsity of the input point clouds at large distances. Our findings suggest that a considerable part of the computations of existing methods is focused on locations of the scene that do not contribute to successful detection.
arXiv Detail & Related papers (2021-05-21T12:40:59Z)
FMODetect: Robust Detection and Trajectory Estimation of Fast Moving Objects [110.29738581961955]
We propose the first learning-based approach for detection and trajectory estimation of fast moving objects. The proposed method first detects all fast moving objects as a truncated distance function to the trajectory. For the sharp appearance estimation, we propose an energy minimization based deblurring.
arXiv Detail & Related papers (2020-12-15T11:05:34Z)
Robust and efficient post-processing for video object detection [9.669942356088377]
This work introduces a novel post-processing pipeline that overcomes some of the limitations of previous post-processing methods. Our method improves the results of state-of-the-art video-specific detectors, especially regarding fast moving objects. Applied to efficient still image detectors, such as YOLO, it provides results comparable to those of much more computationally intensive detectors.
arXiv Detail & Related papers (2020-09-23T10:47:24Z)
Joint Detection and Tracking in Videos with Identification Features [36.55599286568541]
We propose the first joint optimization of detection, tracking and re-identification features for videos. Our method reaches the state of the art on MOT; it ranks 1st in the UA-DETRAC'18 tracking challenge among online trackers, and 3rd overall.
arXiv Detail & Related papers (2020-05-21T21:06:40Z)
Streaming Object Detection for 3-D Point Clouds [29.465873948076766]
LiDAR provides a prominent sensory modality that informs many existing perceptual systems. The latency for perceptual systems based on point cloud data can be dominated by the amount of time for a complete rotational scan. We show how operating on LiDAR data in its native streaming formulation offers several advantages for self driving object detection.
arXiv Detail & Related papers (2020-05-04T21:55:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.