Recurrent Neural Networks for video object detection
- URL: http://arxiv.org/abs/2010.15740v1
- Date: Thu, 29 Oct 2020 16:40:10 GMT
- Title: Recurrent Neural Networks for video object detection
- Authors: Ahmad B Qasim, Arnd Pettirsch
- Abstract summary: This work compares different methods, especially those which use Recurrent Neural Networks to detect objects in videos.
We distinguish between feature-based methods, which feed feature maps of different frames into the recurrent units, box-level methods, which feed bounding boxes with class probabilities into the recurrent units, and methods which use flow networks.
- Score: 0.0
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: There is a large body of scientific work on object detection in images. For many
applications, such as autonomous driving, the actual data on which
classification has to be done are videos. This work compares different methods,
especially those which use Recurrent Neural Networks to detect objects in
videos. We distinguish between feature-based methods, which feed feature maps of
different frames into the recurrent units, box-level methods, which feed
bounding boxes with class probabilities into the recurrent units, and methods
which use flow networks. This study identifies common outcomes of the compared
methods, such as the benefit of including temporal context in object
detection, and states conclusions and guidelines for video object detection
networks.
Related papers
- Video Salient Object Detection via Contrastive Features and Attention Modules [106.33219760012048]
We propose a network with attention modules to learn contrastive features for video salient object detection.
A co-attention formulation is utilized to combine the low-level and high-level features.
We show that the proposed method requires less computation, and performs favorably against the state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-03T17:40:32Z)
- Experience feedback using Representation Learning for Few-Shot Object Detection on Aerial Images [2.8560476609689185]
The performance of our method is assessed on DOTA, a large-scale remote sensing images dataset.
It highlights in particular some intrinsic weaknesses for the few-shot object detection task.
arXiv Detail & Related papers (2021-09-27T13:04:53Z)
- Spatio-Temporal Perturbations for Video Attribution [33.19422909074655]
The attribution method provides a direction for interpreting opaque neural networks in a visual way.
We investigate a generic perturbation-based attribution method that is compatible with diversified video understanding networks.
We introduce reliable objective metrics which are checked by a newly proposed reliability measurement.
arXiv Detail & Related papers (2021-09-01T07:44:16Z)
- JOKR: Joint Keypoint Representation for Unsupervised Cross-Domain Motion Retargeting [53.28477676794658]
Unsupervised motion retargeting in videos has seen substantial advancements through the use of deep neural networks.
We introduce JOKR - a JOint Keypoint Representation that handles both the source and target videos, without requiring any object prior or data collection.
We evaluate our method both qualitatively and quantitatively, and demonstrate that our method handles various cross-domain scenarios, such as different animals, different flowers, and humans.
arXiv Detail & Related papers (2021-06-17T17:32:32Z)
- Issues in Object Detection in Videos using Common Single-Image CNNs [0.0]
Object detection is used in many applications such as industrial process, medical imaging analysis, and autonomous vehicles.
For applications such as autonomous vehicles, it is crucial that the object detection system can identify objects through multiple frames in video.
Many neural networks have been used for object detection, and if there were a way of connecting objects between frames, these problems could be eliminated.
A dataset must be created with images that represent consecutive video frames and have matching ground-truth layers.
arXiv Detail & Related papers (2021-05-26T20:33:51Z)
- Few-Shot Learning for Video Object Detection in a Transfer-Learning Scheme [70.45901040613015]
We study the new problem of few-shot learning for video object detection.
We employ a transfer-learning framework to effectively train the video object detector on a large number of base-class objects and a few video clips of novel-class objects.
arXiv Detail & Related papers (2021-03-26T20:37:55Z)
- DyStaB: Unsupervised Object Segmentation via Dynamic-Static Bootstrapping [72.84991726271024]
We describe an unsupervised method to detect and segment portions of images of live scenes that are seen moving as a coherent whole.
Our method first partitions the motion field by minimizing the mutual information between segments.
It uses the segments to learn object models that can be used for detection in a static image.
arXiv Detail & Related papers (2020-08-16T22:05:13Z)
- Learning to associate detections for real-time multiple object tracking [0.0]
This study investigates the use of artificial neural networks to learn a similarity function that can be used among detections.
The proposed tracker matches the results obtained by state-of-the-art methods while running 58% faster than a recent, similar method used as a baseline.
arXiv Detail & Related papers (2020-07-12T17:08:41Z)
- Unsupervised Learning of Video Representations via Dense Trajectory Clustering [86.45054867170795]
This paper addresses the task of unsupervised learning of representations for action recognition in videos.
We first propose to adapt two top performing objectives in this class - instance recognition and local aggregation.
We observe promising performance, but qualitative analysis shows that the learned representations fail to capture motion patterns.
arXiv Detail & Related papers (2020-06-28T22:23:03Z)
- Ventral-Dorsal Neural Networks: Object Detection via Selective Attention [51.79577908317031]
We propose a new framework called Ventral-Dorsal Networks (VDNets).
Inspired by the structure of the human visual system, we propose the integration of a "Ventral Network" and a "Dorsal Network".
Our experimental results reveal that the proposed method outperforms state-of-the-art object detection approaches.
arXiv Detail & Related papers (2020-05-15T23:57:36Z)
- Video Contents Understanding using Deep Neural Networks [0.0]
We propose a novel application of Transfer Learning to classify video-frame sequences over multiple classes.
This representation is achieved with the advent of the "deep neural network" (DNN).
arXiv Detail & Related papers (2020-04-29T05:18:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.