Recent Trends in 2D Object Detection and Applications in Video Event
Recognition
- URL: http://arxiv.org/abs/2202.03206v1
- Date: Mon, 7 Feb 2022 14:15:11 GMT
- Title: Recent Trends in 2D Object Detection and Applications in Video Event
Recognition
- Authors: Prithwish Jana and Partha Pratim Mohanta
- Abstract summary: We discuss the pioneering works in object detection, followed by the recent breakthroughs that employ deep learning.
We highlight recent datasets for 2D object detection both in images and videos, and present a comparative performance summary of various state-of-the-art object detection techniques.
- Score: 0.76146285961466
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object detection serves as a significant step in improving performance of
complex downstream computer vision tasks. It has been extensively studied for
many years now and current state-of-the-art 2D object detection techniques
proffer superlative results even in complex images. In this chapter, we discuss
the geometry-based pioneering works in object detection, followed by the recent
breakthroughs that employ deep learning. Some of these use a monolithic
architecture that takes an RGB image as input and passes it to a feed-forward
ConvNet or vision Transformer. These methods thereby predict class probabilities
and bounding-box coordinates in a single unified pipeline. Two-stage
architectures, on the other hand, first generate region proposals and then feed
them to a CNN to extract features and predict object category and bounding box.
We also elaborate upon the applications of object detection in video event
recognition, to achieve better fine-grained video classification performance.
Further, we highlight recent datasets for 2D object detection both in images
and videos, and present a comparative performance summary of various
state-of-the-art object detection techniques.
Related papers
- Neuromorphic Synergy for Video Binarization [54.195375576583864]
Bimodal objects serve as a visual form to embed information that can be easily recognized by vision systems.
Neuromorphic cameras offer new capabilities for alleviating motion blur, but it is non-trivial to first de-blur and then binarize the images in real time.
We propose an event-based binary reconstruction method that leverages the prior knowledge of the bimodal target's properties to perform inference independently in both event space and image space.
We also develop an efficient integration method to propagate this binary image to high frame rate binary video.
arXiv Detail & Related papers (2024-02-20T01:43:51Z)
- UnsMOT: Unified Framework for Unsupervised Multi-Object Tracking with
Geometric Topology Guidance [6.577227592760559]
UnsMOT is a novel framework that combines appearance and motion features of objects with geometric information to provide more accurate tracking.
Experimental results show remarkable performance in terms of HOTA, IDF1, and MOTA metrics in comparison with state-of-the-art methods.
arXiv Detail & Related papers (2023-09-03T04:58:12Z)
- DETR4D: Direct Multi-View 3D Object Detection with Sparse Attention [50.11672196146829]
3D object detection with surround-view images is an essential task for autonomous driving.
We propose DETR4D, a Transformer-based framework that explores sparse attention and direct feature query for 3D object detection in multi-view images.
arXiv Detail & Related papers (2022-12-15T14:18:47Z)
- A Simple Baseline for Multi-Camera 3D Object Detection [94.63944826540491]
3D object detection with surrounding cameras has been a promising direction for autonomous driving.
We present SimMOD, a Simple baseline for Multi-camera Object Detection.
We conduct extensive experiments on the 3D object detection benchmark of nuScenes to demonstrate the effectiveness of SimMOD.
arXiv Detail & Related papers (2022-08-22T03:38:01Z)
- Hybrid Optimized Deep Convolution Neural Network based Learning Model
for Object Detection [0.0]
Object identification is one of the most fundamental and difficult issues in computer vision.
In recent years, deep learning-based object detection techniques have grabbed the public's interest.
In this study, a unique deep learning classification technique is used to create an autonomous object detecting system.
The suggested framework achieves a detection accuracy of 0.9864, which is higher than that of current techniques.
arXiv Detail & Related papers (2022-03-02T04:39:37Z)
- Video Salient Object Detection via Contrastive Features and Attention
Modules [106.33219760012048]
We propose a network with attention modules to learn contrastive features for video salient object detection.
A co-attention formulation is utilized to combine the low-level and high-level features.
We show that the proposed method requires less computation, and performs favorably against the state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-03T17:40:32Z)
- Ensembling object detectors for image and video data analysis [98.26061123111647]
We propose a method for ensembling the outputs of multiple object detectors for improving detection performance and precision of bounding boxes on image data.
We extend it to video data by proposing a two-stage tracking-based scheme for detection refinement.
arXiv Detail & Related papers (2021-02-09T12:38:16Z)
- End-to-end Deep Object Tracking with Circular Loss Function for Rotated
Bounding Box [68.8204255655161]
We introduce a novel end-to-end deep learning method based on the Transformer Multi-Head Attention architecture.
We also present a new type of loss function, which takes into account the bounding box overlap and orientation.
arXiv Detail & Related papers (2020-12-17T17:29:29Z)
- Robust and efficient post-processing for video object detection [9.669942356088377]
This work introduces a novel post-processing pipeline that overcomes some of the limitations of previous post-processing methods.
Our method improves the results of state-of-the-art specific video detectors, especially regarding fast-moving objects.
Applied to efficient still-image detectors such as YOLO, it provides results comparable to those of much more computationally intensive detectors.
arXiv Detail & Related papers (2020-09-23T10:47:24Z)
- Plug & Play Convolutional Regression Tracker for Video Object Detection [37.47222104272429]
Video object detection aims to simultaneously localize the bounding boxes of the objects and identify their classes in a given video.
One challenge for video object detection is to consistently detect all objects across the whole video.
We propose a Plug & Play scale-adaptive convolutional regression tracker for the video object detection task.
arXiv Detail & Related papers (2020-03-02T15:57:55Z)