Learned Two-Plane Perspective Prior based Image Resampling for Efficient
Object Detection
- URL: http://arxiv.org/abs/2303.14311v1
- Date: Sat, 25 Mar 2023 00:43:44 GMT
- Title: Learned Two-Plane Perspective Prior based Image Resampling for Efficient
Object Detection
- Authors: Anurag Ghosh, N. Dinesh Reddy, Christoph Mertz, Srinivasa G.
Narasimhan
- Abstract summary: Real-time efficient perception is critical for autonomous navigation and city scale sensing.
In this work, we propose a learnable geometry-guided prior that incorporates rough geometry of the 3D scene.
Our approach improves detection by +4.1 $AP_S$ (+39%) and real-time performance by +5.3 $sAP_S$ (+63%) for small objects over the state-of-the-art (SOTA).
- Score: 20.886999159134138
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Real-time efficient perception is critical for autonomous navigation and city
scale sensing. Orthogonal to architectural improvements, streaming perception
approaches have exploited adaptive sampling to improve real-time detection
performance. In this work, we propose a learnable geometry-guided prior that
incorporates rough geometry of the 3D scene (a ground plane and a plane above)
to resample images for efficient object detection. This significantly improves
small and far-away object detection performance while also being more efficient
both in terms of latency and memory. For autonomous navigation, using the same
detector and scale, our approach improves detection by +4.1 $AP_{S}$ (+39%) and
real-time performance by +5.3 $sAP_{S}$ (+63%) for small objects over the
state-of-the-art (SOTA). For fixed traffic cameras, our approach detects
small objects at image scales other methods cannot. At the same scale, our
approach improves detection of small objects by 195% (+12.5 $AP_{S}$) over
naive-downsampling and 63% (+4.2 $AP_{S}$) over SOTA.
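The paper's two-plane prior is learned end-to-end, but the core mechanism it builds on — non-uniform image resampling that allocates more output pixels to geometrically important regions — can be illustrated generically. The sketch below is a hypothetical 1D row warp driven by a per-row importance profile (here a hand-made peak at an assumed horizon row), not the authors' implementation:

```python
import numpy as np

def rowwise_resample(image, saliency, out_rows=None):
    """Non-uniformly resample image rows so that high-saliency rows
    (e.g. far-away regions near the horizon under a ground-plane
    prior) receive proportionally more output pixels."""
    h = image.shape[0]
    out_rows = out_rows or h
    # The cumulative saliency acts as a monotone warp from output
    # row position back to a source row index.
    cdf = np.cumsum(saliency.astype(np.float64))
    cdf /= cdf[-1]                                  # normalize to [0, 1]
    targets = np.linspace(0.0, 1.0, out_rows)
    src_rows = np.searchsorted(cdf, targets).clip(0, h - 1)
    return image[src_rows]

# Toy image whose pixel value equals its row index, and a saliency
# profile peaked at an assumed horizon row (row 40 of 100).
img = np.arange(100)[:, None, None] * np.ones((1, 4, 3))
sal = 1.0 + 10.0 * np.exp(-((np.arange(100) - 40.0) ** 2) / (2 * 5.0 ** 2))
warped = rowwise_resample(img, sal)
```

After detection on the warped image, boxes would be mapped back through the inverse of the same warp; the learned version replaces the hand-made saliency profile with one predicted from the scene geometry.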
Related papers
- SOD-YOLOv8 -- Enhancing YOLOv8 for Small Object Detection in Traffic Scenes [1.3812010983144802]
Small Object Detection YOLOv8 (SOD-YOLOv8) is designed for scenarios involving numerous small objects.
SOD-YOLOv8 significantly improves small object detection, surpassing widely used models in various metrics.
In dynamic real-world traffic scenes, SOD-YOLOv8 demonstrated notable improvements in diverse conditions.
arXiv Detail & Related papers (2024-08-08T23:05:25Z)
- ESOD: Efficient Small Object Detection on High-Resolution Images [36.80623357577051]
Small objects are usually sparsely distributed and locally clustered.
Massive feature extraction computations are wasted on the non-target background area of images.
We propose to reuse the detector's backbone to conduct feature-level object-seeking and patch-slicing.
arXiv Detail & Related papers (2024-07-23T12:21:23Z)
- Fewer is More: Efficient Object Detection in Large Aerial Images [59.683235514193505]
This paper presents an Objectness Activation Network (OAN) to help detectors focus on fewer patches but achieve more efficient inference and more accurate results.
Using OAN, all five detectors acquire more than 30.0% speed-up on three large-scale aerial image datasets.
We extend our OAN to driving-scene object detection and 4K video object detection, boosting the detection speed by 112.1% and 75.0%, respectively.
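The general patch-gating idea behind this line of work — score patches cheaply for objectness and run the expensive detector only on patches above a threshold — can be sketched as follows. The scoring function and threshold here are illustrative stand-ins, not OAN's actual architecture:

```python
import numpy as np

def select_patches(image, patch, score_fn, thresh=0.5):
    """Split an image into non-overlapping square patches and keep only
    those whose coarse objectness score exceeds `thresh`; the detector
    then runs on the kept patches alone.
    Returns a list of ((y, x), patch_array) for the kept patches."""
    h, w = image.shape[:2]
    kept = []
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            p = image[y:y + patch, x:x + patch]
            if score_fn(p) >= thresh:
                kept.append(((y, x), p))
    return kept

# Toy stand-in for a learned objectness head: mean intensity.
img = np.zeros((8, 8))
img[0:4, 4:8] = 1.0   # one bright quadrant "contains objects"
kept = select_patches(img, 4, lambda p: p.mean())
```

The speed-up comes from skipping the detector on rejected patches, at the cost of missing objects the coarse scorer overlooks.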
arXiv Detail & Related papers (2022-12-26T12:49:47Z)
- SALISA: Saliency-based Input Sampling for Efficient Video Object Detection [58.22508131162269]
We propose SALISA, a novel non-uniform SALiency-based Input SAmpling technique for video object detection.
We show that SALISA significantly improves the detection of small objects.
arXiv Detail & Related papers (2022-04-05T17:59:51Z)
- Analysis and Adaptation of YOLOv4 for Object Detection in Aerial Images [0.0]
Our work shows the adaptation of the popular YOLOv4 framework for predicting the objects and their locations in aerial images.
The trained model resulted in a mean average precision (mAP) of 45.64% with an inference speed reaching 8.7 FPS on the Tesla K80 GPU.
A comparative study with several contemporary aerial object detectors proved that YOLOv4 performed better, implying a more suitable detection algorithm to incorporate on aerial platforms.
arXiv Detail & Related papers (2022-03-18T23:51:09Z)
- Tackling the Background Bias in Sparse Object Detection via Cropped Windows [17.547911599819837]
We propose a simple tiling method that improves the detection capability in the remote sensing case without modifying the model itself.
The procedure was validated on three different data sets and outperformed similar approaches in performance and speed.
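Tiling of this kind amounts to enumerating overlapping crop windows that cover the full image, running the unmodified detector on each, and merging the results. A minimal sketch of the window enumeration (the paper's exact tiling parameters are not specified here; sizes are illustrative, and the image is assumed to be at least one tile wide and tall):

```python
def tile_windows(width, height, tile, overlap):
    """Yield (x0, y0, x1, y1) square crop windows of side `tile` that
    cover a width x height image with at least `overlap` pixels of
    overlap between neighbors; the final row/column of tiles is clamped
    to end exactly at the image border."""
    step = tile - overlap
    xs = list(range(0, max(width - tile, 0) + 1, step))
    if xs[-1] + tile < width:
        xs.append(width - tile)
    ys = list(range(0, max(height - tile, 0) + 1, step))
    if ys[-1] + tile < height:
        ys.append(height - tile)
    for y in ys:
        for x in xs:
            yield (x, y, x + tile, y + tile)

tiles = list(tile_windows(1000, 600, 512, 64))
```

Detections from neighboring tiles would then be shifted back to image coordinates and deduplicated, e.g. with non-maximum suppression across tile boundaries.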
arXiv Detail & Related papers (2021-06-04T06:59:56Z)
- Analysis of voxel-based 3D object detection methods efficiency for real-time embedded systems [93.73198973454944]
Two popular voxel-based 3D object detection methods are studied in this paper.
Our experiments show that these methods mostly fail to detect distant small objects due to the sparsity of the input point clouds at large distances.
Our findings suggest that a considerable part of the computations of existing methods is focused on locations of the scene that do not contribute with successful detection.
arXiv Detail & Related papers (2021-05-21T12:40:59Z)
- PLUME: Efficient 3D Object Detection from Stereo Images [95.31278688164646]
Existing methods tackle the problem in two steps: first, depth is estimated and converted into a pseudo-LiDAR point cloud representation; then object detection is performed in 3D space.
We propose a model that unifies these two tasks in the same metric space.
Our approach achieves state-of-the-art performance on the challenging KITTI benchmark, with significantly reduced inference time compared with existing methods.
arXiv Detail & Related papers (2021-01-17T05:11:38Z)
- Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of effective samples is relatively small in the 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3D parameter changed in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
arXiv Detail & Related papers (2020-08-31T17:10:48Z)
- ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection [69.68263074432224]
We present a novel framework named ZoomNet for stereo imagery-based 3D detection.
The pipeline of ZoomNet begins with an ordinary 2D object detection model which is used to obtain pairs of left-right bounding boxes.
To further exploit the abundant texture cues in RGB images for more accurate disparity estimation, we introduce a conceptually straightforward module: adaptive zooming.
arXiv Detail & Related papers (2020-03-01T17:18:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.