FOVEA: Foveated Image Magnification for Autonomous Navigation
- URL: http://arxiv.org/abs/2108.12102v1
- Date: Fri, 27 Aug 2021 03:07:55 GMT
- Title: FOVEA: Foveated Image Magnification for Autonomous Navigation
- Authors: Chittesh Thavamani, Mengtian Li, Nicolas Cebron, Deva Ramanan
- Abstract summary: We propose an attentional approach that elastically magnifies certain regions while maintaining a small input canvas.
On the autonomous driving datasets Argoverse-HD and BDD100K, we show our proposed method boosts the detection AP over standard Faster R-CNN, with and without finetuning.
- Score: 53.69803081925454
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Efficient processing of high-resolution video streams is safety-critical for
many robotics applications such as autonomous driving. Image downsampling is a
commonly adopted technique to ensure the latency constraint is met. However,
this naive approach greatly restricts an object detector's capability to
identify small objects. In this paper, we propose an attentional approach that
elastically magnifies certain regions while maintaining a small input canvas.
The magnified regions are those that are believed to have a high probability of
containing an object, whose signal can come from a dataset-wide prior or
frame-level prior computed from recent object predictions. The magnification is
implemented by a KDE-based mapping to transform the bounding boxes into warping
parameters, which are then fed into an image sampler with anti-cropping
regularization. The detector is then fed with the warped image and we apply a
differentiable backward mapping to get bounding box outputs in the original
space. Our regional magnification allows algorithms to make better use of
high-resolution input without incurring the cost of high-resolution processing.
On the autonomous driving datasets Argoverse-HD and BDD100K, we show our
proposed method boosts the detection AP over standard Faster R-CNN, with and
without finetuning. Additionally, building on top of the previous
state-of-the-art in streaming detection, our method sets a new record for
streaming AP on Argoverse-HD (from 17.8 to 23.0 on a GTX 1080 Ti GPU),
suggesting that it has achieved a superior accuracy-latency tradeoff.
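Below is a minimal, illustrative sketch (not the authors' implementation) of the saliency-guided warping idea the abstract describes: prior bounding boxes are turned into a KDE-style saliency map, the saliency attracts a separable sampling grid in the spirit of saliency-guided spatial transforms, and the grid magnifies salient regions inside a fixed-size canvas. The anti-cropping regularization and the differentiable backward mapping of boxes are omitted for brevity, and all function names and parameter values here are assumptions for illustration.

```python
# Illustrative sketch only; not the FOVEA code. Anti-cropping regularization
# and the backward box mapping from the paper are omitted.
import torch
import torch.nn.functional as F


def boxes_to_saliency(boxes, out_hw, sigma=0.05):
    """Approximate a KDE saliency map from box centers (normalized xyxy boxes)."""
    H, W = out_hw
    ys = torch.linspace(0.0, 1.0, H).view(H, 1)
    xs = torch.linspace(0.0, 1.0, W).view(1, W)
    saliency = torch.full((H, W), 1e-3)          # small uniform floor
    for x0, y0, x1, y1 in boxes:
        cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
        saliency += torch.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
    return saliency


def axis_warp(marginal, kernel_sigma=0.15):
    """1-D saliency-attracted warp: a source coordinate in [-1, 1] for each
    uniform output coordinate, so more output pixels land on salient rows/cols."""
    n = marginal.numel()
    src = torch.linspace(-1.0, 1.0, n)           # candidate source coordinates
    out = src.view(n, 1)                         # uniform output coordinates
    k = torch.exp(-((out - src.view(1, n)) ** 2) / (2 * kernel_sigma ** 2))
    w = k * marginal.view(1, n)                  # saliency-weighted attraction
    return (w * src.view(1, n)).sum(dim=1) / w.sum(dim=1)


def foveal_warp(image, boxes, out_hw=(600, 960)):
    """Warp a (1, C, H, W) image so regions near the prior boxes are magnified
    while the whole frame still fits in a small out_hw canvas."""
    saliency = boxes_to_saliency(boxes, out_hw)
    y_src = axis_warp(saliency.sum(dim=1))       # per-output-row source y
    x_src = axis_warp(saliency.sum(dim=0))       # per-output-column source x
    gy, gx = torch.meshgrid(y_src, x_src, indexing="ij")
    grid = torch.stack([gx, gy], dim=-1).unsqueeze(0)   # (1, H_out, W_out, 2)
    return F.grid_sample(image, grid, align_corners=True)


# Example: magnify around a previously detected object near the image center.
img = torch.rand(1, 3, 1200, 1920)
warped = foveal_warp(img, boxes=[(0.45, 0.5, 0.55, 0.6)])
print(warped.shape)  # torch.Size([1, 3, 600, 960])
```

In the paper's pipeline, boxes predicted on the warped canvas are then mapped back to original-image coordinates through a differentiable backward mapping, so the whole system can be trained and evaluated in the original space.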
Related papers
- Practical Video Object Detection via Feature Selection and Aggregation [18.15061460125668]
Video object detection (VOD) must contend with high across-frame variation in object appearance and diverse degradation in some frames.
Most contemporary aggregation methods are tailored for two-stage detectors and suffer from high computational costs.
This study proposes a simple yet effective feature selection and aggregation strategy, gaining significant accuracy at marginal computational cost.
arXiv Detail & Related papers (2024-07-29T02:12:11Z) - ESOD: Efficient Small Object Detection on High-Resolution Images [36.80623357577051]
Small objects are usually sparsely distributed and locally clustered.
As a result, massive feature extraction computation is wasted on non-target background areas of images.
We propose to reuse the detector's backbone to conduct feature-level object-seeking and patch-slicing.
arXiv Detail & Related papers (2024-07-23T12:21:23Z) - Neural Fields with Thermal Activations for Arbitrary-Scale Super-Resolution [56.089473862929886]
We present a novel way to design neural fields such that points can be queried with an adaptive Gaussian PSF.
With its theoretically guaranteed anti-aliasing, our method sets a new state of the art for arbitrary-scale single image super-resolution.
arXiv Detail & Related papers (2023-11-29T14:01:28Z) - Small Object Detection via Coarse-to-fine Proposal Generation and
Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z) - OPA-3D: Occlusion-Aware Pixel-Wise Aggregation for Monocular 3D Object
Detection [51.153003057515754]
OPA-3D is a single-stage, end-to-end, Occlusion-Aware Pixel-Wise Aggregation network.
It jointly estimates dense scene depth with depth-bounding box residuals and object bounding boxes.
It outperforms state-of-the-art methods on the main Car category.
arXiv Detail & Related papers (2022-11-02T14:19:13Z) - DARDet: A Dense Anchor-free Rotated Object Detector in Aerial Images [11.45718985586972]
We propose a dense anchor-free rotated object detector (DARDet) for rotated object detection in aerial images.
Our DARDet directly predicts five parameters of rotated boxes at each foreground pixel of feature maps.
Our method achieves state-of-the-art performance on three commonly used aerial objects datasets.
arXiv Detail & Related papers (2021-10-03T15:28:14Z) - DAFNe: A One-Stage Anchor-Free Deep Model for Oriented Object Detection [16.21161769128316]
We present DAFNe: A one-stage Anchor-Free deep Network for oriented object detection.
As an anchor-free model, DAFNe reduces the prediction complexity by refraining from employing bounding box anchors.
We introduce an orientation-aware generalization of the center-ness function for arbitrarily oriented bounding boxes to down-weight low-quality predictions.
arXiv Detail & Related papers (2021-09-13T17:37:20Z) - MRDet: A Multi-Head Network for Accurate Oriented Object Detection in
Aerial Images [51.227489316673484]
We propose an arbitrary-oriented region proposal network (AO-RPN) to generate oriented proposals transformed from horizontal anchors.
To obtain accurate bounding boxes, we decouple the detection task into multiple subtasks and propose a multi-head network.
Each head is specially designed to learn the features optimal for the corresponding task, which allows our network to detect objects accurately.
arXiv Detail & Related papers (2020-12-24T06:36:48Z) - Dense Label Encoding for Boundary Discontinuity Free Rotation Detection [69.75559390700887]
This paper explores a relatively under-studied classification-based methodology for rotation detection.
We propose new techniques to push its frontier in two aspects.
Experiments and visual analysis on large-scale public datasets for aerial images show the effectiveness of our approach.
arXiv Detail & Related papers (2020-11-19T05:42:02Z) - LR-CNN: Local-aware Region CNN for Vehicle Detection in Aerial Imagery [43.91170581467171]
State-of-the-art object detection approaches have difficulties detecting dense, small targets with arbitrary orientation in large aerial images.
We present the Local-aware Region Convolutional Neural Network (LR-CNN), a novel two-stage approach for vehicle detection in aerial imagery.
arXiv Detail & Related papers (2020-05-28T19:57:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.