Field-of-View IoU for Object Detection in 360° Images
- URL: http://arxiv.org/abs/2202.03176v1
- Date: Mon, 7 Feb 2022 14:01:59 GMT
- Title: Field-of-View IoU for Object Detection in 360° Images
- Authors: Miao Cao, Satoshi Ikehata, and Kiyoharu Aizawa
- Abstract summary: We propose two fundamental techniques, Field-of-View IoU (FoV-IoU) and 360Augmentation, for object detection in 360° images.
FoV-IoU computes the intersection-over-union of two Field-of-View bounding boxes in a spherical image and can be used for training, inference, and evaluation.
360Augmentation is a data augmentation technique specific to the 360° object detection task; it randomly rotates a spherical image to address the bias introduced by the sphere-to-plane projection.
- Score: 36.72543749626039
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 360° cameras have gained popularity over the last few years. In this
paper, we propose two fundamental techniques -- Field-of-View IoU (FoV-IoU) and
360Augmentation for object detection in 360° images. Although most object
detection neural networks designed for the perspective images are applicable to
360° images in equirectangular projection (ERP) format, their performance
deteriorates owing to the distortion in ERP images. Our method can be readily
integrated with existing perspective object detectors and significantly
improves the performance. The FoV-IoU computes the intersection-over-union of
two Field-of-View bounding boxes in a spherical image which could be used for
training, inference, and evaluation while 360Augmentation is a data
augmentation technique specific to 360° object detection task which
randomly rotates a spherical image and solves the bias due to the
sphere-to-plane projection. We conduct extensive experiments on the 360indoor
dataset with different types of perspective object detectors and show the
consistent effectiveness of our method.
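The two techniques named in the abstract can be illustrated with a minimal Python sketch. It is a sketch under stated assumptions, not the authors' implementation. It assumes each box is given as (theta, phi, alpha, beta) in radians: center longitude, center latitude, horizontal FoV, and vertical FoV. It approximates the overlap by scaling the longitude offset by the cosine of the mean latitude, and it covers only the simplest rotation case, a rotation about the vertical axis, which in ERP is a circular column shift. The function names `fov_iou` and `rotate_erp_yaw` and the column-to-longitude convention are hypothetical.

```python
import math

import numpy as np


def fov_iou(box_a, box_b):
    """Approximate IoU of two FoV boxes (theta, phi, alpha, beta), in radians.

    Assumption: the longitude offset is shrunk by cos(mean latitude) and the
    boxes are then treated as rectangles in angular space.
    """
    theta_a, phi_a, alpha_a, beta_a = box_a
    theta_b, phi_b, alpha_b, beta_b = box_b

    # Longitude difference wrapped to [-pi, pi) so boxes across the seam match.
    d_theta = (theta_b - theta_a + math.pi) % (2 * math.pi) - math.pi
    # Scale the offset by the cosine of the mean latitude.
    d_fov = d_theta * math.cos((phi_a + phi_b) / 2)

    # Horizontal overlap, measured in a frame centered on box_a.
    left = max(-alpha_a / 2, d_fov - alpha_b / 2)
    right = min(alpha_a / 2, d_fov + alpha_b / 2)
    # Vertical overlap, measured directly in latitude.
    bottom = max(phi_a - beta_a / 2, phi_b - beta_b / 2)
    top = min(phi_a + beta_a / 2, phi_b + beta_b / 2)

    inter = max(0.0, right - left) * max(0.0, top - bottom)
    union = alpha_a * beta_a + alpha_b * beta_b - inter
    return inter / union if union > 0 else 0.0


def rotate_erp_yaw(image, boxes, shift_rad):
    """Rotate an ERP image (H x W or H x W x C) about the vertical axis.

    Assumption: column index grows with longitude, so a positive shift moves
    content toward larger longitudes. Only the box centers change.
    """
    width = image.shape[1]
    shift_px = int(round(shift_rad / (2 * math.pi) * width))
    rotated = np.roll(image, shift_px, axis=1)

    # Use the angle actually applied after rounding to whole pixels.
    applied = shift_px * 2 * math.pi / width
    new_boxes = [
        ((t + applied + math.pi) % (2 * math.pi) - math.pi, p, a, b)
        for (t, p, a, b) in boxes
    ]
    return rotated, new_boxes
```

A rotation that also tilts the sphere (changing latitudes) would require resampling the image rather than a column shift, which this sketch does not attempt.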
Related papers
- Towards Generalizable Multi-Camera 3D Object Detection via Perspective Debiasing [28.874014617259935]
Multi-Camera 3D Object Detection (MC3D-Det) has gained prominence with the advent of bird's-eye view (BEV) approaches.
We propose a novel method that aligns 3D detection with 2D camera plane results, ensuring consistent and accurate detections.
arXiv Detail & Related papers (2023-10-17T15:31:28Z)
- Distortion-aware Transformer in 360° Salient Object Detection [44.74647420381127]
We propose a Transformer-based model called DATFormer to address the distortion problem.
To exploit the unique characteristics of 360° data, we present a learnable relation matrix.
Our model outperforms existing 2D SOD (salient object detection) and 360 SOD methods.
arXiv Detail & Related papers (2023-08-07T07:28:24Z)
- Geometric-aware Pretraining for Vision-centric 3D Object Detection [77.7979088689944]
We propose a novel geometric-aware pretraining framework called GAPretrain.
GAPretrain serves as a plug-and-play solution that can be flexibly applied to multiple state-of-the-art detectors.
We achieve 46.2 mAP and 55.5 NDS on the nuScenes val set using the BEVFormer method, with a gain of 2.7 and 2.1 points, respectively.
arXiv Detail & Related papers (2023-04-06T14:33:05Z)
- Multi-Projection Fusion and Refinement Network for Salient Object Detection in 360° Omnidirectional Image [141.10227079090419]
We propose a Multi-Projection Fusion and Refinement Network (MPFR-Net) to detect salient objects in 360° omnidirectional images.
MPFR-Net uses the equirectangular projection image and four corresponding cube-unfolding images as inputs.
Experimental results on two omnidirectional datasets demonstrate that the proposed approach outperforms the state-of-the-art methods both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-12-23T14:50:40Z)
- View-aware Salient Object Detection for 360° Omnidirectional Image [33.43250302656753]
We construct a large-scale 360° ISOD dataset with object-level pixel-wise annotation on equirectangular projection (ERP).
Inspired by humans' observing process, we propose a view-aware salient object detection method based on a Sample Adaptive View Transformer (SAVT) module.
arXiv Detail & Related papers (2022-09-27T07:44:08Z)
- Towards Model Generalization for Monocular 3D Object Detection [57.25828870799331]
We present an effective unified camera-generalized paradigm (CGP) for Mono3D object detection.
We also propose the 2D-3D geometry-consistent object scaling strategy (GCOS) to bridge the gap via an instance-level augment.
Our method called DGMono3D achieves remarkable performance on all evaluated datasets and surpasses the SoTA unsupervised domain adaptation scheme.
arXiv Detail & Related papers (2022-05-23T23:05:07Z)
- BirdNet+: End-to-End 3D Object Detection in LiDAR Bird's Eye View [117.44028458220427]
On-board 3D object detection in autonomous vehicles often relies on geometry information captured by LiDAR devices.
We present a fully end-to-end 3D object detection framework that can infer oriented 3D boxes solely from BEV images.
arXiv Detail & Related papers (2020-03-09T15:08:40Z)
- A Fixation-based 360° Benchmark Dataset for Salient Object Detection [21.314578493964333]
Fixation prediction (FP) in panoramic contents has been widely investigated along with the booming trend of virtual reality (VR) applications.
Salient object detection (SOD) has been seldom explored in 360° images due to the lack of datasets representative of real scenes.
arXiv Detail & Related papers (2020-01-22T11:16:39Z)
- Visual Question Answering on 360° Images [96.00046925811515]
VQA 360 is a novel task of visual question answering on 360° images.
We collect the first VQA 360 dataset, containing around 17,000 real-world image-question-answer triplets for a variety of question types.
arXiv Detail & Related papers (2020-01-10T08:18:21Z)