FCAF3D: Fully Convolutional Anchor-Free 3D Object Detection
- URL: http://arxiv.org/abs/2112.00322v1
- Date: Wed, 1 Dec 2021 07:28:52 GMT
- Title: FCAF3D: Fully Convolutional Anchor-Free 3D Object Detection
- Authors: Danila Rukhovich, Anna Vorontsova, Anton Konushin
- Abstract summary: We present FCAF3D - a first-in-class fully convolutional anchor-free indoor 3D object detection method.
It is a simple yet effective method that uses a voxel representation of a point cloud and processes voxels with sparse convolutions.
It can handle large-scale scenes with minimal runtime through a single fully convolutional feed-forward pass.
- Score: 3.330229314824913
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, promising applications in robotics and augmented reality have
attracted considerable attention to 3D object detection from point clouds. In
this paper, we present FCAF3D - a first-in-class fully convolutional
anchor-free indoor 3D object detection method. It is a simple yet effective
method that uses a voxel representation of a point cloud and processes voxels
with sparse convolutions. FCAF3D can handle large-scale scenes with minimal
runtime through a single fully convolutional feed-forward pass. Existing 3D
object detection methods make prior assumptions on the geometry of objects, and
we argue that it limits their generalization ability. To get rid of any prior
assumptions, we propose a novel parametrization of oriented bounding boxes that
allows obtaining better results in a purely data-driven way. The proposed
method achieves state-of-the-art 3D object detection results in terms of
mAP@0.5 on ScanNet V2 (+4.5), SUN RGB-D (+3.5), and S3DIS (+20.5) datasets. The
code and models are available at https://github.com/samsunglabs/fcaf3d.
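The abstract says the method works on a voxel representation of the point cloud. As a rough illustration of that first step only, the sketch below groups points into voxel cells and averages them. It is not the authors' implementation: the function names `voxelize` and `voxel_centroids` and the 0.05 voxel size are hypothetical, and the sparse convolutions that FCAF3D applies to the voxels are not shown.

```python
from collections import defaultdict
from math import floor


def voxelize(points, voxel_size):
    """Group 3D points by the integer voxel cell that contains them.

    Hypothetical helper for illustration; not taken from the FCAF3D code.
    """
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    cells = defaultdict(list)
    for x, y, z in points:
        # floor() keeps negative coordinates in their own cells.
        key = (floor(x / voxel_size), floor(y / voxel_size), floor(z / voxel_size))
        cells[key].append((x, y, z))
    return dict(cells)


def voxel_centroids(cells):
    """Return one averaged point per occupied voxel."""
    centroids = {}
    for key, pts in cells.items():
        n = len(pts)
        centroids[key] = tuple(sum(p[i] for p in pts) / n for i in range(3))
    return centroids


# Assumed toy input: four points, voxel size 0.05 (arbitrary choice).
points = [(0.01, 0.02, 0.03), (0.04, 0.01, 0.02), (0.26, 0.0, 0.0), (-0.01, 0.0, 0.0)]
cells = voxelize(points, 0.05)
centroids = voxel_centroids(cells)
```

Only occupied cells are stored, which is the property that makes sparse processing of large scenes practical.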
Related papers
- DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data [50.164670363633704]
We present DIRECT-3D, a diffusion-based 3D generative model for creating high-quality 3D assets from text prompts.
Our model is directly trained on extensive noisy and unaligned 'in-the-wild' 3D assets.
We achieve state-of-the-art performance in both single-class generation and text-to-3D generation.
arXiv Detail & Related papers (2024-06-06T17:58:15Z) - VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking [78.25819070166351]
We propose VoxelNeXt for fully sparse 3D object detection.
Our core insight is to predict objects directly based on sparse voxel features, without relying on hand-crafted proxies.
Our strong sparse convolutional network VoxelNeXt detects and tracks 3D objects through voxel features entirely.
arXiv Detail & Related papers (2023-03-20T17:40:44Z) - TR3D: Towards Real-Time Indoor 3D Object Detection [6.215404942415161]
TR3D is a fully-convolutional 3D object detection model trained end-to-end.
To take advantage of both point cloud and RGB inputs, we introduce an early fusion of 2D and 3D features.
Our model with early feature fusion, which we refer to as TR3D+FF, outperforms existing 3D object detection approaches on the SUN RGB-D dataset.
arXiv Detail & Related papers (2023-02-06T15:25:50Z) - CAGroup3D: Class-Aware Grouping for 3D Object Detection on Point Clouds [55.44204039410225]
We present a novel two-stage fully sparse convolutional 3D object detection framework, named CAGroup3D.
Our proposed method first generates some high-quality 3D proposals by leveraging the class-aware local group strategy on the object surface voxels.
To recover the features of missed voxels due to incorrect voxel-wise segmentation, we build a fully sparse convolutional RoI pooling module.
arXiv Detail & Related papers (2022-10-09T13:38:48Z) - FGR: Frustum-Aware Geometric Reasoning for Weakly Supervised 3D Vehicle Detection [81.79171905308827]
We propose frustum-aware geometric reasoning (FGR) to detect vehicles in point clouds without any 3D annotations.
Our method consists of two stages: coarse 3D segmentation and 3D bounding box estimation.
It is able to accurately detect objects in 3D space with only 2D bounding boxes and sparse point clouds.
arXiv Detail & Related papers (2021-05-17T07:29:55Z) - Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of effective samples is relatively small in the 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3D parameter changed in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
arXiv Detail & Related papers (2020-08-31T17:10:48Z) - DOPS: Learning to Detect 3D Objects and Predict their 3D Shapes [54.239416488865565]
We propose a fast single-stage 3D object detection method for LIDAR data.
The core novelty of our method is a fast, single-pass architecture that both detects objects in 3D and estimates their shapes.
We find that our proposed method achieves state-of-the-art results by 5% on object detection in ScanNet scenes, and it gets top results by 3.4% in the Waymo Open Dataset.
arXiv Detail & Related papers (2020-04-02T17:48:50Z) - SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint Estimation [3.1542695050861544]
Estimating 3D orientation and translation of objects is essential for infrastructure-less autonomous navigation and driving.
We propose a novel 3D object detection method, named SMOKE, that combines a single keypoint estimate with regressed 3D variables.
Despite its structural simplicity, our proposed SMOKE network outperforms all existing monocular 3D detection methods on the KITTI dataset.
arXiv Detail & Related papers (2020-02-24T08:15:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.