AVS-Net: Point Sampling with Adaptive Voxel Size for 3D Scene Understanding
- URL: http://arxiv.org/abs/2402.17521v3
- Date: Mon, 5 Aug 2024 03:16:03 GMT
- Title: AVS-Net: Point Sampling with Adaptive Voxel Size for 3D Scene Understanding
- Authors: Hongcheng Yang, Dingkang Liang, Dingyuan Zhang, Zhe Liu, Zhikang Zou, Xingyu Jiang, Yingying Zhu,
- Abstract summary: This paper presents an advanced sampler that achieves both high accuracy and efficiency.
We propose a Voxel Adaptation Module that adaptively adjusts voxel sizes with the reference of point-based downsampling ratio.
Compared to existing state-of-the-art methods, our approach achieves better accuracy on outdoor and indoor large-scale datasets.
- Score: 16.03214439663472
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The recent advancements in point cloud learning have enabled intelligent vehicles and robots to comprehend 3D environments better. However, processing large-scale 3D scenes remains a challenging problem, such that efficient downsampling methods play a crucial role in point cloud learning. Existing downsampling methods either require a huge computational burden or sacrifice fine-grained geometric information. For such purpose, this paper presents an advanced sampler that achieves both high accuracy and efficiency. The proposed method utilizes voxel centroid sampling as a foundation but effectively addresses the challenges regarding voxel size determination and the preservation of critical geometric cues. Specifically, we propose a Voxel Adaptation Module that adaptively adjusts voxel sizes with the reference of point-based downsampling ratio. This ensures that the sampling results exhibit a favorable distribution for comprehending various 3D objects or scenes. Meanwhile, we introduce a network compatible with arbitrary voxel sizes for sampling and feature extraction while maintaining high efficiency. The proposed approach is demonstrated with 3D object detection and 3D semantic segmentation. Compared to existing state-of-the-art methods, our approach achieves better accuracy on outdoor and indoor large-scale datasets, e.g. Waymo and ScanNet, with promising efficiency.
Related papers
- 3D Object Detection Combining Semantic and Geometric Features from Point
Clouds [19.127930862527666]
We propose a novel end-to-end two-stage 3D object detector named SGNet for point clouds scenes.
The VTPM is a Voxel-Point-Based Module that finally implements 3D object detection in point space.
As of September 19, 2021, for KITTI dataset, SGNet ranked 1st in 3D and BEV detection on cyclists with easy difficulty level, and 2nd in the 3D detection of moderate cyclists.
arXiv Detail & Related papers (2021-10-10T04:43:27Z) - DV-Det: Efficient 3D Point Cloud Object Detection with Dynamic
Voxelization [0.0]
We propose a novel two-stage framework for the efficient 3D point cloud object detection.
We parse the raw point cloud data directly in the 3D space yet achieve impressive efficiency and accuracy.
We highlight our KITTI 3D object detection dataset with 75 FPS and on Open dataset with 25 FPS inference speed with satisfactory accuracy.
arXiv Detail & Related papers (2021-07-27T10:07:39Z) - Learning Semantic Segmentation of Large-Scale Point Clouds with Random
Sampling [52.464516118826765]
We introduce RandLA-Net, an efficient and lightweight neural architecture to infer per-point semantics for large-scale point clouds.
The key to our approach is to use random point sampling instead of more complex point selection approaches.
Our RandLA-Net can process 1 million points in a single pass up to 200x faster than existing approaches.
arXiv Detail & Related papers (2021-07-06T05:08:34Z) - Shape Prior Non-Uniform Sampling Guided Real-time Stereo 3D Object
Detection [59.765645791588454]
Recently introduced RTS3D builds an efficient 4D Feature-Consistency Embedding space for the intermediate representation of object without depth supervision.
We propose a shape prior non-uniform sampling strategy that performs dense sampling in outer region and sparse sampling in inner region.
Our proposed method has 2.57% improvement on AP3d almost without extra network parameters.
arXiv Detail & Related papers (2021-06-18T09:14:55Z) - HVPR: Hybrid Voxel-Point Representation for Single-stage 3D Object
Detection [39.64891219500416]
3D object detection methods exploit either voxel-based or point-based features to represent 3D objects in a scene.
We introduce in this paper a novel single-stage 3D detection method having the merit of both voxel-based and point-based features.
arXiv Detail & Related papers (2021-04-02T06:34:49Z) - PLUME: Efficient 3D Object Detection from Stereo Images [95.31278688164646]
Existing methods tackle the problem in two steps: first depth estimation is performed, a pseudo LiDAR point cloud representation is computed from the depth estimates, and then object detection is performed in 3D space.
We propose a model that unifies these two tasks in the same metric space.
Our approach achieves state-of-the-art performance on the challenging KITTI benchmark, with significantly reduced inference time compared with existing methods.
arXiv Detail & Related papers (2021-01-17T05:11:38Z) - Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them, however, the probability of effective samples is relatively small in the 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3d parameter changed in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
arXiv Detail & Related papers (2020-08-31T17:10:48Z) - InfoFocus: 3D Object Detection for Autonomous Driving with Dynamic
Information Modeling [65.47126868838836]
We propose a novel 3D object detection framework with dynamic information modeling.
Coarse predictions are generated in the first stage via a voxel-based region proposal network.
Experiments are conducted on the large-scale nuScenes 3D detection benchmark.
arXiv Detail & Related papers (2020-07-16T18:27:08Z) - 3DSSD: Point-based 3D Single Stage Object Detector [61.67928229961813]
We present a point-based 3D single stage object detector, named 3DSSD, achieving a good balance between accuracy and efficiency.
Our method outperforms all state-of-the-art voxel-based single stage methods by a large margin, and has comparable performance to two stage point-based methods as well.
arXiv Detail & Related papers (2020-02-24T12:01:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.