3D Object Detection with Pointformer
- URL: http://arxiv.org/abs/2012.11409v1
- Date: Mon, 21 Dec 2020 15:12:54 GMT
- Title: 3D Object Detection with Pointformer
- Authors: Xuran Pan, Zhuofan Xia, Shiji Song, Li Erran Li, Gao Huang
- Abstract summary: We propose Pointformer, a Transformer backbone designed for 3D point clouds to learn features effectively.
A Local Transformer module is employed to model interactions among points in a local region, which learns context-dependent region features at an object level.
A Global Transformer is designed to learn context-aware representations at the scene level.
- Score: 29.935891419574602
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Feature learning for 3D object detection from point clouds is very
challenging due to the irregularity of 3D point cloud data. In this paper, we
propose Pointformer, a Transformer backbone designed for 3D point clouds to
learn features effectively. Specifically, a Local Transformer module is
employed to model interactions among points in a local region, which learns
context-dependent region features at an object level. A Global Transformer is
designed to learn context-aware representations at the scene level. To further
capture the dependencies among multi-scale representations, we propose
Local-Global Transformer to integrate local features with global features from
higher resolution. In addition, we introduce an efficient coordinate refinement
module to shift down-sampled points closer to object centroids, which improves
object proposal generation. We use Pointformer as the backbone for
state-of-the-art object detection models and demonstrate significant
improvements over original models on both indoor and outdoor datasets.
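As a rough illustration of the idea of modeling interactions among points in a local region, the sketch below groups points around a center and applies single-head scaled dot-product self-attention within that group. It is not the authors' implementation: the helper names (ball_query, self_attention, max_pool), the radius grouping, the identity query/key/value projections, and the max-pooling step are all assumptions chosen to keep the example small and dependency-free.

```python
import math

def ball_query(points, center, radius):
    """Return the points within `radius` of `center` (one local region)."""
    return [p for p in points if math.dist(p, center) <= radius]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(features):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the raw features
    (identity projections); a trained model would learn these.
    """
    d = len(features[0])
    out = []
    for q in features:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in features]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, features)) for i in range(d)])
    return out

def max_pool(features):
    """Collapse per-point features into one region feature (channel-wise max)."""
    return [max(col) for col in zip(*features)]

# Toy point cloud: three nearby points and one far-away outlier.
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.2, 0.0), (5.0, 5.0, 5.0)]
region = ball_query(points, center=(0.0, 0.0, 0.0), radius=0.5)
attended = self_attention([list(p) for p in region])
region_feature = max_pool(attended)
```

Here the point coordinates double as features, so each attended row is a convex combination of the coordinates in the region; a real backbone would use learned feature channels and projections instead.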
Related papers
- TransPose: 6D Object Pose Estimation with Geometry-Aware Transformer [16.674933679692728]
TransPose is a novel 6D pose framework that exploits a Transformer with a geometry-aware module to learn better point cloud feature representations.
TransPose achieves competitive results on three benchmark datasets.
(2023-10-25T01:24:12Z)
- APPT: Asymmetric Parallel Point Transformer for 3D Point Cloud Understanding [20.87092793669536]
Transformer-based networks have achieved impressive performance in 3D point cloud understanding.
To tackle these problems, we propose the Asymmetric Parallel Point Transformer (APPT).
APPT is able to capture features globally throughout the entire network while focusing on local-detailed features.
(2023-03-31T06:11:02Z)
- Local region-learning modules for point cloud classification [0.0]
We present two local region-learning modules that infer the appropriate shift for each center point and alter the radius of each local region.
We integrated both modules, independently and together, into the PointNet++ and PointCNN object classification architectures.
Our experiments on the ShapeNet dataset showed that the modules are also effective on 3D CAD models.
(2023-03-30T12:45:46Z)
- Hierarchical Point Attention for Indoor 3D Object Detection [111.04397308495618]
This work proposes two novel attention operations as generic hierarchical designs for point-based transformer detectors.
First, we propose Multi-Scale Attention (MS-A) that builds multi-scale tokens from a single-scale input feature to enable more fine-grained feature learning.
Second, we propose Size-Adaptive Local Attention (Local-A) with adaptive attention regions for localized feature aggregation within bounding box proposals.
(2023-01-06T18:52:12Z)
- Learning Object-level Point Augmentor for Semi-supervised 3D Object Detection [85.170578641966]
We propose an object-level point augmentor (OPA) that performs local transformations for semi-supervised 3D object detection.
In this way, the resultant augmentor is derived to emphasize object instances rather than irrelevant backgrounds.
Experiments on the ScanNet and SUN RGB-D datasets show that the proposed OPA performs favorably against the state-of-the-art methods.
(2022-12-19T06:56:14Z)
- LCPFormer: Towards Effective 3D Point Cloud Analysis via Local Context Propagation in Transformers [60.51925353387151]
We propose a novel module named Local Context Propagation (LCP) to exploit the message passing between neighboring local regions.
We use the overlap points of adjacent local regions as intermediaries, then re-weight the features of these shared points from different local regions before passing them to the next layers.
The proposed method is applicable to different tasks and outperforms various transformer-based methods in benchmarks including 3D shape classification and dense prediction tasks.
(2022-10-23T15:43:01Z)
- AGO-Net: Association-Guided 3D Point Cloud Object Detection Network [86.10213302724085]
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
(2022-08-24T16:54:38Z)
- RBGNet: Ray-based Grouping for 3D Object Detection [104.98776095895641]
We propose the RBGNet framework, a voting-based 3D detector for accurate 3D object detection from point clouds.
We propose a ray-based feature grouping module, which aggregates the point-wise features on object surfaces using a group of determined rays.
Our model achieves state-of-the-art 3D detection performance on ScanNet V2 and SUN RGB-D with remarkable performance gains.
(2022-04-05T14:42:57Z)
- SASA: Semantics-Augmented Set Abstraction for Point-based 3D Object Detection [78.90102636266276]
We propose a novel set abstraction method named Semantics-Augmented Set Abstraction (SASA).
Based on the estimated point-wise foreground scores, we then propose a semantics-guided point sampling algorithm to help retain more important foreground points during down-sampling.
In practice, SASA is shown to be effective at identifying valuable points related to foreground objects and at improving feature learning for point-based 3D detection.
(2022-01-06T08:54:47Z)
- CpT: Convolutional Point Transformer for 3D Point Cloud Processing [10.389972581905]
We present CpT: Convolutional Point Transformer, a novel deep learning architecture for dealing with the unstructured nature of 3D point cloud data.
CpT is an improvement over existing attention-based Convolutional Neural Networks as well as previous 3D point cloud processing transformers.
Our model can serve as an effective backbone for various point cloud processing tasks when compared to the existing state-of-the-art approaches.
(2021-11-21T17:45:55Z)
- LATFormer: Locality-Aware Point-View Fusion Transformer for 3D Shape Recognition [38.540048855119004]
We propose a novel Locality-Aware Point-View Fusion Transformer (LATFormer) for 3D shape retrieval and classification.
The core component of LATFormer is a module named Locality-Aware Fusion (LAF) which integrates the local features of correlated regions across the two modalities.
In our LATFormer, we utilize the LAF module to fuse the multi-scale features of the two modalities both bidirectionally and hierarchically to obtain more informative features.
(2021-09-03T03:23:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.