EipFormer: Emphasizing Instance Positions in 3D Instance Segmentation
- URL: http://arxiv.org/abs/2312.05602v1
- Date: Sat, 9 Dec 2023 16:08:47 GMT
- Title: EipFormer: Emphasizing Instance Positions in 3D Instance Segmentation
- Authors: Mengnan Zhao, Lihe Zhang, Yuqiu Kong and Baocai Yin
- Abstract summary: We present a novel Transformer-based architecture, EipFormer, which comprises progressive aggregation and dual position embedding.
EipFormer achieves superior or comparable performance compared to state-of-the-art approaches.
- Score: 51.996943482875366
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D instance segmentation plays a crucial role in comprehending 3D scenes.
Despite recent advancements in this field, existing approaches exhibit certain
limitations. These methods often rely on fixed instance positions obtained from
sampled representative points in vast 3D point clouds, using center prediction
or farthest point sampling. However, these selected positions may deviate from
actual instance centers, posing challenges in precisely grouping instances.
Moreover, the common practice of grouping candidate instances from a single
type of coordinates introduces difficulties in identifying neighboring
instances or incorporating edge points. To tackle these issues, we present a
novel Transformer-based architecture, EipFormer, which comprises progressive
aggregation and dual position embedding. The progressive aggregation mechanism
leverages instance positions to refine instance proposals. It enhances the
initial instance positions through weighted farthest point sampling and further
refines the instance positions and proposals using aggregation averaging and
center matching. Additionally, dual position embedding superposes the original
and centralized position embeddings, thereby enhancing the model performance in
distinguishing adjacent instances. Extensive experiments on popular datasets
demonstrate that EipFormer achieves superior or comparable performance compared
to state-of-the-art approaches.
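The weighted farthest point sampling step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the weights enter the selection, so the scoring rule below (per-point weight multiplied by the distance to the nearest already-selected point) is an assumption, and the function name `weighted_fps` and its parameters are hypothetical. It is not the authors' implementation.

```python
import math


def weighted_fps(points, weights, k):
    """Greedily pick k point indices by weighted farthest point sampling.

    Assumption (not from the paper): a candidate's score is
    weights[i] * (distance from point i to its nearest selected point).
    """
    n = len(points)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and len(points)")

    # Seed with the highest-weight point (a deterministic, assumed choice).
    first = max(range(n), key=lambda i: weights[i])
    selected = [first]
    chosen = {first}

    # min_dist[i] holds the distance from point i to its nearest selected point.
    min_dist = [math.dist(p, points[first]) for p in points]

    while len(selected) < k:
        # Among unselected points, take the one with the largest weighted distance.
        nxt = max(
            (i for i in range(n) if i not in chosen),
            key=lambda i: weights[i] * min_dist[i],
        )
        selected.append(nxt)
        chosen.add(nxt)
        # Refresh nearest-selected distances with the newly added point.
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))

    return selected


# Toy scene: two points near the origin and two points roughly 10 units away.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 0.0, 0.0), (10.0, 1.0, 0.0)]
print(weighted_fps(points, [1.0, 0.5, 0.2, 0.9], 2))
```

With uniform weights this reduces to ordinary farthest point sampling. Non-uniform weights bias the selection toward points with higher weight, for example points a network predicts to lie near an instance center.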
Related papers
- Instance-free Text to Point Cloud Localization with Relative Position Awareness [37.22900045434484]
Text-to-point-cloud cross-modal localization is an emerging vision-language task critical for future robot-human collaboration.
We address two key limitations of existing approaches: 1) their reliance on ground-truth instances as input; and 2) their neglect of the relative positions among potential instances.
Our proposed model follows a two-stage pipeline, including a coarse stage for text-cell retrieval and a fine stage for position estimation.
arXiv Detail & Related papers (2024-04-27T09:46:49Z)
- AutoInst: Automatic Instance-Based Segmentation of LiDAR 3D Scans [41.17467024268349]
Making sense of 3D environments requires fine-grained scene understanding.
We propose to predict instance segmentations for 3D scenes in an unsupervised way.
Our approach attains 13.3% higher Average Precision and 9.1% higher F1 score compared to the best-performing baseline.
arXiv Detail & Related papers (2024-03-24T22:53:16Z)
- Position-Guided Point Cloud Panoptic Segmentation Transformer [118.17651196656178]
This work begins by applying the DETR-style learnable-query paradigm to LiDAR-based point cloud segmentation and obtains a simple yet effective baseline.
We observe that instances in sparse point clouds are small relative to the whole scene and often share similar geometry while lacking a distinctive appearance for segmentation, a situation that is rare in the image domain.
The method, named Position-guided Point cloud Panoptic segmentation transFormer (P3Former), outperforms previous state-of-the-art methods by 3.4% and 1.2% on the SemanticKITTI and nuScenes benchmarks, respectively.
arXiv Detail & Related papers (2023-03-23T17:59:02Z)
- Collaborative Propagation on Multiple Instance Graphs for 3D Instance Segmentation with Single-point Supervision [63.429704654271475]
We propose a novel weakly supervised method, RWSeg, that requires only a single labeled point per object.
With these sparse weak labels, we introduce a unified framework with two branches to propagate semantic and instance information.
Specifically, we propose a Cross-graph Competing Random Walks (CRW) algorithm that encourages competition among different instance graphs.
arXiv Detail & Related papers (2022-08-10T02:14:39Z)
- PointInst3D: Segmenting 3D Instances by Points [136.7261709896713]
We propose a fully-convolutional 3D point cloud instance segmentation method that works in a per-point prediction fashion.
We find the key to its success is assigning a suitable target to each sampled point.
Our approach achieves promising results on both ScanNet and S3DIS benchmarks.
arXiv Detail & Related papers (2022-04-25T02:41:46Z)
- DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution [136.7261709896713]
We propose a data-driven approach that generates the appropriate convolution kernels to apply in response to the nature of the instances.
The proposed method achieves promising results on both ScanNetV2 and S3DIS.
It also improves inference speed by more than 25% over the current state-of-the-art.
arXiv Detail & Related papers (2020-11-26T14:56:57Z)
- Point-Set Anchors for Object Detection, Instance Segmentation and Pose Estimation [85.96410825961966]
We argue that the image features extracted at a central point contain limited information for predicting distant keypoints or bounding box boundaries.
To facilitate inference, we propose to instead perform regression from a set of points placed at more advantageous positions.
We apply this proposed framework, called Point-Set Anchors, to object detection, instance segmentation, and human pose estimation.
arXiv Detail & Related papers (2020-07-06T15:59:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.