SqueezeSegV3: Spatially-Adaptive Convolution for Efficient Point-Cloud
Segmentation
- URL: http://arxiv.org/abs/2004.01803v2
- Date: Tue, 13 Apr 2021 09:42:51 GMT
- Title: SqueezeSegV3: Spatially-Adaptive Convolution for Efficient Point-Cloud
Segmentation
- Authors: Chenfeng Xu, Bichen Wu, Zining Wang, Wei Zhan, Peter Vajda, Kurt
Keutzer, Masayoshi Tomizuka
- Abstract summary: For large-scale point cloud segmentation, the de facto method is to project a 3D point cloud to get a 2D LiDAR image and use convolutions to process it.
We propose Spatially-Adaptive Convolution (SAC) to adopt different filters for different locations according to the input image.
SAC can be computed efficiently since it can be implemented as a series of element-wise multiplications, im2col, and standard convolution.
- Score: 66.49351944322835
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: LiDAR point-cloud segmentation is an important problem for many applications.
For large-scale point cloud segmentation, the \textit{de facto} method is to
project a 3D point cloud to get a 2D LiDAR image and use convolutions to
process it. Despite the similarity between regular RGB and LiDAR images, we
discover that the feature distribution of LiDAR images changes drastically at
different image locations. Using standard convolutions to process such LiDAR
images is problematic, as convolution filters pick up local features that are
only active in specific regions in the image. As a result, the capacity of the
network is under-utilized and the segmentation performance decreases. To fix
this, we propose Spatially-Adaptive Convolution (SAC) to adopt different
filters for different locations according to the input image. SAC can be
computed efficiently since it can be implemented as a series of element-wise
multiplications, im2col, and standard convolution. It is a general framework
such that several previous methods can be seen as special cases of SAC. Using
SAC, we build SqueezeSegV3 for LiDAR point-cloud segmentation and outperform
all previously published methods by at least 3.7% mIoU on the SemanticKITTI
benchmark with comparable inference speed.
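The abstract describes SAC as a combination of element-wise multiplications, im2col, and a standard convolution. As a rough illustration of that decomposition (not the authors' implementation: the function names, the attention map, and the single-attention-map "SAC-S"-like variant are all assumptions here), a minimal NumPy sketch might look like:

```python
import numpy as np

def im2col(x, k):
    """Extract k x k patches from a channel-first image (C, H, W),
    zero-padded so the output spatial size matches the input."""
    c, h, w = x.shape
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    cols = np.empty((c * k * k, h * w))
    idx = 0
    for ci in range(c):
        for i in range(k):
            for j in range(k):
                cols[idx] = xp[ci, i:i + h, j:j + w].ravel()
                idx += 1
    return cols  # (C*k*k, H*W)

def sac(x, weight, attention):
    """Hypothetical spatially-adaptive convolution sketch:
    a per-location attention map scales the features element-wise,
    then a standard convolution is applied via im2col."""
    c_out, c_in, k, _ = weight.shape
    _, h, w = x.shape
    adapted = x * attention              # element-wise multiplication
    cols = im2col(adapted, k)            # (C_in*k*k, H*W)
    out = weight.reshape(c_out, -1) @ cols  # standard convolution as a matmul
    return out.reshape(c_out, h, w)
```

With an all-ones attention map this reduces to an ordinary convolution; a location-dependent attention map (in the paper, predicted from the raw LiDAR input) makes the effective filter differ per pixel while reusing the standard conv machinery.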
Related papers
- Spherical Frustum Sparse Convolution Network for LiDAR Point Cloud Semantic Segmentation [62.258256483231484]
LiDAR point cloud semantic segmentation enables robots to obtain fine-grained semantic information about their surrounding environment.
Many works project the point cloud onto the 2D image and adopt the 2D Convolutional Neural Networks (CNNs) or vision transformer for LiDAR point cloud semantic segmentation.
In this paper, we propose a novel spherical frustum structure to avoid quantized information loss.
arXiv Detail & Related papers (2023-11-29T09:55:13Z)
- LiDAR-Camera Panoptic Segmentation via Geometry-Consistent and Semantic-Aware Alignment [63.83894701779067]
We propose LCPS, the first LiDAR-Camera Panoptic network.
In our approach, we conduct LiDAR-Camera fusion in three stages.
Our fusion strategy improves PQ by about 6.9% over the LiDAR-only baseline on the NuScenes dataset.
arXiv Detail & Related papers (2023-08-03T10:57:58Z)
- Quadric Representations for LiDAR Odometry, Mapping and Localization [93.24140840537912]
Current LiDAR odometry, mapping and localization methods leverage point-wise representations of 3D scenes.
We propose a novel method of describing scenes using quadric surfaces, which are far more compact representations of 3D objects.
Our method maintains low latency and memory utility while achieving competitive, and even superior, accuracy.
arXiv Detail & Related papers (2023-04-27T13:52:01Z)
- (LC)$^2$: LiDAR-Camera Loop Constraints For Cross-Modal Place Recognition [0.9449650062296824]
We propose a novel cross-matching method, called (LC)$^2$, for achieving LiDAR localization without a prior point cloud map.
The network is trained to extract localization descriptors from disparity and range images.
We demonstrate that LiDAR-based navigation systems could be optimized from image databases and vice versa.
arXiv Detail & Related papers (2023-04-17T23:20:16Z)
- High-fidelity Pseudo-labels for Boosting Weakly-Supervised Segmentation [17.804090651425955]
Image-level weakly-supervised segmentation (WSSS) reduces the typically vast data-annotation cost by using surrogate segmentation masks during training.
Our work is based on two techniques for improving CAMs; importance sampling, which is a substitute for GAP, and the feature similarity loss.
We reformulate both techniques based on binomial posteriors of multiple independent binary problems.
This has two benefits: their performance is improved and they become more general, resulting in an add-on method that can boost virtually any WSSS method.
arXiv Detail & Related papers (2023-04-05T17:43:57Z)
- Focal Sparse Convolutional Networks for 3D Object Detection [121.45950754511021]
We introduce two new modules to enhance the capability of Sparse CNNs.
They are focal sparse convolution (Focals Conv) and its multi-modal variant of focal sparse convolution with fusion.
For the first time, we show that spatially learnable sparsity in sparse convolution is essential for sophisticated 3D object detection.
arXiv Detail & Related papers (2022-04-26T17:34:10Z)
- Meta-RangeSeg: LiDAR Sequence Semantic Segmentation Using Multiple Feature Aggregation [21.337629798133324]
We propose a novel approach to semantic segmentation for LiDAR sequences named Meta-RangeSeg.
A novel range residual image representation is introduced to capture the spatial-temporal information.
An efficient U-Net backbone is used to obtain the multi-scale features.
arXiv Detail & Related papers (2022-02-27T14:46:13Z)
- Semi-Local Convolutions for LiDAR Scan Processing [0.42970700836450487]
A number of applications, such as mobile robots or automated vehicles, use LiDAR sensors to obtain detailed information about their surroundings.
Many methods use image-like projections to efficiently process these LiDAR measurements and use deep convolutional neural networks to predict semantic classes for each point in the scan.
We propose semi-local convolution (SLC), a convolution layer with a reduced amount of weight-sharing along the vertical dimension.
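Reduced weight-sharing along the vertical dimension can be pictured as splitting the range image into horizontal bands, each with its own filter. The sketch below is a guess at that idea, not the paper's layer: the band partitioning, the 1x1 filters, and all names are assumptions for illustration.

```python
import numpy as np

def semi_local_conv(x, weights):
    """Hypothetical semi-local convolution sketch: the (C, H, W) image is
    split into horizontal bands along the vertical axis, and each band is
    convolved with its own 1x1 filter bank instead of one shared filter."""
    c_in, h, w = x.shape
    n_bands, c_out, _ = weights.shape   # weights: (bands, C_out, C_in)
    band_h = h // n_bands               # assumes H divisible by n_bands
    out = np.empty((c_out, h, w))
    for b in range(n_bands):
        rows = slice(b * band_h, (b + 1) * band_h)
        # standard 1x1 convolution within the band; weights differ per band
        out[:, rows, :] = np.einsum('oc,chw->ohw', weights[b], x[:, rows, :])
    return out
```

A fully spatially-adaptive layer would use one filter per row (or per pixel); banding trades some of that flexibility for far fewer parameters.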
arXiv Detail & Related papers (2021-11-30T18:09:43Z)
- Adaptive Graph Convolution for Point Cloud Analysis [25.175406613705274]
We propose Adaptive Graph Convolution (AdaptConv) which generates adaptive kernels for points according to their dynamically learned features.
Our method outperforms state-of-the-art point cloud classification and segmentation approaches on several benchmark datasets.
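The summary says AdaptConv generates adaptive kernels per point from dynamically learned features. One way to picture that (a loose sketch, not the paper's architecture: the tanh kernel generator, the neighbor averaging, and all names are assumptions) is a graph layer where each edge's kernel is produced from the feature difference between a point and its neighbor:

```python
import numpy as np

def adaptive_graph_conv(feats, neighbors, w_kernel):
    """Hypothetical adaptive graph convolution sketch: instead of a fixed
    kernel, each edge gets its own kernel generated from the feature
    difference between a point and its neighbor, then neighbor
    contributions are averaged."""
    n, c = feats.shape
    out = np.zeros((n, c))
    for i in range(n):
        for j in neighbors[i]:
            diff = feats[j] - feats[i]
            kernel = np.tanh(w_kernel @ diff)  # per-edge adaptive kernel
            out[i] += kernel * feats[j]
        out[i] /= max(len(neighbors[i]), 1)
    return out
```

The contrast with a fixed graph convolution is that `kernel` depends on the input features, so two edges with the same graph structure but different geometry are weighted differently.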
arXiv Detail & Related papers (2021-08-18T08:38:52Z)
- LiDAR-based Panoptic Segmentation via Dynamic Shifting Network [56.71765153629892]
LiDAR-based panoptic segmentation aims to parse both objects and scenes in a unified manner.
We propose the Dynamic Shifting Network (DS-Net), which serves as an effective panoptic segmentation framework in the point cloud realm.
Our proposed DS-Net achieves superior accuracies over current state-of-the-art methods.
arXiv Detail & Related papers (2020-11-24T08:44:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.