Efficient 3D Recognition with Event-driven Spike Sparse Convolution
- URL: http://arxiv.org/abs/2412.07360v2
- Date: Tue, 04 Feb 2025 02:52:37 GMT
- Title: Efficient 3D Recognition with Event-driven Spike Sparse Convolution
- Authors: Xuerui Qiu, Man Yao, Jieyuan Zhang, Yuhong Chou, Ning Qiao, Shibo Zhou, Bo Xu, Guoqi Li
- Abstract summary: Spiking Neural Networks (SNNs) provide an energy-efficient way to extract 3D spatio-temporal features.
We introduce the Spike Voxel Coding (SVC) scheme, which encodes the 3D point clouds into a sparse spike train space.
We propose a Spike Sparse Convolution (SSC) model for efficiently extracting 3D sparse point cloud features.
- Score: 15.20476631850388
- Abstract: Spiking Neural Networks (SNNs) provide an energy-efficient way to extract 3D spatio-temporal features. Point clouds are sparse 3D spatial data, which suggests that SNNs should be well-suited for processing them. However, when applied to point clouds, SNNs often exhibit limited performance and a narrow range of application scenarios. We attribute this to inappropriate preprocessing and feature extraction methods. To address this issue, we first introduce the Spike Voxel Coding (SVC) scheme, which encodes 3D point clouds into a sparse spike train space, reducing storage requirements and saving time on point cloud preprocessing. Then, we propose a Spike Sparse Convolution (SSC) model for efficiently extracting 3D sparse point cloud features. Combining SVC and SSC, we design an efficient 3D SNN backbone (E-3DSNN) that is friendly to neuromorphic hardware. For instance, SSC can be implemented on neuromorphic chips with only a minor modification to the addressing function of vanilla spike convolution. Experiments on the ModelNet40, KITTI, and Semantic KITTI datasets demonstrate that E-3DSNN achieves state-of-the-art (SOTA) results with remarkable efficiency. Notably, our E-3DSNN (1.87M parameters) obtains 91.7% top-1 accuracy on ModelNet40, surpassing the current best SNN baseline (14.3M parameters) by 3.0%. To the best of our knowledge, it is the first directly trained 3D SNN backbone that can simultaneously handle various 3D computer vision tasks (e.g., classification, detection, and segmentation) with an event-driven nature. Code is available: https://github.com/bollossom/E-3DSNN/.
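As a hedged illustration of how SVC and SSC fit together, here is a minimal single-channel sketch. It assumes a simple floor quantization for the voxel coding and a precomputed kernel-offset list for the convolution; the function names and details are ours, not the authors' released code. The property it demonstrates is the one the abstract emphasizes: with binary spike inputs, sparse convolution reduces to addressing and accumulating kernel weights, with no multiplications.

```python
import numpy as np

def spike_voxel_coding(points, voxel_size=0.05):
    """Quantize an (N, 3) point cloud to integer voxel coordinates; the
    occupied voxels form a sparse set of binary spike sites."""
    coords = np.unique(np.floor(points / voxel_size).astype(np.int64), axis=0)
    spikes = np.ones(len(coords), dtype=np.int8)   # one binary spike per voxel
    return coords, spikes

def spike_sparse_conv(coords, spikes, weights, offsets):
    """Event-driven sparse convolution: with binary spike inputs, each output
    is a sum of kernel weights addressed by active neighbors, so the whole
    operation is accumulate-only."""
    index = {tuple(c): i for i, c in enumerate(coords)}
    out = np.zeros(len(coords))
    for i, c in enumerate(coords):                 # outputs only at input sites
        for k, off in enumerate(offsets):
            j = index.get(tuple(c + off))
            if j is not None and spikes[j]:
                out[i] += weights[k]               # accumulate, never multiply
    return out

# Usage on a random cloud with a 3x3x3 kernel:
pts = np.random.rand(1000, 3)
coords, spikes = spike_voxel_coding(pts)
offsets = np.array([(dx, dy, dz) for dx in (-1, 0, 1)
                    for dy in (-1, 0, 1) for dz in (-1, 0, 1)])
feats = spike_sparse_conv(coords, spikes, np.random.randn(27), offsets)
```

Because only occupied voxels are visited and each contribution is a pure weight lookup, this style of kernel maps naturally onto event-driven hardware, which is presumably what the abstract means by modifying only the addressing function.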
Related papers
- Scaling Spike-driven Transformer with Efficient Spike Firing Approximation Training [17.193023656793464]
The ambition of brain-inspired Spiking Neural Networks (SNNs) is to become a low-power alternative to traditional Artificial Neural Networks (ANNs).
This work addresses two major challenges in realizing this vision: the performance gap between SNNs and ANNs, and the high training costs of SNNs.
We identify intrinsic flaws in spiking neurons caused by binary firing mechanisms and propose a Spike Firing Approximation (SFA) method using integer training and spike-driven inference.
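A hedged reading of that recipe, with assumed names and an assumed clip-and-round quantizer rather than the paper's actual code: train with small integer activation counts, then expand each count into an equivalent binary spike train for spike-driven inference.

```python
import torch

def integer_activation(x, T=4):
    """Training-time surrogate: clip-and-round to an integer count in [0, T]."""
    return torch.clamp(torch.round(x), 0, T)

def to_spike_train(a, T=4):
    """Inference-time expansion: an integer count a in [0, T] becomes a binary
    spike train of length T whose sum over time equals a."""
    t = torch.arange(T).view(-1, *([1] * a.dim()))
    return (t < a.unsqueeze(0)).to(torch.int8)   # shape (T, *a.shape)

x = torch.rand(3) * 4
a = integer_activation(x)
s = to_spike_train(a)
assert torch.equal(s.sum(0).float(), a)          # spike counts match integers
```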
arXiv Detail & Related papers (2024-11-25T03:05:41Z)
- LION: Linear Group RNN for 3D Object Detection in Point Clouds [85.97541374148508]
We propose a window-based framework built on LInear grOup RNN for accurate 3D object detection, called LION.
We introduce a 3D spatial feature descriptor and integrate it into the linear group RNN operators to enhance their spatial features.
To further address the challenge in highly sparse point clouds, we propose a 3D voxel generation strategy to densify foreground features.
arXiv Detail & Related papers (2024-07-25T17:50:32Z)
- ANN vs SNN: A case study for Neural Decoding in Implantable Brain-Machine Interfaces [0.7904805552920349]
In this work, we compare different neural networks (NN) for motor decoding in terms of accuracy and implementation cost.
We further show that combining traditional signal processing techniques with machine learning ones delivers surprisingly good performance even with simple NNs.
arXiv Detail & Related papers (2023-12-26T05:40:39Z)
- Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene Understanding [40.68012530554327]
We introduce a pretrained 3D backbone, called SST, for 3D indoor scene understanding.
We design a 3D Swin transformer as our backbone network, which enables efficient self-attention on sparse voxels with linear memory complexity.
A series of extensive ablation studies further validate the scalability, generality, and superior performance enabled by our approach.
arXiv Detail & Related papers (2023-04-14T02:49:08Z)
- MLGCN: An Ultra Efficient Graph Convolution Neural Model For 3D Point Cloud Analysis [4.947552172739438]
We introduce a novel Multi-level Graph Convolution Neural (MLGCN) model, which uses Graph Neural Networks (GNN) blocks to extract features from 3D point clouds at specific locality levels.
Our approach produces comparable results to those of state-of-the-art models while requiring up to a thousand times fewer floating-point operations (FLOPs) and having significantly reduced storage requirements.
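As a rough sketch of the kind of GNN block the summary describes (the layout, aggregation rule, and names here are assumptions for illustration, not the MLGCN architecture): build a k-nearest-neighbor graph over the points at a given locality level, mean-aggregate each neighborhood, and mix channels with a linear layer.

```python
import torch
import torch.nn as nn

class KNNGraphConv(nn.Module):
    """Illustrative kNN graph-convolution block: mean-aggregate each point's
    k nearest neighbors (self included, at distance zero), then mix channels."""
    def __init__(self, c_in, c_out, k=8):
        super().__init__()
        self.k, self.lin = k, nn.Linear(c_in, c_out)

    def forward(self, x, pts):
        # x: (N, C) point features, pts: (N, 3) coordinates
        idx = torch.cdist(pts, pts).topk(self.k, largest=False).indices
        return torch.relu(self.lin(x[idx].mean(dim=1)))

x, pts = torch.randn(1024, 32), torch.rand(1024, 3)
y = KNNGraphConv(32, 64)(x, pts)                 # (1024, 64)
```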
arXiv Detail & Related papers (2023-03-31T00:15:22Z)
- Using a Waffle Iron for Automotive Point Cloud Semantic Segmentation [66.6890991207065]
Sparse 3D convolutions have become the de facto tool for constructing deep neural networks for 3D perception.
We propose an alternative method that reaches the level of state-of-the-art methods without requiring sparse convolutions.
We show that such a level of performance is achievable by relying on tools a priori unfit for large-scale, high-performing 3D perception.
arXiv Detail & Related papers (2023-01-24T16:10:08Z)
- Training High-Performance Low-Latency Spiking Neural Networks by Differentiation on Spike Representation [70.75043144299168]
Spiking Neural Network (SNN) is a promising energy-efficient AI model when implemented on neuromorphic hardware.
It is a challenge to efficiently train SNNs due to their non-differentiability.
We propose the Differentiation on Spike Representation (DSR) method, which achieves high performance with low latency.
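For context on the non-differentiability mentioned above, the sketch below shows the standard surrogate-gradient workaround (a generic, well-known technique, not the DSR method itself): the forward pass fires a hard binary spike, while the backward pass substitutes a rectangular window around the threshold.

```python
import torch

class SpikeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, v, threshold=1.0):
        ctx.save_for_backward(v)
        ctx.threshold = threshold
        return (v >= threshold).float()          # binary spike, zero gradient a.e.

    @staticmethod
    def backward(ctx, grad_out):
        (v,) = ctx.saved_tensors
        # Rectangular surrogate: pass gradient only near the threshold.
        mask = (v - ctx.threshold).abs() < 0.5
        return grad_out * mask.float(), None     # no gradient for the threshold

v = torch.randn(8, requires_grad=True)
SpikeSTE.apply(v).sum().backward()               # v.grad is now populated
```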
arXiv Detail & Related papers (2022-05-01T12:44:49Z)
- Focal Sparse Convolutional Networks for 3D Object Detection [121.45950754511021]
We introduce two new modules to enhance the capability of Sparse CNNs.
They are focal sparse convolution (Focals Conv) and its multi-modal variant, focal sparse convolution with fusion.
For the first time, we show that spatially learnable sparsity in sparse convolution is essential for sophisticated 3D object detection.
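One plausible reading of "spatially learnable sparsity", offered purely as a hypothetical sketch rather than the Focals Conv implementation: a learned per-voxel importance score decides which sites keep submanifold behavior (outputs only at input positions) and which are allowed to dilate into their neighborhood.

```python
import numpy as np

def select_output_sites(coords, importance, tau=0.5):
    """coords: (N, 3) integer voxel coordinates; importance: (N,) learned
    scores. Unimportant sites emit outputs only at themselves; important
    sites also emit outputs at their 6-neighborhood."""
    sites = {tuple(c) for c in coords}
    for c, s in zip(coords, importance):
        if s > tau:                              # learned, spatially varying sparsity
            for off in [(1,0,0), (-1,0,0), (0,1,0),
                        (0,-1,0), (0,0,1), (0,0,-1)]:
                sites.add(tuple(c + np.array(off)))
    return np.array(sorted(sites))
```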
arXiv Detail & Related papers (2022-04-26T17:34:10Z)
- 3D CNNs with Adaptive Temporal Feature Resolutions [83.43776851586351]
The Similarity Guided Sampling (SGS) module can be plugged into any existing 3D CNN architecture.
SGS empowers 3D CNNs by learning the similarity of temporal features and grouping similar features together.
Our evaluations show that the proposed module improves the state-of-the-art by reducing the computational cost (GFLOPs) by half while preserving or even improving the accuracy.
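A hypothetical sketch of the grouping idea, using a fixed cosine-similarity threshold over adjacent frames (the actual SGS module learns the similarity; names here are ours):

```python
import torch
import torch.nn.functional as F

def group_similar_frames(feats, thresh=0.9):
    """feats: (T, C) per-frame features; returns (T', C) with T' <= T by
    averaging runs of adjacent, mutually similar frames into one slot."""
    groups, current = [], [feats[0]]
    for t in range(1, feats.size(0)):
        if F.cosine_similarity(feats[t], current[-1], dim=0) > thresh:
            current.append(feats[t])                     # extend the group
        else:
            groups.append(torch.stack(current).mean(0))  # close the group
            current = [feats[t]]
    groups.append(torch.stack(current).mean(0))
    return torch.stack(groups)

frames = torch.randn(16, 128)
pooled = group_similar_frames(frames)                    # (T', 128), T' <= 16
```

Shrinking the temporal axis this way is what reduces GFLOPs in all downstream layers while keeping redundant frames represented.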
arXiv Detail & Related papers (2020-11-17T14:34:05Z)
- PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection [76.30585706811993]
We present a novel, high-performance 3D object detection framework named PointVoxel-RCNN (PV-RCNN).
Our proposed method deeply integrates both 3D voxel Convolutional Neural Network (CNN) and PointNet-based set abstraction.
It takes advantage of the efficient learning and high-quality proposals of the 3D voxel CNN and the flexible receptive fields of the PointNet-based networks.
arXiv Detail & Related papers (2019-12-31T06:34:10Z)