A Near Sensor Edge Computing System for Point Cloud Semantic
Segmentation
- URL: http://arxiv.org/abs/2207.05888v1
- Date: Tue, 12 Jul 2022 23:32:11 GMT
- Title: A Near Sensor Edge Computing System for Point Cloud Semantic
Segmentation
- Authors: Lin Bai, Yiming Zhao and Xinming Huang
- Abstract summary: We propose a lightweight point cloud semantic segmentation network based on the range view.
Our network achieves 10 frames per second (fps) on a Xilinx DPU with a computation efficiency of 42.5 GOP/W.
- Score: 12.997562735505364
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Point cloud semantic segmentation has attracted attention due to its
robustness to lighting conditions, which makes it an ideal semantic solution for
autonomous driving. However, considering the large computation burden and
bandwidth demands of neural networks, putting all of the computing into the
vehicle's Electronic Control Unit (ECU) is neither efficient nor practical. In
this paper, we propose a lightweight point cloud semantic segmentation network
based on the range view. Thanks to its simple pre-processing and standard
convolutions, it runs efficiently on deep learning accelerators such as the DPU.
Furthermore, a near-sensor computing system is built for autonomous vehicles. In
this system, an FPGA-based deep learning accelerator core (DPU) is placed next
to the LiDAR sensor to perform point cloud pre-processing and run the
segmentation neural network. By leaving only the post-processing step to the
ECU, this solution greatly alleviates the computation burden of the ECU and
consequently shortens decision-making and vehicle reaction latency. Our semantic
segmentation network achieves 10 frames per second (fps) on a Xilinx DPU with a
computation efficiency of 42.5 GOP/W.
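As a rough illustration of the "simple pre-processing" the abstract refers to, the sketch below shows a typical spherical (range-view) projection of a LiDAR scan into a dense image that standard 2D convolutions can consume, plus the back-projection of per-pixel labels to points from which an ECU-side post-processing step would start. The image resolution, vertical field of view, and channel layout are illustrative assumptions (roughly a 64-beam sensor), not values taken from the paper.

```python
import numpy as np

def spherical_projection(points, H=64, W=2048, fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project an (N, 4) LiDAR scan (x, y, z, remission) onto an H x W range image."""
    x, y, z, remission = points[:, 0], points[:, 1], points[:, 2], points[:, 3]
    depth = np.linalg.norm(points[:, :3], axis=1) + 1e-8

    yaw = np.arctan2(y, x)            # azimuth angle
    pitch = np.arcsin(z / depth)      # elevation angle

    fov_up, fov_down = np.radians(fov_up_deg), np.radians(fov_down_deg)
    fov = fov_up - fov_down

    # Map angles to pixel coordinates (column from azimuth, row from elevation).
    u = np.clip(np.floor(0.5 * (1.0 - yaw / np.pi) * W), 0, W - 1).astype(np.int32)
    v = np.clip(np.floor((1.0 - (pitch - fov_down) / fov) * H), 0, H - 1).astype(np.int32)

    # 5-channel network input: range, x, y, z, remission.
    image = np.zeros((5, H, W), dtype=np.float32)
    order = np.argsort(depth)[::-1]   # write far points first so near points win
    image[:, v[order], u[order]] = np.stack(
        [depth[order], x[order], y[order], z[order], remission[order]])
    return image, u, v

def backproject_labels(label_image, u, v):
    """Map per-pixel class predictions back to the original points (post-processing)."""
    return label_image[v, u]
```

In a near-sensor setup of this kind, the projection and the network inference would run on the accelerator next to the LiDAR, while only the lightweight label back-projection (and any refinement) would remain on the ECU.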
Related papers
- Latency optimized Deep Neural Networks (DNNs): An Artificial Intelligence approach at the Edge using Multiprocessor System on Chip (MPSoC) [1.949471382288103]
Edge computing (AI at the Edge) on mobile devices is one approach to meeting such latency requirements.
In this work, the possibilities and challenges of implementing a low-latency and power-optimized smart mobile system are examined.
Various performance aspects and implementation feasibilities of Neural Networks (NNs) on embedded FPGA edge devices are discussed.
arXiv Detail & Related papers (2024-07-16T11:51:41Z) - Implementation of a perception system for autonomous vehicles using a
detection-segmentation network in SoC FPGA [0.0]
We have used the MultiTaskV3 detection-segmentation network as the basis for a perception system that can perform both functionalities within a single architecture.
The whole system consumes relatively little power compared to a CPU-based implementation.
It also achieves an accuracy higher than 97% mAP for object detection and above 90% mIoU for image segmentation.
arXiv Detail & Related papers (2023-07-17T17:44:18Z) - Analyzing Deep Learning Representations of Point Clouds for Real-Time
In-Vehicle LiDAR Perception [2.365702128814616]
We propose a novel computational taxonomy of LiDAR point cloud representations used in modern deep neural networks for 3D point cloud processing.
Thereby, we uncover common advantages and limitations in terms of computational efficiency, memory requirements, and representational capacity.
arXiv Detail & Related papers (2022-10-26T10:39:59Z) - Efficient Dataset Distillation Using Random Feature Approximation [109.07737733329019]
We propose a novel algorithm that uses a random feature approximation (RFA) of the Neural Network Gaussian Process (NNGP) kernel.
Our algorithm provides at least a 100-fold speedup over KIP and can run on a single GPU.
Our new method, termed an RFA Distillation (RFAD), performs competitively with KIP and other dataset condensation algorithms in accuracy over a range of large-scale datasets.
arXiv Detail & Related papers (2022-10-21T15:56:13Z) - An Adaptive Device-Edge Co-Inference Framework Based on Soft
Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially on Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL) approach, Soft Actor-Critic for discrete (SAC-d), which generates the exit point and compressing bits by soft policy iterations.
Based on the latency- and accuracy-aware reward design, such a co-inference framework can adapt well to complex environments like dynamic wireless channels and arbitrary processing, and is capable of supporting the 5G URL
arXiv Detail & Related papers (2022-01-09T09:31:50Z) - An Efficient Deep Learning Approach Using Improved Generative
Adversarial Networks for Incomplete Information Completion of Self-driving [2.8504921333436832]
We propose an efficient deep learning approach to repair incomplete vehicle point clouds accurately and efficiently in autonomous driving.
The improved PF-Net achieves speedups of over 19x with almost the same accuracy as the original PF-Net.
arXiv Detail & Related papers (2021-09-01T08:06:23Z) - Multi-scale Interaction for Real-time LiDAR Data Segmentation on an
Embedded Platform [62.91011959772665]
Real-time semantic segmentation of LiDAR data is crucial for autonomously driving vehicles.
Current approaches that operate directly on the point cloud use complex spatial aggregation operations.
We propose a projection-based method, called Multi-scale Interaction Network (MINet), which is very efficient and accurate.
arXiv Detail & Related papers (2020-08-20T19:06:11Z) - Binary DAD-Net: Binarized Driveable Area Detection Network for
Autonomous Driving [94.40107679615618]
This paper proposes a novel binarized driveable area detection network (binary DAD-Net).
It uses only binary weights and activations in the encoder, the bottleneck, and the decoder part.
It outperforms state-of-the-art semantic segmentation networks on public datasets.
arXiv Detail & Related papers (2020-06-15T07:09:01Z) - One-step regression and classification with crosspoint resistive memory
arrays [62.997667081978825]
High-speed, low-energy computing machines are in demand to enable real-time artificial intelligence at the edge.
One-step learning is supported by simulations of Boston house price prediction and the training of a 2-layer neural network for MNIST digit recognition.
Results are all obtained in one computational step, thanks to the physical, parallel, and analog computing within the crosspoint array.
arXiv Detail & Related papers (2020-05-05T08:00:07Z) - Key Points Estimation and Point Instance Segmentation Approach for Lane
Detection [65.37887088194022]
We propose a traffic line detection method called Point Instance Network (PINet).
The PINet includes several stacked hourglass networks that are trained simultaneously.
The PINet achieves competitive accuracy and false positive rates on the TuSimple and CULane datasets.
arXiv Detail & Related papers (2020-02-16T15:51:30Z)