LRNNet: A Light-Weighted Network with Efficient Reduced Non-Local
Operation for Real-Time Semantic Segmentation
- URL: http://arxiv.org/abs/2006.02706v1
- Date: Thu, 4 Jun 2020 08:55:15 GMT
- Title: LRNNet: A Light-Weighted Network with Efficient Reduced Non-Local
Operation for Real-Time Semantic Segmentation
- Authors: Weihao Jiang and Zhaozhi Xie and Yaoyi Li and Chang Liu and Hongtao Lu
- Abstract summary: This paper introduces a light-weighted network with an efficient reduced non-local module (LRNNet) for efficient and real-time semantic segmentation.
Experiments demonstrate a superior trade-off among model size, speed, computation, and accuracy.
- Score: 15.010572800399057
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The recent development of light-weighted neural networks has promoted the
application of deep learning under resource constraints and in mobile
applications. Many of these applications need to perform real-time and
efficient semantic segmentation with a light-weighted network. This paper
introduces a light-weighted network with an efficient reduced non-local module
(LRNNet) for efficient and real-time semantic segmentation. We propose a
factorized convolutional block in a ResNet-style encoder to achieve lighter,
more efficient, and more powerful feature extraction. Meanwhile, our proposed
reduced non-local module utilizes spatial regional dominant singular vectors to
achieve reduced and more representative non-local feature integration with much
lower computation and memory cost. Experiments demonstrate a superior trade-off
among model size, speed, computation, and accuracy. Without additional
processing or pretraining, LRNNet achieves 72.2% mIoU on the Cityscapes test
set using only the fine annotations for training, with only 0.68M parameters
and at 71 FPS on a GTX 1080Ti card.
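The core operation of LRNNet can be pictured with a short sketch. The block below is not the authors' implementation; it is a minimal PyTorch illustration, with assumed module names, shapes and region grid, of the general idea of a reduced non-local operation: keys and values are summarized by one dominant singular vector per spatial region, so attention is computed against S*S regional descriptors instead of all H*W positions.

```python
# Minimal sketch of a reduced non-local block (illustrative, not the paper's code).
# Keys/values are compressed to one dominant singular vector per spatial region.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ReducedNonLocal2d(nn.Module):
    def __init__(self, channels: int, regions: int = 8):
        super().__init__()
        self.regions = regions                        # regions x regions grid over the feature map
        self.query = nn.Conv2d(channels, channels, kernel_size=1)
        self.key = nn.Conv2d(channels, channels, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)

    def _dominant_vectors(self, x: torch.Tensor) -> torch.Tensor:
        # (B, C, H, W) -> one dominant left singular vector per spatial region: (B, S*S, C)
        b, c, _, _ = x.shape
        s = self.regions
        x = F.adaptive_avg_pool2d(x, (s * 4, s * 4))          # coarse grid: 4x4 cells per region
        x = x.reshape(b, c, s, 4, s, 4).permute(0, 2, 4, 1, 3, 5)
        x = x.reshape(b, s * s, c, 16)                        # each region as a C x 16 matrix
        u, _, _ = torch.linalg.svd(x, full_matrices=False)    # rank-1 summary of each region
        return u[..., 0]

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)          # (B, H*W, C) full-resolution queries
        k = self._dominant_vectors(self.key(x))               # (B, S*S, C) reduced keys
        v = self._dominant_vectors(self.value(x))             # (B, S*S, C) reduced values
        attn = torch.softmax(q @ k.transpose(1, 2) / c ** 0.5, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return x + out                                        # residual, as in standard non-local blocks

y = ReducedNonLocal2d(128)(torch.randn(2, 128, 64, 128))      # usage: y keeps the input shape
```

With S*S regional descriptors, the attention cost scales as O(HW * S^2) rather than O((HW)^2); the paper's actual module, its region layout and its singular-vector computation may differ from this sketch.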
Related papers
- Accelerating Linear Recurrent Neural Networks for the Edge with Unstructured Sparsity [39.483346492111515]
Linear recurrent neural networks enable powerful long-range sequence modeling with constant memory usage and time-per-token during inference.
Unstructured sparsity offers a compelling solution, enabling substantial reductions in compute and memory requirements when accelerated by compatible hardware platforms.
We find that highly sparse linear RNNs consistently achieve better efficiency-performance trade-offs than dense baselines.
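To make the trade-off concrete, here is a minimal sketch (assumed, not taken from the paper) of token-by-token inference for a single linear RNN layer with an unstructured-sparse recurrence matrix; the single hidden state carried across tokens is where the constant-memory, constant-time-per-token property comes from.

```python
# Illustrative linear RNN inference step with an unstructured-sparse recurrence matrix.
import torch

torch.manual_seed(0)
d_hidden, d_in, d_out, sparsity = 256, 64, 32, 0.9     # illustrative sizes

A_dense = 0.05 * torch.randn(d_hidden, d_hidden)
A_dense[torch.rand_like(A_dense) < sparsity] = 0.0     # prune ~90% of recurrence weights
A = A_dense.to_sparse()                                # only non-zeros are stored and multiplied
B = 0.05 * torch.randn(d_hidden, d_in)
C = 0.05 * torch.randn(d_out, d_hidden)

h = torch.zeros(d_hidden, 1)                           # constant memory: one hidden state, reused per token
for x_t in torch.randn(10, d_in):                      # a stream of 10 input tokens
    h = torch.sparse.mm(A, h) + (B @ x_t).unsqueeze(1) # h_t = A h_{t-1} + B x_t (sparse matvec)
    y_t = C @ h                                        # y_t = C h_t
print(y_t.shape)                                       # torch.Size([32, 1])
```

On hardware that accelerates unstructured sparse matrix-vector products, the recurrent step is the part whose compute and memory traffic shrink roughly in proportion to the sparsity.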
arXiv Detail & Related papers (2025-02-03T13:09:21Z)
- Efficient Event-based Semantic Segmentation with Spike-driven Lightweight Transformer-based Networks [7.234661153788162]
Event-based semantic segmentation has great potential in autonomous driving and robotics.
Current artificial neural network (ANN)-based segmentation methods suffer from high computational demands, the requirement for image frames, and massive energy consumption.
We introduce SLTNet, a spike-driven lightweight transformer-based network designed for event-based semantic segmentation.
arXiv Detail & Related papers (2024-12-17T12:11:04Z)
- Optimal Gradient Checkpointing for Sparse and Recurrent Architectures using Off-Chip Memory [0.8321953606016751]
We introduce memory-efficient gradient checkpointing strategies tailored for the general class of sparse RNNs and Spiking Neural Networks.
We find that Double Checkpointing emerges as the most effective method, optimizing the use of local memory resources while minimizing recomputation overhead.
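For orientation, the sketch below shows plain activation checkpointing with torch.utils.checkpoint; it is not the paper's Double Checkpointing scheme, which additionally stages checkpoints through off-chip memory. Each segment's activations are discarded in the forward pass and recomputed during backward, trading recomputation for memory.

```python
# Plain activation checkpointing (illustrative baseline, not Double Checkpointing).
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class CheckpointedStack(nn.Module):
    def __init__(self, width: int = 512, depth: int = 8):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Linear(width, width), nn.ReLU()) for _ in range(depth)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for block in self.blocks:
            # Activations inside `block` are not stored; they are recomputed on backward.
            x = checkpoint(block, x, use_reentrant=False)
        return x

model = CheckpointedStack()
loss = model(torch.randn(4, 512)).sum()
loss.backward()    # each block runs forward again here to rebuild its activations
```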
arXiv Detail & Related papers (2024-12-16T14:23:31Z)
- Latency-aware Unified Dynamic Networks for Efficient Image Recognition [72.8951331472913]
LAUDNet is a framework to bridge the theoretical and practical efficiency gap in dynamic networks.
It integrates three primary dynamic paradigms: spatially adaptive computation, dynamic layer skipping, and dynamic channel skipping.
It can notably reduce the latency of models like ResNet by over 50% on platforms such as V100, 3090, and TX2 GPUs.
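As a concrete illustration of one of these paradigms (an assumption for exposition, not LAUDNet's code), the sketch below adds a tiny per-sample gate to a residual block for dynamic layer skipping: at inference, inputs the gate deems easy bypass the block entirely.

```python
# Illustrative dynamic layer skipping: a gate decides per sample whether to run the block.
import torch
import torch.nn as nn

class SkippableBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.BatchNorm2d(channels), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.BatchNorm2d(channels),
        )
        self.gate = nn.Sequential(                     # cheap per-sample scalar gate
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(channels, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        keep = (self.gate(x) > 0).view(-1, 1, 1, 1)    # hard keep/skip decision at inference
        if not keep.any():                             # whole batch skips: the block costs only the gate
            return x
        return torch.where(keep, x + self.body(x), x)

y = SkippableBlock(64)(torch.randn(2, 64, 56, 56))
```

During training such gates are usually made differentiable (e.g., with a Gumbel-Softmax relaxation) so the skipping policy can be learned; the latency-aware part of LAUDNet lies in how these decisions are modeled and scheduled on real hardware, which this sketch does not cover.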
arXiv Detail & Related papers (2023-08-30T10:57:41Z)
- RAMAN: A Re-configurable and Sparse tinyML Accelerator for Inference on Edge [1.8293684411977293]
Deep Neural Network (DNN) based inference at the edge is challenging as these compute and data-intensive algorithms need to be implemented at low cost and low power.
We present RAMAN, a Re-configurable and spArse tinyML Accelerator for infereNce on edge, architected to exploit sparsity to reduce area (storage), power, and latency.
arXiv Detail & Related papers (2023-06-10T17:25:58Z)
- Lightweight Real-time Semantic Segmentation Network with Efficient Transformer and CNN [34.020978009518245]
We propose a lightweight real-time semantic segmentation network called LETNet.
LETNet combines a U-shaped CNN with a Transformer effectively in a capsule embedding style to compensate for their respective deficiencies.
Experiments on challenging datasets demonstrate that LETNet achieves a superior balance of accuracy and efficiency.
arXiv Detail & Related papers (2023-02-21T07:16:53Z)
- Lightweight and Progressively-Scalable Networks for Semantic Segmentation [100.63114424262234]
Multi-scale learning frameworks have been regarded as a capable class of models to boost semantic segmentation.
In this paper, we thoroughly analyze the design of convolutional blocks and the ways of interactions across multiple scales.
We devise Lightweight and Progressively-Scalable Networks (LPS-Net), which expand network complexity in a greedy manner.
arXiv Detail & Related papers (2022-07-27T16:00:28Z)
- An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL) approach, Soft Actor-Critic for discrete settings (SAC-d), which generates the exit point and compressing bits by soft policy iterations.
Based on a latency- and accuracy-aware reward design, such a scheme can adapt well to complex environments such as dynamic wireless channels and arbitrary processing, and is capable of supporting 5G URLLC.
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
- Multi-Exit Semantic Segmentation Networks [78.44441236864057]
We propose a framework for converting state-of-the-art segmentation models to MESS networks:
specially trained CNNs that employ parametrised early exits along their depth to save computation during inference on easier samples.
We co-optimise the number, placement and architecture of the attached segmentation heads, along with the exit policy, to adapt to the device capabilities and application-specific requirements.
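A minimal sketch of the inference-time behaviour, under assumed names and a simple confidence-threshold exit policy (the actual MESS exit policies and heads are more elaborate):

```python
# Illustrative early-exit segmentation: leave at the first exit whose confidence is high enough.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiExitSegNet(nn.Module):
    def __init__(self, channels: int = 64, num_classes: int = 19, num_stages: int = 3):
        super().__init__()
        self.stages = nn.ModuleList(
            nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())
            for _ in range(num_stages)
        )
        self.exits = nn.ModuleList(                       # one lightweight head per stage
            nn.Conv2d(channels, num_classes, 1) for _ in range(num_stages)
        )

    @torch.no_grad()
    def forward(self, x: torch.Tensor, threshold: float = 0.9) -> torch.Tensor:
        for stage, head in zip(self.stages, self.exits):
            x = stage(x)
            logits = head(x)
            confidence = F.softmax(logits, dim=1).max(dim=1).values.mean()
            if confidence > threshold:                    # easy sample: stop here, skip deeper stages
                return logits
        return logits                                     # hard samples run the full depth

logits = MultiExitSegNet()(torch.randn(1, 64, 128, 256))
```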
arXiv Detail & Related papers (2021-06-07T11:37:03Z)
- Real-time Semantic Segmentation via Spatial-detail Guided Context Propagation [49.70144583431999]
We propose the spatial-detail guided context propagation network (SGCPNet) for achieving real-time semantic segmentation.
It uses the spatial details of shallow layers to guide the propagation of the low-resolution global contexts, in which the lost spatial information can be effectively reconstructed.
It achieves 69.5% mIoU segmentation accuracy, while its speed reaches 178.5 FPS on 768x1536 images on a GeForce GTX 1080 Ti GPU card.
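The guidance idea can be sketched roughly as follows (an assumed simplification, not SGCPNet's actual module): shallow, high-resolution features modulate the upsampled low-resolution context so that lost spatial detail can be partly reconstructed.

```python
# Illustrative spatial-detail guided upsampling of low-resolution context features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DetailGuidedUpsample(nn.Module):
    def __init__(self, detail_ch: int, context_ch: int):
        super().__init__()
        # Shallow features predict per-pixel guidance weights for the upsampled context.
        self.guidance = nn.Sequential(nn.Conv2d(detail_ch, context_ch, 3, padding=1), nn.Sigmoid())
        self.refine = nn.Conv2d(context_ch, context_ch, 3, padding=1)

    def forward(self, detail: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        # detail: (B, Cd, H, W) from a shallow layer; context: (B, Cc, H/8, W/8) from the deep path.
        context_up = F.interpolate(context, size=detail.shape[-2:], mode="bilinear", align_corners=False)
        return self.refine(self.guidance(detail) * context_up)

fused = DetailGuidedUpsample(32, 128)(torch.randn(1, 32, 96, 192), torch.randn(1, 128, 12, 24))
```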
arXiv Detail & Related papers (2020-05-22T07:07:26Z)
- Large-Scale Gradient-Free Deep Learning with Recursive Local Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources.
Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize.
We propose a neuro-biologically-plausible alternative to backprop that can be used to train deep networks.
arXiv Detail & Related papers (2020-02-10T16:20:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.