Multi-Exit Semantic Segmentation Networks
- URL: http://arxiv.org/abs/2106.03527v1
- Date: Mon, 7 Jun 2021 11:37:03 GMT
- Title: Multi-Exit Semantic Segmentation Networks
- Authors: Alexandros Kouris, Stylianos I. Venieris, Stefanos Laskaridis,
Nicholas D. Lane
- Abstract summary: We propose a framework for converting state-of-the-art segmentation models to MESS networks:
specially trained CNNs that employ parametrised early exits along their depth to save computation during inference on easier samples.
We co-optimise the number, placement and architecture of the attached segmentation heads, along with the exit policy, to adapt to the device capabilities and application-specific requirements.
- Score: 78.44441236864057
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic segmentation arises as the backbone of many vision systems, spanning
from self-driving cars and robot navigation to augmented reality and
teleconferencing. Frequently operating under stringent latency constraints
within a limited resource envelope, optimising for efficient execution becomes
important. To this end, we propose a framework for converting state-of-the-art
segmentation models to MESS networks; specially trained CNNs that employ
parametrised early exits along their depth to save computation during inference
on easier samples. Designing and training such networks naively can hurt
performance. Thus, we propose a two-staged training process that pushes
semantically important features early in the network. We co-optimise the
number, placement and architecture of the attached segmentation heads, along
with the exit policy, to adapt to the device capabilities and
application-specific requirements. Optimising for speed, MESS networks can
achieve latency gains of up to 2.83x over state-of-the-art methods with no
accuracy degradation. Accordingly, optimising for accuracy, we achieve an
improvement of up to 5.33 pp, under the same computational budget.
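
The early-exit idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the names `confident_fraction` and `multi_exit_inference`, the two thresholds, the choice to attach a head to every stage, and the particular exit rule (stop once a large enough fraction of pixels is predicted with high confidence). The paper co-optimises the number, placement and architecture of the heads together with the exit policy, which this toy version does not attempt.

```python
from typing import Any, Callable, List, Sequence, Tuple

# One entry per pixel; each entry holds that pixel's class probabilities.
ProbMap = List[List[float]]


def confident_fraction(prob_map: ProbMap, pixel_threshold: float) -> float:
    """Fraction of pixels whose top-class probability reaches pixel_threshold."""
    if not prob_map:
        return 0.0
    confident = sum(1 for probs in prob_map if max(probs) >= pixel_threshold)
    return confident / len(prob_map)


def multi_exit_inference(
    x: Any,
    stages: Sequence[Callable[[Any], Any]],
    heads: Sequence[Callable[[Any], ProbMap]],
    pixel_threshold: float = 0.9,  # hypothetical per-pixel confidence threshold
    min_fraction: float = 0.8,     # hypothetical share of confident pixels needed to exit
) -> Tuple[List[int], int]:
    """Run backbone stages in order and stop at the first sufficiently confident exit.

    Returns the per-pixel class labels and the index of the exit that was used.
    The last head always produces an answer, so it acts as the final exit.
    """
    if not stages or len(stages) != len(heads):
        raise ValueError("need one segmentation head per backbone stage")
    features = x
    for i, (stage, head) in enumerate(zip(stages, heads)):
        features = stage(features)
        prob_map = head(features)
        is_last = i == len(stages) - 1
        if is_last or confident_fraction(prob_map, pixel_threshold) >= min_fraction:
            labels = [max(range(len(p)), key=p.__getitem__) for p in prob_map]
            return labels, i
    raise AssertionError("unreachable: the last exit always returns")
```

With this rule, an easy input leaves at a shallow exit and skips the remaining stages, while a hard input runs through to the final head. Raising `min_fraction` or `pixel_threshold` trades latency for accuracy.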
Related papers
- Latency-aware Unified Dynamic Networks for Efficient Image Recognition [72.8951331472913]
LAUDNet is a framework to bridge the theoretical and practical efficiency gap in dynamic networks.
It integrates three primary dynamic paradigms: spatially adaptive computation, dynamic layer skipping, and dynamic channel skipping.
It can notably reduce the latency of models like ResNet by over 50% on platforms such as V100, 3090, and TX2 GPUs.
arXiv Detail & Related papers (2023-08-30T10:57:41Z)
- ATHEENA: A Toolflow for Hardware Early-Exit Network Automation [11.623574576259859]
ATHEENA is a toolflow for Hardware Early-Exit Network Automation.
It leverages the probability of samples exiting early from early-exit networks to scale the resources allocated to different sections of the network.
arXiv Detail & Related papers (2023-04-17T16:06:58Z)
- Fluid Batching: Exit-Aware Preemptive Serving of Early-Exit Neural Networks on Edge NPUs [74.83613252825754]
"Smart ecosystems" are being formed where sensing happens concurrently rather than standalone.
This is shifting the on-device inference paradigm towards deploying neural processing units (NPUs) at the edge.
We propose a novel early-exit scheduling that allows preemption at run time to account for the dynamicity introduced by the arrival and exiting processes.
arXiv Detail & Related papers (2022-09-27T15:04:01Z)
- An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical computation restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL)-Soft Actor Critic for discrete (SAC-d), which generates the exit point, exit point, and compressing bits by soft policy iterations.
Based on the latency and accuracy aware reward design, such a computation can adapt well to complex environments such as dynamic wireless channels and arbitrary processing, and is capable of supporting the 5G URL
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
- Architecture Aware Latency Constrained Sparse Neural Networks [35.50683537052815]
In this paper, we design an architecture aware latency constrained sparse framework to prune and accelerate CNN models.
We also propose a novel sparse convolution algorithm for efficient computation.
Our system-algorithm co-design framework can achieve a much better frontier between network accuracy and latency on resource-constrained mobile devices.
arXiv Detail & Related papers (2021-09-01T03:41:31Z)
- DANCE: DAta-Network Co-optimization for Efficient Segmentation Model Training and Inference [85.02494022662505]
DANCE is an automated simultaneous data-network co-optimization for efficient segmentation model training and inference.
It integrates automated data slimming which adaptively downsamples/drops input images and controls their corresponding contribution to the training loss guided by the images' spatial complexity.
Experiments and ablation studies demonstrate that DANCE can achieve "all-win" towards efficient segmentation.
arXiv Detail & Related papers (2021-07-16T04:58:58Z)
- Multi-Task Network Pruning and Embedded Optimization for Real-time Deployment in ADAS [0.0]
Camera-based Deep Learning algorithms are increasingly needed for perception in Automated Driving systems.
Constraints from the automotive industry challenge the deployment of CNNs by imposing embedded systems with limited computational resources.
We propose an approach to embed a multi-task CNN network under such conditions on a commercial prototype platform.
arXiv Detail & Related papers (2021-01-19T19:29:38Z)
- Deep Learning-based Resource Allocation For Device-to-Device Communication [66.74874646973593]
We propose a framework for the optimization of the resource allocation in multi-channel cellular systems with device-to-device (D2D) communication.
A deep learning (DL) framework is proposed, where the optimal resource allocation strategy for arbitrary channel conditions is approximated by deep neural network (DNN) models.
Our simulation results confirm that near-optimal performance can be attained with low time, which underlines the real-time capability of the proposed scheme.
arXiv Detail & Related papers (2020-11-25T14:19:23Z)
- AOWS: Adaptive and optimal network width search with latency constraints [30.39613826468697]
We introduce a novel efficient one-shot NAS approach to optimally search for channel numbers.
Experiments on ImageNet classification show that our approach can find networks fitting the resource constraints on different target platforms.
arXiv Detail & Related papers (2020-05-21T06:46:16Z)
- Real-Time Segmentation Networks should be Latency Aware [0.0]
We argue that the commonly used performance metric of mean Intersection over Union (mIoU) does not fully capture the information required to estimate the true performance of these networks when they operate in 'real-time'.
We propose a change of objective in the segmentation task, and its associated metric that encapsulates this missing information in the following way: We propose to predict the future output segmentation map that will match the future input frame at the time when the network finishes the processing.
arXiv Detail & Related papers (2020-04-06T11:41:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.