Adaptive Inference through Early-Exit Networks: Design, Challenges and
Directions
- URL: http://arxiv.org/abs/2106.05022v1
- Date: Wed, 9 Jun 2021 12:33:02 GMT
- Title: Adaptive Inference through Early-Exit Networks: Design, Challenges and
Directions
- Authors: Stefanos Laskaridis, Alexandros Kouris, Nicholas D. Lane
- Abstract summary: We decompose the design methodology of early-exit networks to its key components and survey the recent advances in each one of them.
We position early-exiting against other efficient inference solutions and provide our insights on the current challenges and most promising future directions for research in the field.
- Score: 80.78077900288868
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: DNNs are becoming less and less over-parametrised due to recent advances in
efficient model design, through careful hand-crafted or NAS-based methods.
Relying on the fact that not all inputs require the same amount of computation
to yield a confident prediction, adaptive inference is gaining attention as a
prominent approach for pushing the limits of efficient deployment.
Particularly, early-exit networks comprise an emerging direction for tailoring
the computation depth of each input sample at runtime, offering complementary
performance gains to other efficiency optimisations. In this paper, we
decompose the design methodology of early-exit networks to its key components
and survey the recent advances in each one of them. We also position
early-exiting against other efficient inference solutions and provide our
insights on the current challenges and most promising future directions for
research in the field.
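As a rough illustration of what "tailoring the computation depth of each input sample at runtime" can look like, the following is a minimal sketch of confidence-thresholded early exiting. It is not the authors' method: the names `early_exit_predict`, `toy_stages`, `double` and `identity`, the softmax-confidence exit rule, and the 0.9 threshold are all assumptions chosen for the example.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
# Hypothetical stage: a (backbone block, exit head) pair of plain functions.
Stage = Tuple[Callable[[Vector], Vector], Callable[[Vector], Vector]]


def softmax(logits: Sequence[float]) -> Vector:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(
    x: Vector, stages: Sequence[Stage], threshold: float = 0.9
) -> Tuple[int, float, int]:
    """Run the stages in order and stop at the first confident exit.

    Returns (predicted label, confidence, number of stages executed).
    The last stage always returns, whatever its confidence.
    """
    features = x
    for depth, (block, head) in enumerate(stages, start=1):
        features = block(features)        # one more backbone block
        probs = softmax(head(features))   # classifier attached at this depth
        confidence = max(probs)
        if confidence >= threshold or depth == len(stages):
            return probs.index(confidence), confidence, depth
    raise ValueError("stages must be non-empty")


def double(features: Vector) -> Vector:
    """Toy backbone block (assumption): sharpens the features by a factor of 2."""
    return [2.0 * v for v in features]


def identity(features: Vector) -> Vector:
    """Toy exit head (assumption): uses the features directly as logits."""
    return list(features)


toy_stages: List[Stage] = [(double, identity)] * 3

easy = early_exit_predict([3.0, 0.0], toy_stages)  # confident at the first exit
hard = early_exit_predict([0.5, 0.0], toy_stages)  # needs the full depth
print("easy input exits at depth", easy[2])
print("hard input exits at depth", hard[2])
```

In this toy setup the "easy" input clears the threshold at the first exit, while the "hard" input runs through all three stages, which is the per-sample saving that early exiting aims for.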
Related papers
- Diffusion Models as Network Optimizers: Explorations and Analysis [71.69869025878856]
Generative diffusion models (GDMs) have emerged as a promising new approach to network optimization.
In this study, we first explore the intrinsic characteristics of generative models.
We provide a concise theoretical and intuitive demonstration of the advantages of generative models over discriminative network optimization.
arXiv Detail & Related papers (2024-11-01T09:05:47Z)
- Adaptive Anomaly Detection in Network Flows with Low-Rank Tensor Decompositions and Deep Unrolling [9.20186865054847]
Anomaly detection (AD) is increasingly recognized as a key component for ensuring the resilience of future communication systems.
This work considers AD in network flows using incomplete measurements.
We propose a novel block-successive convex approximation algorithm based on a regularized model-fitting objective.
Inspired by Bayesian approaches, we extend the model architecture to perform online adaptation to per-flow and per-time-step statistics.
arXiv Detail & Related papers (2024-09-17T19:59:57Z)
- Learning-Augmented Algorithms with Explicit Predictors [67.02156211760415]
Recent advances in algorithmic design show how to utilize predictions obtained by machine learning models from past and present data.
Prior research in this context was focused on a paradigm where the predictor is pre-trained on past data and then used as a black box.
In this work, we unpack the predictor and integrate the learning problem it gives rise to within the algorithmic challenge.
arXiv Detail & Related papers (2024-03-12T08:40:21Z)
- An Efficient Learning-based Solver Comparable to Metaheuristics for the Capacitated Arc Routing Problem [67.92544792239086]
We introduce an NN-based solver to significantly narrow the gap with advanced metaheuristics.
First, we propose a direction-aware facilitating attention model (DaAM) to incorporate directionality into the embedding process.
Second, we design a supervised reinforcement learning scheme that involves supervised pre-training to establish a robust initial policy.
arXiv Detail & Related papers (2024-03-11T02:17:42Z)
- Low Complexity Adaptive Machine Learning Approaches for End-to-End Latency Prediction [0.0]
This work presents the design of efficient, low-cost adaptive algorithms for estimation, monitoring and prediction.
We focus on end-to-end latency prediction, for which we illustrate our approaches and results on data obtained from a public generator provided after the recent international challenge on GNN.
arXiv Detail & Related papers (2023-01-31T10:29:11Z)
- Resource-Constrained Edge AI with Early Exit Prediction [5.060405696893342]
We propose an early exit prediction mechanism to reduce the on-device computation overhead in a device-edge co-inference system.
Specifically, we design a low-complexity module, namely the Exit Predictor, to guide some distinctly "hard" samples to bypass the computation of the early exits.
Considering the varying communication bandwidth, we extend the early exit prediction mechanism for latency-aware edge inference.
arXiv Detail & Related papers (2022-06-15T03:14:21Z)
- RoMA: Robust Model Adaptation for Offline Model-based Optimization [115.02677045518692]
We consider the problem of searching for an input that maximizes a black-box objective function, given a static dataset of input-output queries.
A popular approach to solving this problem is maintaining a proxy model that approximates the true objective function.
Here, the main challenge is how to avoid adversarially optimized inputs during the search.
arXiv Detail & Related papers (2021-10-27T05:37:12Z)
- Improving Online Performance Prediction for Semantic Segmentation [29.726236358091295]
We address the task of observing the performance of a semantic segmentation deep neural network (DNN) during online operation.
Many high-level decisions rely on such DNNs, which are usually evaluated offline, while their performance in online operation remains unknown.
We propose an improved online performance prediction scheme, building on a recently proposed concept of predicting the primary semantic segmentation task's performance.
arXiv Detail & Related papers (2021-04-12T07:44:40Z)
- An AI-Assisted Design Method for Topology Optimization Without Pre-Optimized Training Data [68.8204255655161]
An AI-assisted design method based on topology optimization is presented, which is able to obtain optimized designs in a direct way.
Designs are provided by an artificial neural network, the predictor, on the basis of boundary conditions and degree of filling as input data.
arXiv Detail & Related papers (2020-12-11T14:33:27Z)
- HAPI: Hardware-Aware Progressive Inference [18.214367595727037]
Convolutional neural networks (CNNs) have recently become the state-of-the-art in a diversity of AI tasks.
Despite their popularity, CNN inference still comes at a high computational cost.
This work presents HAPI, a novel methodology for generating high-performance early-exit networks.
arXiv Detail & Related papers (2020-08-10T09:55:18Z)