Related papers: Adaptive Inference through Early-Exit Networks: Design, Challenges and Directions

Adaptive Inference through Early-Exit Networks: Design, Challenges and Directions

URL: http://arxiv.org/abs/2106.05022v1
Date: Wed, 9 Jun 2021 12:33:02 GMT
Title: Adaptive Inference through Early-Exit Networks: Design, Challenges and Directions
Authors: Stefanos Laskaridis, Alexandros Kouris, Nicholas D. Lane
Abstract summary: We decompose the design methodology of early-exit networks to its key components and survey the recent advances in each one of them. We position early-exiting against other efficient inference solutions and provide our insights on the current challenges and most promising future directions for research in the field.
Score: 80.78077900288868
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: DNNs are becoming less and less over-parametrised due to recent advances in efficient model design, through careful hand-crafted or NAS-based methods. Relying on the fact that not all inputs require the same amount of computation to yield a confident prediction, adaptive inference is gaining attention as a prominent approach for pushing the limits of efficient deployment. Particularly, early-exit networks comprise an emerging direction for tailoring the computation depth of each input sample at runtime, offering complementary performance gains to other efficiency optimisations. In this paper, we decompose the design methodology of early-exit networks to its key components and survey the recent advances in each one of them. We also position early-exiting against other efficient inference solutions and provide our insights on the current challenges and most promising future directions for research in the field.

Related papers

Optimizers Qualitatively Alter Solutions And We Should Leverage This [62.662640460717476]
Deep Neural Networks (DNNs) can not guarantee convergence to a unique global minimum of the loss when using only local information, such as SGD.<n>We argue that the community should aim at understanding the biases of already existing methods, as well as aim to build new DNNs with the explicit intent of inducing certain properties of the solution.
arXiv Detail & Related papers (2025-07-16T13:33:31Z)
Offline Model-Based Optimization: Comprehensive Review [61.91350077539443]
offline optimization is a fundamental challenge in science and engineering, where the goal is to optimize black-box functions using only offline datasets. Recent advances in model-based optimization have harnessed the generalization capabilities of deep neural networks to develop offline-specific surrogate and generative models. Despite its growing impact in accelerating scientific discovery, the field lacks a comprehensive review.
arXiv Detail & Related papers (2025-03-21T16:35:02Z)
Diffusion Models as Network Optimizers: Explorations and Analysis [71.69869025878856]
generative diffusion models (GDMs) have emerged as a promising new approach to network optimization. In this study, we first explore the intrinsic characteristics of generative models. We provide a concise theoretical and intuitive demonstration of the advantages of generative models over discriminative network optimization.
arXiv Detail & Related papers (2024-11-01T09:05:47Z)
Adaptive Anomaly Detection in Network Flows with Low-Rank Tensor Decompositions and Deep Unrolling [9.20186865054847]
Anomaly detection (AD) is increasingly recognized as a key component for ensuring the resilience of future communication systems. This work considers AD in network flows using incomplete measurements. We propose a novel block-successive convex approximation algorithm based on a regularized model-fitting objective. Inspired by Bayesian approaches, we extend the model architecture to perform online adaptation to per-flow and per-time-step statistics.
arXiv Detail & Related papers (2024-09-17T19:59:57Z)
Learning-Augmented Algorithms with Explicit Predictors [67.02156211760415]
Recent advances in algorithmic design show how to utilize predictions obtained by machine learning models from past and present data. Prior research in this context was focused on a paradigm where the predictor is pre-trained on past data and then used as a black box. In this work, we unpack the predictor and integrate the learning problem it gives rise for within the algorithmic challenge.
arXiv Detail & Related papers (2024-03-12T08:40:21Z)
An Efficient Learning-based Solver Comparable to Metaheuristics for the Capacitated Arc Routing Problem [67.92544792239086]
We introduce an NN-based solver to significantly narrow the gap with advanced metaheuristics. First, we propose direction-aware facilitating attention model (DaAM) to incorporate directionality into the embedding process. Second, we design a supervised reinforcement learning scheme that involves supervised pre-training to establish a robust initial policy.
arXiv Detail & Related papers (2024-03-11T02:17:42Z)
Low Complexity Adaptive Machine Learning Approaches for End-to-End Latency Prediction [0.0]
This work is the design of efficient, low-cost adaptive algorithms for estimation, monitoring and prediction. We focus on end-to-end latency prediction, for which we illustrate our approaches and results on data obtained from a public generator provided after the recent international challenge on GNN.
arXiv Detail & Related papers (2023-01-31T10:29:11Z)
Resource-Constrained Edge AI with Early Exit Prediction [5.060405696893342]
We propose an early exit prediction mechanism to reduce the on-device computation overhead in a device-edge co-inference system. Specifically, we design a low-complexity module, namely the Exit Predictor, to guide some distinctly "hard" samples to bypass the computation of the early exits. Considering the varying communication bandwidth, we extend the early exit prediction mechanism for latency-aware edge inference.
arXiv Detail & Related papers (2022-06-15T03:14:21Z)
RoMA: Robust Model Adaptation for Offline Model-based Optimization [115.02677045518692]
We consider the problem of searching an input maximizing a black-box objective function given a static dataset of input-output queries. A popular approach to solving this problem is maintaining a proxy model that approximates the true objective function. Here, the main challenge is how to avoid adversarially optimized inputs during the search.
arXiv Detail & Related papers (2021-10-27T05:37:12Z)
Improving Online Performance Prediction for Semantic Segmentation [29.726236358091295]
We address the task of observing the performance of a semantic segmentation deep neural network (DNN) during online operation. Many high-level decisions rely on such DNNs, which are usually evaluated offline, while their performance in online operation remains unknown. We propose an improved online performance prediction scheme, building on a recently proposed concept of predicting the primary semantic segmentation task's performance.
arXiv Detail & Related papers (2021-04-12T07:44:40Z)
An AI-Assisted Design Method for Topology Optimization Without Pre-Optimized Training Data [68.8204255655161]
An AI-assisted design method based on topology optimization is presented, which is able to obtain optimized designs in a direct way. Designs are provided by an artificial neural network, the predictor, on the basis of boundary conditions and degree of filling as input data.
arXiv Detail & Related papers (2020-12-11T14:33:27Z)
HAPI: Hardware-Aware Progressive Inference [18.214367595727037]
Convolutional neural networks (CNNs) have recently become the state-of-the-art in a diversity of AI tasks. Despite their popularity, CNN inference still comes at a high computational cost. This work presents HAPI, a novel methodology for generating high-performance early-exit networks.
arXiv Detail & Related papers (2020-08-10T09:55:18Z)

This list is automatically generated from the titles and abstracts of the papers in this site.