Predicting Probabilities of Error to Combine Quantization and Early Exiting: QuEE
- URL: http://arxiv.org/abs/2406.14404v1
- Date: Thu, 20 Jun 2024 15:25:13 GMT
- Title: Predicting Probabilities of Error to Combine Quantization and Early Exiting: QuEE
- Authors: Florence Regol, Joud Chataoui, Bertrand Charpentier, Mark Coates, Pablo Piantanida, Stephan Günnemann
- Abstract summary: We propose a more general dynamic network, QuEE, that combines both quantization and early exiting.
Our algorithm can be seen as a form of soft early exiting or input-dependent compression.
The crucial factor of our approach is accurate prediction of the potential accuracy improvement achievable through further computation.
- Score: 68.6018458996143
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning models can solve complex tasks but often require significant computational resources during inference. This has led to the development of various post-training computation reduction methods that tackle the issue in different ways, such as quantization, which reduces the precision of weights and arithmetic operations, and dynamic networks, which adapt computation to the sample at hand. In this work, we propose a more general dynamic network, QuEE, that combines both quantization and early exiting. Our algorithm can be seen as a form of soft early exiting or input-dependent compression. Rather than a binary decision between exiting and continuing, we introduce the possibility of continuing with reduced computation. This complicates the traditionally considered early exiting problem, which we solve through a principled formulation. The crucial factor of our approach is accurate prediction of the potential accuracy improvement achievable through further computation. We demonstrate the effectiveness of our method through empirical evaluation on four classification datasets, as well as by exploring the conditions for its success.
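The abstract's key mechanism, predicting per sample how much accuracy further (possibly quantized) computation would buy and weighing that against compute cost, can be sketched as a simple decision rule. The sketch below is illustrative only: the option set, the toy error predictor, and the trade-off weight `lam` are assumptions, not the paper's implementation.

```python
from dataclasses import dataclass

@dataclass
class Option:
    name: str        # e.g. "exit", "continue@4bit", "continue@8bit"
    cost: float      # additional compute cost of taking this option
    precision: int   # bit-width if continuing with quantized layers; 0 = exit now

def predicted_error(confidence: float, option: Option) -> float:
    """Toy stand-in for the learned error predictor; QuEE's crucial
    ingredient is an accurate version of this function."""
    base = 1.0 - confidence                       # error estimate if we stop here
    if option.precision == 0:
        return base
    # Assumption: continuing shrinks the error, more so at higher precision.
    return base * (1.0 - min(option.precision / 16.0, 0.9))

def choose(confidence: float, options: list[Option], lam: float = 0.05) -> Option:
    """Soft early exiting: rather than a binary exit/continue decision,
    pick the option minimizing predicted error + lam * compute cost."""
    return min(options, key=lambda o: predicted_error(confidence, o) + lam * o.cost)

options = [Option("exit", 0.0, 0),
           Option("continue@4bit", 1.0, 4),
           Option("continue@8bit", 2.0, 8)]
for conf in (0.55, 0.95):                         # uncertain vs. confident sample
    print(conf, "->", choose(conf, options).name)
```

On the uncertain sample the rule pays for further high-precision computation; on the confident one it exits immediately, which is the input-dependent behavior the abstract describes.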
Related papers
- A resource-efficient model for deep kernel learning [0.0]
There are various approaches for accelerating learning computations with minimal loss of accuracy.
We describe a model-level decomposition approach that combines both the decomposition of the operators and the decomposition of the network.
We perform a feasibility analysis on the resulting algorithm, both in terms of its accuracy and scalability.
arXiv Detail & Related papers (2024-10-13T17:11:42Z)
- Successive Refinement in Large-Scale Computation: Advancing Model Inference Applications [67.76749044675721]
We introduce solutions for layered-resolution computation.
These solutions allow lower-resolution results to be obtained at an earlier stage than the final result.
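As a hedged illustration of the layered-resolution idea (an assumed interface, not this paper's code), the sketch below yields a usable lower-resolution prediction after every stage of a computation instead of only at the end:

```python
import numpy as np

def layered_inference(x, stages):
    """Yield progressively refined logits; a caller may stop at any
    stage and use the lower-resolution result obtained so far."""
    logits = np.zeros(stages[0].shape[0])
    for W in stages:
        logits += W @ x          # each stage refines the running estimate
        yield logits.copy()      # early, lower-resolution result

rng = np.random.default_rng(0)
x = rng.normal(size=8)
stages = [0.1 * rng.normal(size=(3, 8)) for _ in range(4)]
for k, est in enumerate(layered_inference(x, stages), 1):
    print(f"stage {k}: class = {est.argmax()}")
```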
arXiv Detail & Related papers (2024-02-11T15:36:33Z)
- On efficient computation in active inference [1.1470070927586016]
We present a novel planning algorithm for finite temporal horizons with drastically lower computational complexity.
We also simplify the process of setting an appropriate target distribution for new and existing active inference planning schemes.
arXiv Detail & Related papers (2023-07-02T07:38:56Z)
- Predictive Coding beyond Correlations [59.47245250412873]
We show how one such algorithm, predictive coding, is able to perform causal inference tasks.
First, we show how a simple change in the inference process of predictive coding makes it possible to compute interventions without the need to mutilate or redefine a causal graph.
arXiv Detail & Related papers (2023-06-27T13:57:16Z)
- Computing large deviation prefactors of stochastic dynamical systems based on machine learning [4.474127100870242]
We present large deviation theory, which characterizes the exponential estimate for rare events of dynamical systems in the weak-noise limit.
We design a neural network framework to compute the quasipotential, most probable paths, and prefactors based on a decomposition of the vector field.
Numerical experiments demonstrate its power in exploring the internal mechanisms of rare events triggered by weak random fluctuations.
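For context, the three quantities named above fit the standard weak-noise expansion of large deviation theory. As a hedged illustration (the textbook Freidlin-Wentzell/WKB form, not necessarily this paper's exact formulation), the probability of a rare event behaves, as the noise strength tends to zero, like

```latex
% Weak-noise expansion of a rare-event probability:
P_\varepsilon \;\sim\; C \, \exp\!\left(-\frac{V(x)}{\varepsilon}\right),
\qquad \varepsilon \to 0,
% where V(x) is the quasipotential (the exponential rate), attained
% along the most probable path, and C is the prefactor that the
% neural network framework is trained to compute.
```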
arXiv Detail & Related papers (2023-06-20T09:59:45Z)
- Semantic Strengthening of Neuro-Symbolic Learning [85.6195120593625]
Neuro-symbolic approaches typically resort to fuzzy approximations of a probabilistic objective.
We show how to compute this objective efficiently for tractable circuits.
We test our approach on three tasks: predicting a minimum-cost path in Warcraft, predicting a minimum-cost perfect matching, and solving Sudoku puzzles.
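To make "fuzzy approximation of a probabilistic objective" concrete, here is a toy contrast (an assumed example, unrelated to the paper's tractable circuits): the exact probability that a disjunctive constraint holds under independent predictions versus two common fuzzy surrogates.

```python
# Constraint "A or B" with independent predicted probabilities p_a, p_b.
p_a, p_b = 0.7, 0.4
exact = p_a + p_b - p_a * p_b        # inclusion-exclusion: P(A or B) = 0.82
godel = max(p_a, p_b)                # Goedel t-conorm surrogate   = 0.70
lukasiewicz = min(1.0, p_a + p_b)    # Lukasiewicz surrogate       = 1.0 (saturates)
print(exact, godel, lukasiewicz)
```

The surrogates disagree with the exact probability; closing that gap by computing the objective exactly on tractable circuits is what the summary refers to.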
arXiv Detail & Related papers (2023-02-28T00:04:22Z)
- Scalable computation of prediction intervals for neural networks via matrix sketching [79.44177623781043]
Existing algorithms for uncertainty estimation require modifying the model architecture and training procedure.
This work proposes a new algorithm that can be applied to a given trained neural network and produces approximate prediction intervals.
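The paper's matrix-sketching algorithm is not reproduced here; as a hedged point of reference, the snippet below shows a different but likewise post-hoc technique (split-conformal residual quantiles) that also wraps an already-trained model without modifying its architecture or training:

```python
import numpy as np

def conformal_interval(model, X_cal, y_cal, X_new, alpha=0.1):
    """Half-width = the (1 - alpha) quantile of held-out absolute residuals."""
    q = np.quantile(np.abs(y_cal - model(X_cal)), 1 - alpha)
    pred = model(X_new)
    return pred - q, pred + q

model = lambda X: X @ np.array([1.5, -0.5])     # stand-in for a trained network
rng = np.random.default_rng(1)
X_cal = rng.normal(size=(200, 2))
y_cal = model(X_cal) + rng.normal(scale=0.3, size=200)
lo, hi = conformal_interval(model, X_cal, y_cal, rng.normal(size=(5, 2)))
print(np.stack([lo, hi], axis=1))               # approx. 90% coverage intervals
```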
arXiv Detail & Related papers (2022-05-06T13:18:31Z)
- EQ-Net: A Unified Deep Learning Framework for Log-Likelihood Ratio Estimation and Quantization [25.484585922608193]
We introduce EQ-Net, the first holistic framework that solves both log-likelihood ratio (LLR) estimation and quantization with a data-driven method.
We carry out extensive experimental evaluation and demonstrate that our single architecture achieves state-of-the-art results on both tasks.
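For readers new to the two tasks, the snippet below shows textbook versions of both for a BPSK symbol over an AWGN channel (generic communications math, not the EQ-Net architecture, which learns them from data):

```python
import numpy as np

def bpsk_llr(y, noise_var):
    """Exact log-likelihood ratio of a BPSK bit received over AWGN."""
    return 2.0 * y / noise_var

def quantize_llr(llr, n_bits=3, clip=8.0):
    """Clip the LLR to [-clip, clip], then map it onto 2**n_bits uniform levels."""
    levels = 2 ** n_bits
    step = 2.0 * clip / (levels - 1)
    idx = np.clip(np.round((llr + clip) / step), 0, levels - 1)
    return idx * step - clip

y = np.array([0.9, -1.4, 0.2])                  # noisy received symbols
llrs = bpsk_llr(y, noise_var=0.5)
print(llrs, quantize_llr(llrs))
```

EQ-Net replaces both hand-derived steps with a single learned architecture.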
arXiv Detail & Related papers (2020-12-23T18:11:30Z)
- Combining Deep Learning and Optimization for Security-Constrained Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling automatic primary response (APR) within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z)