Towards Anytime Classification in Early-Exit Architectures by Enforcing
Conditional Monotonicity
- URL: http://arxiv.org/abs/2306.02652v2
- Date: Sun, 29 Oct 2023 18:35:01 GMT
- Title: Towards Anytime Classification in Early-Exit Architectures by Enforcing
Conditional Monotonicity
- Authors: Metod Jazbec, James Urquhart Allingham, Dan Zhang, Eric Nalisnick
- Abstract summary: Anytime algorithms are well-suited to environments in which computational budgets are dynamic.
We show that current early-exit networks are not directly applicable to anytime settings.
We propose an elegant post-hoc modification, based on the Product-of-Experts, that encourages an early-exit network to become gradually confident.
- Score: 5.425028186820756
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern predictive models are often deployed to environments in which
computational budgets are dynamic. Anytime algorithms are well-suited to such
environments as, at any point during computation, they can output a prediction
whose quality is a function of computation time. Early-exit neural networks
have garnered attention in the context of anytime computation due to their
capability to provide intermediate predictions at various stages throughout the
network. However, we demonstrate that current early-exit networks are not
directly applicable to anytime settings, as the quality of predictions for
individual data points is not guaranteed to improve with longer computation. To
address this shortcoming, we propose an elegant post-hoc modification, based on
the Product-of-Experts, that encourages an early-exit network to become
gradually confident. This gives our deep models the property of conditional
monotonicity in the prediction quality -- an essential stepping stone towards
truly anytime predictive modeling using early-exit architectures. Our empirical
results on standard image-classification tasks demonstrate that such behaviors
can be achieved while preserving competitive accuracy on average.
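As an illustration of the Product-of-Experts idea mentioned in the abstract, here is a minimal NumPy sketch, not the authors' formulation. It assumes each exit emits a vector of class logits and treats each exit's softmax as one expert; the prediction at exit m is the renormalised product of the first m experts. The function names and the example logits are hypothetical. A plain softmax product like this does not by itself guarantee monotone confidence, and the paper's exact construction is not reproduced here.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def product_of_experts(exit_logits):
    """Combine early-exit predictions with a cumulative product of experts.

    exit_logits: array of shape (num_exits, num_classes), one row per exit.
    Returns an array of the same shape; row m is the normalised product of
    the class distributions of exits 0..m (assumption: equal expert weights).
    """
    log_probs = log_softmax(np.asarray(exit_logits, dtype=float))
    # A product of probabilities is a sum of log-probabilities.
    cumulative = np.cumsum(log_probs, axis=0)
    # Renormalise each cumulative product into a distribution.
    return np.exp(log_softmax(cumulative))


# Hypothetical logits for three exits that all agree on class 0.
agreeing_exits = np.array([[2.0, 1.0, 0.0]] * 3)
anytime_probs = product_of_experts(agreeing_exits)
```

When the exits agree, each additional expert sharpens the combined distribution, so the probability assigned to the shared top class grows with depth. This is the kind of gradually increasing confidence the abstract describes.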
Related papers
- Physics-guided Active Sample Reweighting for Urban Flow Prediction [75.24539704456791]
Urban flow prediction is a spatio-temporal modeling task that estimates the throughput of transportation services like buses, taxis, and ride-sharing.
Some recent prediction solutions bring remedies with the notion of physics-guided machine learning (PGML).
We develop a discretized physics-guided network (PN), and propose a data-aware framework, Physics-guided Active Sample Reweighting (P-GASR).
arXiv Detail & Related papers (2024-07-18T15:44:23Z)
- Loss Shaping Constraints for Long-Term Time Series Forecasting [79.3533114027664]
We present a Constrained Learning approach for long-term time series forecasting that respects a user-defined upper bound on the loss at each time-step.
We propose a practical Primal-Dual algorithm to tackle it and demonstrate that it exhibits competitive average performance on time series benchmarks while shaping the errors across the predicted window.
arXiv Detail & Related papers (2024-02-14T18:20:44Z)
- An Adaptive Framework for Generalizing Network Traffic Prediction
towards Uncertain Environments [51.99765487172328]
We have developed a new framework using time-series analysis for dynamically assigning mobile network traffic prediction models.
Our framework employs learned behaviors, outperforming any single model with over a 50% improvement relative to current studies.
arXiv Detail & Related papers (2023-11-30T18:58:38Z)
- Non-Clairvoyant Scheduling with Predictions Revisited [77.86290991564829]
In non-clairvoyant scheduling, the task is to find an online strategy for scheduling jobs with a priori unknown processing requirements.
We revisit this well-studied problem in a recently popular learning-augmented setting that integrates (untrusted) predictions in algorithm design.
We show that these predictions have desired properties, admit a natural error measure as well as algorithms with strong performance guarantees.
arXiv Detail & Related papers (2022-02-21T13:18:11Z)
- Meta-Forecasting by combining Global Deep Representations with Local
Adaptation [12.747008878068314]
We introduce a novel forecasting method called Meta Global-Local Auto-Regression (Meta-GLAR).
It adapts to each time series by learning in closed-form the mapping from the representations produced by a recurrent neural network (RNN) to one-step-ahead forecasts.
Our method is competitive with the state-of-the-art in out-of-sample forecasting accuracy reported in earlier work.
arXiv Detail & Related papers (2021-11-05T11:45:02Z)
- Confidence Adaptive Anytime Pixel-Level Recognition [86.75784498879354]
Anytime inference requires a model to make a progression of predictions which might be halted at any time.
We propose the first unified and end-to-end approach for anytime pixel-level recognition.
arXiv Detail & Related papers (2021-04-01T20:01:57Z)
- Predicting Temporal Sets with Deep Neural Networks [50.53727580527024]
We propose an integrated solution based on deep neural networks for temporal sets prediction.
A unique perspective is to learn element relationships by constructing a set-level co-occurrence graph.
We design an attention-based module to adaptively learn the temporal dependency of elements and sets.
arXiv Detail & Related papers (2020-06-20T03:29:02Z)
- A machine learning approach for forecasting hierarchical time series [4.157415305926584]
We propose a machine learning approach for forecasting hierarchical time series.
Forecast reconciliation is the process of adjusting forecasts to make them coherent across the hierarchy.
We exploit the ability of a deep neural network to extract information capturing the structure of the hierarchy.
arXiv Detail & Related papers (2020-05-31T22:26:16Z)
- Predictive Business Process Monitoring via Generative Adversarial Nets:
The Case of Next Event Prediction [0.026249027950824504]
This paper proposes a novel adversarial training framework to address the problem of next event prediction.
It works by putting one neural network against the other in a two-player game which leads to predictions that are indistinguishable from the ground truth.
It systematically outperforms all baselines both in terms of accuracy and earliness of the prediction, despite using a simple network architecture and a naive feature encoding.
arXiv Detail & Related papers (2020-03-25T08:31:28Z)
- A clustering approach to time series forecasting using neural networks:
A comparative study on distance-based vs. feature-based clustering methods [1.256413718364189]
We propose various neural network architectures to forecast the time series data using the dynamic measurements.
We also investigate the importance of performing techniques such as anomaly detection and clustering on forecasting accuracy.
Our results indicate that clustering can improve the overall prediction time as well as improve the forecasting performance of the neural network.
arXiv Detail & Related papers (2020-01-27T00:31:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.