EERO: Early Exit with Reject Option for Efficient Classification with limited budget
- URL: http://arxiv.org/abs/2402.03779v2
- Date: Thu, 06 Nov 2025 09:28:26 GMT
- Title: EERO: Early Exit with Reject Option for Efficient Classification with limited budget
- Authors: Florian Valade, Mohamed Hebiri, Paul Gay,
- Abstract summary: We propose EERO, a new methodology to translate the problem of early exiting to a problem of using multiple classifiers with reject option.<n>We calibrate the probabilities of exiting at the different heads using aggregation with exponential weights to guarantee a fixed budget.<n> Experimental results, conducted using a ResNet-18 model and a ConvNext architecture on Cifar and ImageNet datasets, demonstrate that our method not only effectively manages budget allocation but also enhances accuracy in overthinking scenarios.
- Score: 2.504298819189614
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The increasing complexity of advanced machine learning models requires innovative approaches to manage computational resources effectively. One such method is the Early Exit strategy, which allows for adaptive computation by providing a mechanism to shorten the processing path for simpler data instances. In this paper, we propose EERO, a new methodology to translate the problem of early exiting to a problem of using multiple classifiers with reject option in order to better select the exiting head for each instance. We calibrate the probabilities of exiting at the different heads using aggregation with exponential weights to guarantee a fixed budget .We consider factors such as Bayesian risk, budget constraints, and head-specific budget consumption. Experimental results, conducted using a ResNet-18 model and a ConvNext architecture on Cifar and ImageNet datasets, demonstrate that our method not only effectively manages budget allocation but also enhances accuracy in overthinking scenarios.
Related papers
- ODAR: Principled Adaptive Routing for LLM Reasoning via Active Inference [60.958331943869126]
ODAR-Expert is an adaptive routing framework that optimize the accuracy-efficiency trade-off via principled resource allocation.<n>We show strong and consistent gains, including 98.2% accuracy on MATH and 54.8% on Humanity's Last Exam.
arXiv Detail & Related papers (2026-02-27T05:22:01Z) - Conformal Thinking: Risk Control for Reasoning on a Compute Budget [60.65072883773352]
Reasoning Large Language Models (LLMs) enable test-time scaling, with dataset-level accuracy improving as the token budget increases.<n>We re-frame the budget setting problem as risk control, limiting the error rate while minimizing compute.<n>Our framework introduces an upper threshold that stops reasoning when the model is confident and a novel lower threshold that preemptively stops unsolvable instances.
arXiv Detail & Related papers (2026-02-03T18:17:22Z) - Labels or Preferences? Budget-Constrained Learning with Human Judgments over AI-Generated Outputs [17.028710603629026]
We show how to optimally allocate a fixed annotation budget between ground-truth labels and pairwise preferences in AI.<n>We introduce Preference-Calibrated Active Learning (PCAL), a novel robustness method that learns optimal data acquisition strategy.<n>This work provides a principled and statistically efficient approach for budget-constrained learning in modern AI.
arXiv Detail & Related papers (2026-01-19T23:23:29Z) - When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoning [90.5036809670993]
Scaling test-time compute has emerged as a key strategy for enhancing the reasoning capabilities of large language models.
Recent advancements in Generative Reward Models (GenRM) reframe verification as a next-token prediction task.
We evaluate GenRM against Self-Consistency (SC) for most practical inference budgets across diverse models and datasets.
arXiv Detail & Related papers (2025-04-01T17:41:57Z) - Towards An Unsupervised Learning Scheme for Efficiently Solving Parameterized Mixed-Integer Programs [6.1860817947800655]
We train an autoencoder for binary variables in an unsupervised learning fashion.
We present a strategy to construct a class of cutting plane constraints from the decoder parameters of an offline-trained AE.
Their integration into the primal MIP problem leads to a tightened MIP with the reduced feasible region.
arXiv Detail & Related papers (2024-12-23T14:48:32Z) - New Additive OCBA Procedures for Robust Ranking and Selection [0.9558392439655016]
We develop new fixed-budget robust R&S procedures to minimize the probability of incorrect selection under a limited sampling budget.
We then conduct a comprehensive numerical study to verify the superiority of our robust OCBA procedure over existing ones.
arXiv Detail & Related papers (2024-12-08T18:42:07Z) - Minimax and Communication-Efficient Distributed Best Subset Selection with Oracle Property [0.358439716487063]
The explosion of large-scale data has outstripped the processing capabilities of single-machine systems.
Traditional approaches to distributed inference often struggle with achieving true sparsity in high-dimensional datasets.
We propose a novel, two-stage, distributed best subset selection algorithm to address these issues.
arXiv Detail & Related papers (2024-08-30T13:22:08Z) - Efficient Model-Free Exploration in Low-Rank MDPs [76.87340323826945]
Low-Rank Markov Decision Processes offer a simple, yet expressive framework for RL with function approximation.
Existing algorithms are either (1) computationally intractable, or (2) reliant upon restrictive statistical assumptions.
We propose the first provably sample-efficient algorithm for exploration in Low-Rank MDPs.
arXiv Detail & Related papers (2023-07-08T15:41:48Z) - High-dimensional Contextual Bandit Problem without Sparsity [8.782204980889077]
We propose an explore-then-commit (EtC) algorithm to address this problem and examine its performance.
We derive the optimal rate of the ETC algorithm in terms of $T$ and show that this rate can be achieved by balancing exploration and exploitation.
We introduce an adaptive explore-then-commit (AEtC) algorithm that adaptively finds the optimal balance.
arXiv Detail & Related papers (2023-06-19T15:29:32Z) - Balancing Risk and Reward: An Automated Phased Release Strategy [2.1485350418225244]
Phased releases are a common strategy in the technology industry for gradually releasing new products or updates through a sequence of A/B tests.
We propose an algorithm that automatically determines the release percentage at each stage in the schedule, balancing the need to control risk while maximizing ramp-up speed.
arXiv Detail & Related papers (2023-05-16T17:27:34Z) - Stochastic Methods for AUC Optimization subject to AUC-based Fairness
Constraints [51.12047280149546]
A direct approach for obtaining a fair predictive model is to train the model through optimizing its prediction performance subject to fairness constraints.
We formulate the training problem of a fairness-aware machine learning model as an AUC optimization problem subject to a class of AUC-based fairness constraints.
We demonstrate the effectiveness of our approach on real-world data under different fairness metrics.
arXiv Detail & Related papers (2022-12-23T22:29:08Z) - Exploiting Temporal Structures of Cyclostationary Signals for
Data-Driven Single-Channel Source Separation [98.95383921866096]
We study the problem of single-channel source separation (SCSS)
We focus on cyclostationary signals, which are particularly suitable in a variety of application domains.
We propose a deep learning approach using a U-Net architecture, which is competitive with the minimum MSE estimator.
arXiv Detail & Related papers (2022-08-22T14:04:56Z) - Budgeted Classification with Rejection: An Evolutionary Method with
Multiple Objectives [0.0]
Budgeted, sequential classifiers (BSCs) process inputs through a sequence of partial feature acquisition and evaluation steps.
This allows for an efficient evaluation of inputs that prevents unneeded feature acquisition.
We propose a problem-specific genetic algorithm to build budgeted, sequential classifiers with confidence-based reject options.
arXiv Detail & Related papers (2022-05-01T22:05:16Z) - Error-based Knockoffs Inference for Controlled Feature Selection [49.99321384855201]
We propose an error-based knockoff inference method by integrating the knockoff features, the error-based feature importance statistics, and the stepdown procedure together.
The proposed inference procedure does not require specifying a regression model and can handle feature selection with theoretical guarantees.
arXiv Detail & Related papers (2022-03-09T01:55:59Z) - Modeling the Second Player in Distributionally Robust Optimization [90.25995710696425]
We argue for the use of neural generative models to characterize the worst-case distribution.
This approach poses a number of implementation and optimization challenges.
We find that the proposed approach yields models that are more robust than comparable baselines.
arXiv Detail & Related papers (2021-03-18T14:26:26Z) - Combining Deep Learning and Optimization for Security-Constrained
Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling of APR within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.