You Only Look Once at Anytime (AnytimeYOLO): Analysis and Optimization of Early-Exits for Object-Detection
- URL: http://arxiv.org/abs/2503.17497v1
- Date: Fri, 21 Mar 2025 19:16:38 GMT
- Title: You Only Look Once at Anytime (AnytimeYOLO): Analysis and Optimization of Early-Exits for Object-Detection
- Authors: Daniel Kuhse, Harun Teper, Sebastian Buschjäger, Chien-Yao Wang, Jian-Jia Chen,
- Abstract summary: We introduce AnytimeYOLO, a family of variants of the YOLO architecture that enables anytime object detection. Our AnytimeYOLO networks allow for interruptible inference, i.e., they provide a prediction at any point in time, a property desirable for safety-critical real-time applications.
- Score: 7.980812989961303
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce AnytimeYOLO, a family of variants of the YOLO architecture that enables anytime object detection. Our AnytimeYOLO networks allow for interruptible inference, i.e., they provide a prediction at any point in time, a property desirable for safety-critical real-time applications. We present structured explorations to modify the YOLO architecture, enabling early termination to obtain intermediate results. We focus on providing fine-grained control through high granularity of available termination points. First, we formalize Anytime Models as a special class of prediction models that offer anytime predictions. Then, we discuss a novel transposed variant of the YOLO architecture that changes the architecture to enable better early predictions and greater freedom for the order of processing stages. Finally, we propose two optimization algorithms that, given an anytime model, can be used to determine the optimal exit execution order and the optimal subset of early-exits to select for deployment in low-resource environments. We evaluate the anytime performance and trade-offs of design choices, proposing a new anytime quality metric for this purpose. In particular, we also discuss key challenges for anytime inference that currently make its deployment costly.
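To make the interruptible-inference idea concrete, below is a minimal sketch of an anytime, early-exit detection loop: the network is split into sequential stages, each followed by a lightweight detection head, and inference can be cut off at a deadline while still returning the deepest intermediate prediction reached. This is an illustrative sketch under assumed names (AnytimeDetector, forward_anytime, the stage/head modules); it is not the paper's actual implementation or API.

```python
# Hypothetical sketch of anytime (interruptible) early-exit inference.
# All names here are illustrative, not the paper's API.
import time

import torch
import torch.nn as nn


class AnytimeDetector(nn.Module):
    def __init__(self, stages, exit_heads):
        super().__init__()
        self.stages = nn.ModuleList(stages)          # backbone/neck blocks in execution order
        self.exit_heads = nn.ModuleList(exit_heads)  # one lightweight prediction head per exit

    @torch.no_grad()
    def forward_anytime(self, x, deadline_s):
        """Run stages until the deadline; always return the latest available prediction."""
        start = time.monotonic()
        latest = None
        for stage, head in zip(self.stages, self.exit_heads):
            x = stage(x)
            latest = head(x)                         # intermediate (early-exit) prediction
            if time.monotonic() - start >= deadline_s:
                break                                # interrupted: fall back to the last exit
        return latest


if __name__ == "__main__":
    # Stand-in stages and heads, purely for demonstration.
    stages = [nn.Conv2d(3, 8, 3, padding=1), nn.Conv2d(8, 8, 3, padding=1)]
    heads = [nn.Conv2d(8, 5, 1), nn.Conv2d(8, 5, 1)]
    model = AnytimeDetector(stages, heads)
    out = model.forward_anytime(torch.randn(1, 3, 64, 64), deadline_s=0.01)
    print(None if out is None else out.shape)
```

A later interruption simply returns the prediction of the deepest exit reached so far; which exits to deploy and in which order to execute stages is exactly what the paper's two optimization algorithms decide, and is not modeled in this sketch.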
Related papers
- Decision-Focused Fine-Tuning of Time Series Foundation Models for Dispatchable Feeder Optimization [0.5808168734833972]
We use decision-focused fine-tuning within time series foundation models to offer a scalable and efficient solution for the dispatchable feeder optimization problem. To obtain more robust predictions for scarce building data, we use Moirai as a state-of-the-art foundation model. Comparing the decision-focused fine-tuned Moirai with a state-of-the-art classical prediction-focused fine-tuned Moirai, we observe an improvement of 9.45% in average total daily costs.
arXiv Detail & Related papers (2025-03-03T07:47:20Z)
- Automatic selection of the best neural architecture for time series forecasting via multi-objective optimization and Pareto optimality conditions [1.4843690728082002]
Time series forecasting plays a pivotal role in a wide range of applications, including weather prediction, healthcare, structural health monitoring, predictive maintenance, energy systems, and financial markets. While models such as LSTM, GRU, Transformers, and State-Space Models (SSMs) have become standard tools in this domain, selecting the optimal architecture remains a challenge. We introduce a flexible automated framework for time series forecasting that integrates LSTM, GRU, multi-head Attention, and SSM blocks.
arXiv Detail & Related papers (2025-01-21T15:33:55Z)
- Revisiting SMoE Language Models by Evaluating Inefficiencies with Task Specific Expert Pruning [78.72226641279863]
Sparse Mixture of Experts (SMoE) models have emerged as a scalable alternative to dense models in language modeling.
Our research explores task-specific model pruning to inform decisions about designing SMoE architectures.
We introduce UNCURL, an adaptive task-aware pruning technique that reduces the number of experts per MoE layer in an offline manner post-training.
arXiv Detail & Related papers (2024-09-02T22:35:03Z)
- UniST: A Prompt-Empowered Universal Model for Urban Spatio-Temporal Prediction [26.69233687863233]
Urban spatio-temporal prediction is crucial for informed decision-making in areas such as traffic management, resource optimization, and emergency response.
We introduce UniST, a universal model designed for general urban spatio-temporal prediction across a wide range of scenarios, empowered by large language models.
arXiv Detail & Related papers (2024-02-19T05:04:11Z)
- Embedded feature selection in LSTM networks with multi-objective evolutionary ensemble learning for time series forecasting [49.1574468325115]
We present a novel feature selection method embedded in Long Short-Term Memory networks.
Our approach optimizes the weights and biases of the LSTM in a partitioned manner.
Experimental evaluations on air quality time series data from Italy and southeast Spain demonstrate that our method substantially improves the generalization ability of conventional LSTMs.
arXiv Detail & Related papers (2023-12-29T08:42:10Z)
- An Adaptive Framework for Generalizing Network Traffic Prediction towards Uncertain Environments [51.99765487172328]
We have developed a new framework using time-series analysis for dynamically assigning mobile network traffic prediction models.
Our framework employs learned behaviors and outperforms any single model, with over a 50% improvement relative to current studies.
arXiv Detail & Related papers (2023-11-30T18:58:38Z)
- Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting [54.04430089029033]
We present Lag-Llama, a general-purpose foundation model for time series forecasting based on a decoder-only transformer architecture.
Lag-Llama is pretrained on a large corpus of diverse time series data from several domains, and demonstrates strong zero-shot generalization capabilities.
When fine-tuned on relatively small fractions of such previously unseen datasets, Lag-Llama achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-10-12T12:29:32Z)
- A Novel Long-term Iterative Mining Scheme for Video Salient Object Detection [54.53335983750033]
The short-term methodology conflicts with the real mechanism of the human visual system.
This paper proposes a novel video salient object detection (VSOD) approach that performs VSOD in a completely long-term manner.
The proposed approach outperforms almost all SOTA models on five widely used benchmark datasets.
arXiv Detail & Related papers (2022-06-20T04:27:47Z)
- Multi-objective Asynchronous Successive Halving [10.632606255280649]
We propose algorithms that extend asynchronous successive halving (ASHA) to the multi-objective (MO) setting.
Our empirical analysis shows that MO ASHA enables MO HPO to be performed at scale.
Our algorithms establish new baselines for future research in the area.
arXiv Detail & Related papers (2021-06-23T19:39:31Z)
- The Perils of Learning Before Optimizing [16.97597806975415]
We show how prediction models can be learned end-to-end by differentiating through the optimization task.
We show that the performance gap between a two-stage and an end-to-end approach is closely related to the "price of correlation" concept in optimization.
arXiv Detail & Related papers (2021-06-18T20:43:47Z)
- Long-Horizon Visual Planning with Goal-Conditioned Hierarchical Predictors [124.30562402952319]
The ability to predict and plan into the future is fundamental for agents acting in the world.
Current learning approaches for visual prediction and planning fail on long-horizon tasks.
We propose a framework for visual prediction and planning that is able to overcome both of these limitations.
arXiv Detail & Related papers (2020-06-23T17:58:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.