Towards Fundamentally Scalable Model Selection: Asymptotically Fast Update and Selection
- URL: http://arxiv.org/abs/2406.07536v1
- Date: Tue, 11 Jun 2024 17:57:49 GMT
- Title: Towards Fundamentally Scalable Model Selection: Asymptotically Fast Update and Selection
- Authors: Wenxiao Wang, Weiming Zhuang, Lingjuan Lyu
- Abstract summary: An ideal model selection scheme should support two operations efficiently over a large pool of candidate models.
Previous solutions to model selection require high computational complexity for at least one of these two operations.
We present Standardized Embedder, an empirical realization of isolated model embedding.
- Score: 40.85209520973634
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The advancement of deep learning technologies is bringing new models every day, motivating the study of scalable model selection. An ideal model selection scheme should minimally support two operations efficiently over a large pool of candidate models: update, which involves either adding a new candidate model or removing an existing candidate model, and selection, which involves locating highly performing models for a given task. However, previous solutions to model selection require high computational complexity for at least one of these two operations. In this work, we target fundamentally (more) scalable model selection that supports asymptotically fast update and asymptotically fast selection at the same time. Firstly, we define isolated model embedding, a family of model selection schemes supporting asymptotically fast update and selection: With respect to the number of candidate models $m$, the update complexity is $O(1)$ and the selection consists of a single sweep over $m$ vectors in addition to $O(1)$ model operations. Isolated model embedding also implies several desirable properties for applications. Secondly, we present Standardized Embedder, an empirical realization of isolated model embedding. We assess its effectiveness by using it to select representations from a pool of 100 pre-trained vision models for classification tasks and measuring the performance gaps between the selected models and the best candidates with a linear probing protocol. Experiments suggest our realization is effective in selecting models with competitive performances and highlight isolated model embedding as a promising direction towards model selection that is fundamentally (more) scalable.
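To make the complexity claims concrete, below is a minimal sketch of the isolated-model-embedding interface, assuming each model has already been mapped to a fixed-dimensional vector (in the paper, Standardized Embedder plays this role) and using cosine similarity as an illustrative scoring rule; the class and method names are hypothetical.

```python
import numpy as np

class IsolatedModelEmbedding:
    """Candidate pool stored as name -> embedding vector.

    Updates touch only the affected model, so their cost is O(1) in the
    pool size m; selection is a single sweep over the m stored vectors.
    """

    def __init__(self):
        self.embeddings: dict[str, np.ndarray] = {}

    def add(self, name: str, model_embedding: np.ndarray) -> None:
        # O(1) in m: embedding a model never inspects the other candidates.
        self.embeddings[name] = model_embedding

    def remove(self, name: str) -> None:
        # O(1) in m.
        self.embeddings.pop(name, None)

    def select(self, task_embedding: np.ndarray, top_k: int = 5) -> list[str]:
        # One sweep over m vectors: score every candidate against the task
        # embedding, then return the names of the top-k scorers.
        def cosine(u: np.ndarray, v: np.ndarray) -> float:
            return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

        scores = {name: cosine(emb, task_embedding)
                  for name, emb in self.embeddings.items()}
        return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

Because `add` and `remove` never look at the rest of the pool, the update cost is independent of $m$, which is the isolation property the paper's definition hinges on.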
Related papers
- Stabilizing black-box model selection with the inflated argmax [8.52745154080651]
This paper presents a new approach to stabilizing model selection that leverages a combination of bagging and an "inflated" argmax operation.
Our method selects a small collection of models that all fit the data, and it is stable in that, with high probability, the removal of any training point will result in a collection of selected models that overlaps with the original collection.
Across the evaluated settings, the proposed method yields stable and compact collections of selected models, outperforming a variety of benchmarks.
arXiv Detail & Related papers (2024-10-23T20:39:07Z)
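As a rough illustration of the entry above, here is a hedged sketch combining bagging with an inflated argmax; the slack parameter `eps`, the bootstrap loop, and the `fit_and_score` callback are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def inflated_argmax(scores: np.ndarray, eps: float) -> list[int]:
    """Return indices of all models scoring within eps of the best.

    Relaxing argmax to a set makes selection stable: perturbing the
    scores slightly can only slide models across the eps boundary,
    rather than swap out a unique winner.
    """
    best = scores.max()
    return [i for i, s in enumerate(scores) if s >= best - eps]

def bagged_scores(fit_and_score, data: np.ndarray, n_models: int,
                  n_bags: int = 20, seed: int = 0) -> np.ndarray:
    """Average each candidate model's score over bootstrap resamples."""
    rng = np.random.default_rng(seed)
    totals = np.zeros(n_models)
    for _ in range(n_bags):
        resample = data[rng.integers(0, len(data), size=len(data))]
        totals += fit_and_score(resample)  # returns a length-n_models vector
    return totals / n_bags
```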
- All models are wrong, some are useful: Model Selection with Limited Labels [49.62984196182567]
We introduce MODEL SELECTOR, a framework for label-efficient selection of pretrained classifiers.
We show that MODEL SELECTOR drastically reduces the need for labeled data while consistently picking the best or near-best performing model.
Our results further highlight the robustness of MODEL SELECTOR in model selection, as it reduces the labeling cost by up to 72.41% when selecting a near-best model.
arXiv Detail & Related papers (2024-10-17T14:45:56Z)
- Enabling Small Models for Zero-Shot Classification through Model Label Learning [50.68074833512999]
We introduce a novel paradigm, Model Label Learning (MLL), which bridges the gap between models and their functionalities.
Experiments on seven real-world datasets validate the effectiveness and efficiency of MLL.
arXiv Detail & Related papers (2024-08-21T09:08:26Z)
- A Two-Phase Recall-and-Select Framework for Fast Model Selection [13.385915962994806]
We propose a two-phase (coarse-recall and fine-selection) model selection framework.
It aims to enhance the efficiency of selecting a robust model by leveraging the models' training performances on benchmark datasets.
The proposed method is demonstrated to select a high-performing model about 3x faster than conventional baseline methods.
arXiv Detail & Related papers (2024-03-28T14:44:44Z)
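The coarse-recall / fine-selection split described in this entry reduces to a few lines; in the hedged sketch below, `benchmark_score` (a precomputed lookup of each model's training performance on benchmark datasets) and `evaluate_on_task` are assumed callbacks, and `top_k` caps how many models reach the expensive phase.

```python
def two_phase_select(candidates, benchmark_score, evaluate_on_task, top_k=10):
    """Two-phase model selection: cheap coarse recall, then fine selection.

    Phase 1 ranks the full pool by precomputed benchmark performance and
    keeps only the top-k; phase 2 evaluates just those k candidates on
    the target task and returns the best one.
    """
    recalled = sorted(candidates, key=benchmark_score, reverse=True)[:top_k]
    return max(recalled, key=evaluate_on_task)
```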
- Budgeted Online Model Selection and Fine-Tuning via Federated Learning [26.823435733330705]
Online model selection involves selecting a model from a set of candidate models 'on the fly' to perform prediction on a stream of data.
The choice of candidate models therefore has a crucial impact on performance.
The present paper proposes an online federated model selection framework where a group of learners (clients) interacts with a server with sufficient memory.
Using the proposed algorithm, clients and the server collaborate to fine-tune models to adapt them to a non-stationary environment.
arXiv Detail & Related papers (2024-01-19T04:02:49Z)
- MILO: Model-Agnostic Subset Selection Framework for Efficient Model Training and Tuning [68.12870241637636]
We propose MILO, a model-agnostic subset selection framework that decouples the subset selection from model training.
Our empirical results indicate that MILO can train models $3\times$-$10\times$ faster and tune hyperparameters $20\times$-$75\times$ faster than full-dataset training or tuning, without compromising performance.
arXiv Detail & Related papers (2023-01-30T20:59:30Z)
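The summary above does not spell out MILO's selection objective, so the sketch below substitutes greedy k-center coverage over precomputed features as a generic model-agnostic selector; the point it illustrates is the decoupling itself: the subset depends only on the features, so one selection pass can be reused across many training and hyperparameter-tuning runs.

```python
import numpy as np

def select_subset_once(features: np.ndarray, budget: int, seed: int = 0) -> np.ndarray:
    """Model-agnostic subset selection via greedy k-center coverage.

    Repeatedly picks the example farthest from the current subset, so the
    chosen points cover the dataset. Nothing depends on the model that
    will later be trained, which is what allows the subset to be computed
    once and reused.
    """
    rng = np.random.default_rng(seed)
    chosen = [int(rng.integers(len(features)))]
    dists = np.linalg.norm(features - features[chosen[0]], axis=1)
    for _ in range(budget - 1):
        nxt = int(dists.argmax())  # farthest point from the current subset
        chosen.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(features - features[nxt], axis=1))
    return np.asarray(chosen)
```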
- Investigating Ensemble Methods for Model Robustness Improvement of Text Classifiers [66.36045164286854]
We analyze a set of existing bias features and demonstrate there is no single model that works best in all cases.
By choosing an appropriate bias model, we can obtain a better robustness result than baselines with a more sophisticated model design.
arXiv Detail & Related papers (2022-10-28T17:52:10Z)
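The entry above leaves the ensembling mechanism implicit; one standard way to pair a bias model with a main text classifier is product-of-experts debiasing, sketched here as an assumed example rather than the paper's specific method.

```python
import torch.nn.functional as F
from torch import Tensor

def poe_debiased_loss(main_logits: Tensor, bias_logits: Tensor, labels: Tensor) -> Tensor:
    """Product-of-experts debiasing loss.

    The main model is trained through the combined distribution
    softmax(main_logits + bias_logits): examples the frozen bias model
    already classifies confidently contribute little gradient, pushing
    the main model toward robust, non-shortcut features.
    """
    combined = main_logits + bias_logits.detach()  # bias model stays frozen
    return F.cross_entropy(combined, labels)
```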
- A linearized framework and a new benchmark for model selection for fine-tuning [112.20527122513668]
Fine-tuning from a collection of models pre-trained on different domains is emerging as a technique to improve test accuracy in the low-data regime.
We introduce two new baselines for model selection -- Label-Gradient and Label-Feature Correlation.
Our benchmark highlights the accuracy gains of selecting from a model zoo compared to fine-tuning ImageNet-pretrained models.
arXiv Detail & Related papers (2021-01-29T21:57:15Z)
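As a rough sketch in the spirit of the Label-Feature Correlation baseline named above, the code below scores a candidate by the entrywise correlation between its feature-similarity matrix and the label-similarity matrix on target data; the paper's exact definition may differ, so treat this as an assumption.

```python
import numpy as np

def label_feature_correlation(features: np.ndarray, labels: np.ndarray) -> float:
    """Correlation between feature and label similarity matrices.

    features: (n, d) frozen embeddings of target-task examples under one
    candidate pre-trained model; labels: (n,) integer class labels.
    A high value means same-label examples already look alike in feature
    space, suggesting the model is a good starting point for fine-tuning.
    """
    K = features @ features.T                               # feature similarity
    Y = (labels[:, None] == labels[None, :]).astype(float)  # label similarity
    K, Y = K - K.mean(), Y - Y.mean()                       # center the entries
    return float((K * Y).sum() / (np.linalg.norm(K) * np.linalg.norm(Y) + 1e-12))
```

Ranking a model zoo by such a score needs only one forward pass per model on the target data, which is what keeps these baselines cheap relative to fine-tuning every candidate.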
This list is automatically generated from the titles and abstracts of the papers on this site.