SHiFT: An Efficient, Flexible Search Engine for Transfer Learning
- URL: http://arxiv.org/abs/2204.01457v1
- Date: Mon, 4 Apr 2022 13:16:46 GMT
- Title: SHiFT: An Efficient, Flexible Search Engine for Transfer Learning
- Authors: Cedric Renggli, Xiaozhe Yao, Luka Kolar, Luka Rimanic, Ana Klimovic,
Ce Zhang
- Abstract summary: Transfer learning can be seen as a data- and compute-efficient alternative to training models from scratch.
We propose SHiFT, the first downstream task-aware, flexible, and efficient model search engine for transfer learning.
- Score: 16.289623977712086
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transfer learning can be seen as a data- and compute-efficient alternative to
training models from scratch. The emergence of rich model repositories, such as
TensorFlow Hub, enables practitioners and researchers to unleash the potential
of these models across a wide range of downstream tasks. As these repositories
keep growing exponentially, efficiently selecting a good model for the task at
hand becomes paramount. By carefully comparing various selection and search
strategies, we realize that no single method outperforms the others, and hybrid
or mixed strategies can be beneficial. Therefore, we propose SHiFT, the first
downstream task-aware, flexible, and efficient model search engine for transfer
learning. These properties are enabled by a custom query language SHiFT-QL
together with a cost-based decision maker, which we empirically validate.
Motivated by the iterative nature of machine learning development, we further
support efficient incremental executions of our queries, which requires a
careful implementation when jointly used with our optimizations.
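The abstract argues that neither task-agnostic nor task-aware selection dominates, and that hybrid strategies can help. As a rough illustration only (not the actual SHiFT or SHiFT-QL implementation), the Python sketch below combines a cheap metadata-based pre-filter with a linear-probing proxy on the downstream data; all names, scoring choices, and thresholds are assumptions made for the example.

```python
# Minimal sketch of a hybrid model-search strategy in the spirit of the
# abstract: a cheap task-agnostic filter narrows the candidate pool, then a
# task-aware proxy (linear probing on frozen features) ranks the survivors.
# Everything here is illustrative; it is not the SHiFT engine itself.
from dataclasses import dataclass
from typing import Callable, Sequence

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


@dataclass
class Candidate:
    name: str
    upstream_accuracy: float                       # task-agnostic metadata
    featurize: Callable[[np.ndarray], np.ndarray]  # frozen feature extractor


def hybrid_search(candidates: Sequence[Candidate],
                  X: np.ndarray, y: np.ndarray,
                  keep_top_k: int = 3) -> str:
    """Return the name of the best candidate under the hybrid strategy."""
    # Stage 1 (task-agnostic): keep the k models with the best upstream
    # accuracy; this costs nothing on the downstream data.
    shortlist = sorted(candidates,
                       key=lambda c: c.upstream_accuracy,
                       reverse=True)[:keep_top_k]

    # Stage 2 (task-aware): linear probe on frozen features as a cheap
    # proxy for how well each model would transfer to this task.
    def probe_score(c: Candidate) -> float:
        feats = c.featurize(X)
        clf = LogisticRegression(max_iter=1000)
        return cross_val_score(clf, feats, y, cv=3).mean()

    return max(shortlist, key=probe_score).name
```

In SHiFT itself, the choice and ordering of such strategies is not hard-coded as above but expressed through SHiFT-QL queries and resolved by the cost-based decision maker.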
Related papers
- Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities.
In-Context Learning (ICL) and Parameter-Efficient Fine-Tuning (PEFT) are currently two mainstream methods for adapting LLMs to downstream tasks.
We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z) - FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models [50.331708897857574]
We introduce FactorLLM, a novel approach that decomposes well-trained dense FFNs into sparse sub-networks without requiring any further modifications.
FactorLLM achieves performance comparable to the source model, retaining up to 85% of its performance while obtaining over a 30% increase in inference speed.
arXiv Detail & Related papers (2024-08-15T16:45:16Z) - Green Runner: A tool for efficient deep learning component selection [0.76146285961466]
We present Green Runner, a novel tool to automatically select and evaluate models based on the application scenario provided in natural language.
Green Runner features a resource-efficient experimentation engine that integrates constraints and trade-offs based on the problem into the model selection process.
arXiv Detail & Related papers (2024-01-29T00:15:50Z) - Learning to Maximize Mutual Information for Dynamic Feature Selection [13.821253491768168]
We consider the dynamic feature selection (DFS) problem where a model sequentially queries features based on the presently available information.
We explore a simpler approach of greedily selecting features based on their conditional mutual information.
The proposed method is shown to recover the greedy policy when trained to optimality, and it outperforms numerous existing feature selection methods in our experiments.
arXiv Detail & Related papers (2023-01-02T08:31:56Z) - HyperImpute: Generalized Iterative Imputation with Automatic Model
Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z) - Practical Active Learning with Model Selection for Small Data [13.128648437690224]
We develop a simple and fast method for practical active learning with model selection.
Our method is based on an underlying pool-based active learner for binary classification using support vector classification with a radial basis function kernel.
arXiv Detail & Related papers (2021-12-21T23:11:27Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - Efficient Data-specific Model Search for Collaborative Filtering [56.60519991956558]
Collaborative filtering (CF) is a fundamental approach for recommender systems.
In this paper, motivated by the recent advances in automated machine learning (AutoML), we propose to design a data-specific CF model.
Key here is a new framework that unifies state-of-the-art (SOTA) CF methods and splits them into disjoint stages of input encoding, embedding function, interaction and prediction function.
arXiv Detail & Related papers (2021-06-14T14:30:32Z) - Learning Discrete Energy-based Models via Auxiliary-variable Local
Exploration [130.89746032163106]
We propose ALOE, a new algorithm for learning conditional and unconditional EBMs for discrete structured data.
We show that the energy function and sampler can be trained efficiently via a new variational form of power iteration.
We present an energy-model-guided fuzzer for software testing that achieves comparable performance to well-engineered fuzzing engines like libFuzzer.
arXiv Detail & Related papers (2020-11-10T19:31:29Z) - Which Model to Transfer? Finding the Needle in the Growing Haystack [27.660318887140203]
We provide a formalization of this problem through a familiar notion of regret.
We show that both task-agnostic and task-aware methods can yield high regret.
We then propose a simple and efficient hybrid search strategy which outperforms the existing approaches.
arXiv Detail & Related papers (2020-10-13T14:00:22Z)