Fast, Accurate, and Simple Models for Tabular Data via Augmented
Distillation
- URL: http://arxiv.org/abs/2006.14284v1
- Date: Thu, 25 Jun 2020 09:57:47 GMT
- Title: Fast, Accurate, and Simple Models for Tabular Data via Augmented
Distillation
- Authors: Rasool Fakoor, Jonas Mueller, Nick Erickson, Pratik Chaudhari,
Alexander J. Smola
- Abstract summary: We propose FAST-DAD to distill arbitrarily complex ensemble predictors into individual models like boosted trees, random forests, and deep networks.
Our individual distilled models are over 10x faster and more accurate than ensemble predictors produced by AutoML tools like H2O/AutoSklearn.
- Score: 97.42894942391575
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automated machine learning (AutoML) can produce complex model ensembles by
stacking, bagging, and boosting many individual models like trees, deep
networks, and nearest neighbor estimators. While highly accurate, the resulting
predictors are large, slow, and opaque as compared to their constituents. To
improve the deployment of AutoML on tabular data, we propose FAST-DAD to
distill arbitrarily complex ensemble predictors into individual models like
boosted trees, random forests, and deep networks. At the heart of our approach
is a data augmentation strategy based on Gibbs sampling from a self-attention
pseudolikelihood estimator. Across 30 datasets spanning regression and
binary/multiclass classification tasks, FAST-DAD distillation produces
significantly better individual models than one obtains through standard
training on the original data. Our individual distilled models are over 10x
faster and more accurate than ensemble predictors produced by AutoML tools like
H2O/AutoSklearn.
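The core recipe described in the abstract, augment the data, pseudo-label the augmented rows with the teacher ensemble, then fit a single student on the enlarged set, can be sketched in a few lines. The sketch below is a hypothetical, heavily simplified stand-in: the teacher is a small bagged ensemble of random-feature regressors rather than an AutoML stack, and the Gibbs-style augmentation resamples each feature from its empirical marginal instead of conditioning on the remaining features via the paper's self-attention pseudolikelihood model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy tabular regression data (hypothetical stand-in for a real dataset).
n, d = 400, 5
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.5 * np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)

def fit_linear(X, y):
    """Least-squares fit with a bias column."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w

def predict_linear(w, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return Xb @ w

# "Teacher": a bagged ensemble fit on bootstrap resamples, standing in for
# a stacked AutoML ensemble. Random tanh features give members diversity.
teacher = []
for _ in range(20):
    idx = rng.integers(0, n, size=n)
    Xi = X[idx]
    W = rng.normal(size=(d, 8))
    w = fit_linear(np.hstack([Xi, np.tanh(Xi @ W)]), y[idx])
    teacher.append((W, w))

def teacher_predict(X):
    preds = [predict_linear(w, np.hstack([X, np.tanh(X @ W)]))
             for W, w in teacher]
    return np.mean(preds, axis=0)

def gibbs_augment(X, sweeps=2):
    """Crude Gibbs-style augmentation: starting from real rows, resample one
    coordinate at a time. FAST-DAD draws each coordinate conditioned on the
    others via a self-attention pseudolikelihood estimator; as a simple
    stand-in we resample each coordinate from its empirical marginal."""
    Xa = X.copy()
    n, d = Xa.shape
    for _ in range(sweeps):
        for j in range(d):
            Xa[:, j] = rng.choice(X[:, j], size=n)
    return Xa

# Augment, pseudo-label with the teacher, and distill into a single student.
X_aug = np.vstack([X, gibbs_augment(X)])
y_aug = teacher_predict(X_aug)        # teacher labels the augmented data
student_w = fit_linear(X_aug, y_aug)  # one simple model replaces the ensemble
student_pred = predict_linear(student_w, X)
```

At deployment only `student_w` is needed, which is what makes the distilled model small and fast relative to querying every ensemble member.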
Related papers
- Learning Augmentation Policies from A Model Zoo for Time Series Forecasting [58.66211334969299]
We introduce AutoTSAug, a learnable data augmentation method based on reinforcement learning.
By augmenting the marginal samples with a learnable policy, AutoTSAug substantially improves forecasting performance.
arXiv Detail & Related papers (2024-09-10T07:34:19Z)
- Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning [47.02160072880698]
We introduce a self-evolving mechanism that allows the model itself to actively sample subsets that are equally or even more effective.
The key to our data sampling technique lies in the enhancement of diversity in the chosen subsets.
Extensive experiments across three datasets and benchmarks demonstrate the effectiveness of DiverseEvol.
arXiv Detail & Related papers (2023-11-14T14:10:40Z) - Quick-Tune: Quickly Learning Which Pretrained Model to Finetune and How [62.467716468917224]
We propose a methodology that jointly searches for the optimal pretrained model and the hyperparameters for finetuning it.
Our method transfers knowledge about the performance of many pretrained models on a series of datasets.
We empirically demonstrate that our resulting approach can quickly select an accurate pretrained model for a new dataset.
arXiv Detail & Related papers (2023-06-06T16:15:26Z) - MILO: Model-Agnostic Subset Selection Framework for Efficient Model
Training and Tuning [68.12870241637636]
We propose MILO, a model-agnostic subset selection framework that decouples the subset selection from model training.
Our empirical results indicate that MILO can train models $3\times$ to $10\times$ faster and tune hyperparameters $20\times$ to $75\times$ faster than full-dataset training or tuning, without performance loss.
arXiv Detail & Related papers (2023-01-30T20:59:30Z) - Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
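In its simplest form, merging models in parameter space is just a (possibly weighted) average of corresponding weights, computed without touching any training data. The toy sketch below is hypothetical: two linear regressors fit on separate data shards stand in for fine-tuned language models, and a plain average stands in for the paper's more careful merge.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two "models" fine-tuned on different data shards (hypothetical stand-ins
# for fine-tuned language models sharing the same architecture).
d = 4
w_true = rng.normal(size=d)

def fit(X, y):
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

X1 = rng.normal(size=(100, d)); y1 = X1 @ w_true + 0.1 * rng.normal(size=100)
X2 = rng.normal(size=(100, d)); y2 = X2 @ w_true + 0.1 * rng.normal(size=100)
w1, w2 = fit(X1, y1), fit(X2, y2)

# Dataless fusion: merge in parameter space, with no access to X1/X2.
# A uniform average is the most basic such merge; the paper develops a
# weighted, per-layer variant, which this sketch does not reproduce.
w_merged = 0.5 * (w1 + w2)
```

Because both models estimate the same underlying weights, the merged model pools their information without re-training on either shard.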
arXiv Detail & Related papers (2022-12-19T20:46:43Z) - A Deep Neural Networks ensemble workflow from hyperparameter search to
inference leveraging GPU clusters [0.0]
AutoML seeks to automatically build ensembles of Deep Neural Networks (DNNs) to achieve high-quality predictions.
We propose a new AutoML approach that builds a larger library of accurate and diverse individual models, from which ensembles are then constructed.
A new ensemble selection method based on a multi-objective greedy algorithm is proposed to generate accurate ensembles.
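The greedy loop behind such ensemble selection can be sketched as follows. This is a hypothetical, single-objective stand-in (validation MSE only) for the paper's multi-objective variant: candidates are added one at a time, each round keeping the addition that most lowers the ensemble's validation error, and the best ensemble seen across rounds is returned.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical model library: rows are candidate models' predictions on a
# shared validation set; each candidate is the target plus model-specific noise.
n_models, n_val = 12, 200
y_val = rng.normal(size=n_val)
library = y_val + rng.normal(scale=rng.uniform(0.2, 1.0, size=(n_models, 1)),
                             size=(n_models, n_val))

def greedy_ensemble(library, y, rounds=8):
    """Forward greedy selection with replacement: each round, add the model
    whose inclusion (by uniform averaging) most lowers validation MSE."""
    chosen, best_overall, best_set = [], np.inf, []
    for _ in range(rounds):
        best_j, best_mse = None, np.inf
        for j in range(len(library)):
            ens = np.mean(library[chosen + [j]], axis=0)
            mse = np.mean((ens - y) ** 2)
            if mse < best_mse:
                best_j, best_mse = j, mse
        chosen.append(best_j)
        if best_mse < best_overall:          # keep the best ensemble seen
            best_overall, best_set = best_mse, list(chosen)
    return best_set, best_overall

selected, ens_mse = greedy_ensemble(library, y_val)
```

Selecting with replacement lets strong models receive more weight in the uniform average, which is why the greedy ensemble is never worse than the best single model on the validation set.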
arXiv Detail & Related papers (2022-08-30T08:04:19Z) - Do We Really Need Deep Learning Models for Time Series Forecasting? [4.2698418800007865]
Time series forecasting is a crucial task in machine learning, as it has a wide range of applications.
Deep learning and matrix factorization models have been recently proposed to tackle the same problem with more competitive performance.
In this paper, we try to answer whether these highly complex deep learning models are without alternative.
arXiv Detail & Related papers (2021-01-06T16:18:04Z) - AgEBO-Tabular: Joint Neural Architecture and Hyperparameter Search with
Autotuned Data-Parallel Training for Tabular Data [11.552769149674544]
Development of high-performing predictive models for large data sets is a challenging task.
Recent automated machine learning (AutoML) is emerging as a promising approach to automate predictive model development.
We have developed AgEBO-Tabular, an approach to combine aging evolution (AgE) and a parallel NAS method that searches over neural architecture space.
arXiv Detail & Related papers (2020-10-30T16:28:48Z) - It's the Best Only When It Fits You Most: Finding Related Models for
Serving Based on Dynamic Locality Sensitive Hashing [1.581913948762905]
Preparation of training data is often a bottleneck in the lifecycle of deploying a deep learning model for production or research.
This paper proposes an end-to-end process of searching related models for serving based on the similarity of the target dataset and the training datasets of the available models.
arXiv Detail & Related papers (2020-10-13T22:52:13Z) - Ensemble Distillation for Robust Model Fusion in Federated Learning [72.61259487233214]
Federated Learning (FL) is a machine learning setting where many devices collaboratively train a machine learning model.
In most current training schemes, the central model is refined by averaging the parameters of the server model with the updated parameters from the client side.
We propose ensemble distillation for model fusion, i.e., training the central classifier on unlabeled data to match the outputs of the models from the clients.
arXiv Detail & Related papers (2020-06-12T14:49:47Z)
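The contrast between parameter averaging and ensemble distillation in that last paper can be sketched concretely. The setup below is hypothetical: three clients each hold a logistic model (only weights are shared), and the server distills their averaged predicted probabilities on an unlabeled transfer set into a single central model, instead of averaging parameters directly.

```python
import numpy as np

rng = np.random.default_rng(3)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical federated setup: three client models (weights only) and an
# unlabeled transfer set held by the server; no raw client data is shared.
d = 5
w_true = rng.normal(size=d)
clients = [w_true + 0.3 * rng.normal(size=d) for _ in range(3)]
X_unlabeled = rng.normal(size=(500, d))

# Ensemble distillation: average the clients' predicted probabilities to get
# soft labels, then train the central model to match them (rather than
# averaging the client parameters as in plain FedAvg).
soft = np.mean([sigmoid(X_unlabeled @ w) for w in clients], axis=0)

w_central = np.zeros(d)
lr = 0.5
for _ in range(300):
    p = sigmoid(X_unlabeled @ w_central)
    # Gradient of cross-entropy between soft labels and central predictions.
    grad = X_unlabeled.T @ (p - soft) / len(X_unlabeled)
    w_central -= lr * grad
```

Distilling on outputs rather than averaging weights is what makes the fusion robust when client models differ, since only their predictions, not their parameterizations, need to agree.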
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.