Accounting for Variance in Machine Learning Benchmarks
- URL: http://arxiv.org/abs/2103.03098v1
- Date: Mon, 1 Mar 2021 22:39:49 GMT
- Title: Accounting for Variance in Machine Learning Benchmarks
- Authors: Xavier Bouthillier, Pierre Delaunay, Mirko Bronzi, Assya Trofimov,
Brennan Nichyporuk, Justin Szeto, Naz Sepah, Edward Raff, Kanika Madan,
Vikram Voleti, Samira Ebrahimi Kahou, Vincent Michalski, Dmitriy Serdyuk, Tal
Arbel, Chris Pal, Gaël Varoquaux and Pascal Vincent
- Abstract summary: Strong empirical evidence that one machine-learning algorithm A outperforms another one B ideally calls for multiple trials optimizing the learning pipeline over sources of variation.
This is prohibitively expensive, and corners are cut to reach conclusions.
We model the whole benchmarking process, revealing that variance due to data sampling, parameter initialization and hyperparameter choice markedly impacts the results.
We show the counter-intuitive result that adding more sources of variation to an imperfect estimator better approximates the ideal estimator, at a 51-fold reduction in compute cost.
- Score: 37.922783300635864
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Strong empirical evidence that one machine-learning algorithm A outperforms
another one B ideally calls for multiple trials optimizing the learning
pipeline over sources of variation such as data sampling, data augmentation,
parameter initialization, and hyperparameter choices. This is prohibitively
expensive, and corners are cut to reach conclusions. We model the whole
benchmarking process, revealing that variance due to data sampling, parameter
initialization and hyperparameter choice markedly impacts the results. We
analyze the predominant comparison methods used today in light of this
variance. We show the counter-intuitive result that adding more sources of
variation to an imperfect estimator better approximates the ideal estimator,
at a 51-fold reduction in compute cost. Building on these results, we study
the error rate of detecting improvements on five different deep-learning
tasks/architectures. This study leads us to propose recommendations for
performance comparisons.
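To make the variance argument concrete, here is a toy simulation (our illustration, not the authors' code; the noise magnitudes and the run_pipeline stand-in are assumptions). It contrasts a protocol that varies only the weight initialization with one that randomizes data split, initialization, and hyperparameter seeds jointly: the fixed-source protocol understates the variance a fair comparison must account for.

```python
# Toy sketch: randomizing several sources of variation reveals the full
# variance of a benchmark; fixing all but one source understates it.
# All numbers and the "pipeline" below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def run_pipeline(data_seed, init_seed, hparam_seed, base=0.80):
    """Toy accuracy: a base score perturbed by three independent noise sources."""
    noise = (np.random.default_rng(data_seed).normal(0, 0.020)      # data sampling
             + np.random.default_rng(init_seed).normal(0, 0.010)    # weight init
             + np.random.default_rng(hparam_seed).normal(0, 0.015)) # hyperparameter choice
    return base + noise

n_trials = 10
# Narrow protocol: vary only the init seed; data split and hyperparameters fixed.
fixed = [run_pipeline(data_seed=1, init_seed=s, hparam_seed=2) for s in range(n_trials)]
# Randomized protocol: draw every source of variation fresh in each trial.
seeds = rng.integers(0, 2**31, size=(n_trials, 3))
randomized = [run_pipeline(*map(int, row)) for row in seeds]

for name, scores in [("init only", fixed), ("all sources", randomized)]:
    print(f"{name:12s} mean={np.mean(scores):.3f} std={np.std(scores):.3f}")
# The "all sources" std reflects the variability a practitioner actually faces;
# conclusions drawn from the "init only" spread are overconfident.
```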
Related papers
- Be aware of overfitting by hyperparameter optimization! [0.0]
We show that hyperparameter optimization did not always result in better models, possibly due to overfitting when using the same statistical measures.
We also extended the previous analysis by adding a representation learning method based on Natural Language Processing of SMILES strings, called Transformer CNN.
We show that across all analyzed sets using exactly the same protocol, Transformer CNN provided better results than graph-based methods for 26 out of 28 pairwise comparisons.
arXiv Detail & Related papers (2024-07-30T12:45:05Z)
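A standard guard against the hyperparameter-optimization overfitting flagged in the entry above is nested cross-validation, sketched below. This is a generic illustration, not the paper's protocol; the dataset, model, and search grid are placeholders.

```python
# Nested cross-validation: the outer loop never sees the data the inner
# hyperparameter search optimized on, so the reported score is not inflated
# by the selection itself.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score

X, y = load_breast_cancer(return_X_y=True)
inner = KFold(n_splits=3, shuffle=True, random_state=0)  # tunes hyperparameters
outer = KFold(n_splits=5, shuffle=True, random_state=0)  # scores the tuned model

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"max_depth": [3, 6, None], "n_estimators": [100, 300]},
    cv=inner,
)
scores = cross_val_score(search, X, y, cv=outer)
print(f"nested CV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```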
- Target Variable Engineering [0.0]
We compare the predictive performance of regression models trained to predict numeric targets vs. classifiers trained to predict their binarized counterparts.
We find that regression requires significantly more computational effort to converge upon the optimal performance.
arXiv Detail & Related papers (2023-10-13T23:12:21Z)
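The comparison described in the entry above can be reproduced in miniature as follows (our sketch under assumed data and models, not the paper's experiments): train a regressor on the numeric target and a classifier on its above-median binarization, then score both on the same binary decision.

```python
# Regression vs. binarized classification, judged on the same binary decision.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=2000, n_features=20, noise=10.0, random_state=0)
thresh = np.median(y)
y_bin = (y > thresh).astype(int)  # binarized counterpart of the numeric target

X_tr, X_te, y_tr, y_te, yb_tr, yb_te = train_test_split(
    X, y, y_bin, test_size=0.25, random_state=0)

reg = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
clf = GradientBoostingClassifier(random_state=0).fit(X_tr, yb_tr)

# Turn the regressor's numeric predictions into the same binary decision.
acc_reg = accuracy_score(yb_te, (reg.predict(X_te) > thresh).astype(int))
acc_clf = accuracy_score(yb_te, clf.predict(X_te))
print(f"regression then threshold: {acc_reg:.3f}   direct classifier: {acc_clf:.3f}")
```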
- Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning [22.410220040736235]
We present a theoretically optimal solution for addressing both coreset selection and active learning.
Our proposed method, COPS, is designed to minimize the expected loss of a model trained on subsampled data.
arXiv Detail & Related papers (2023-09-05T14:06:33Z)
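COPS itself is not reproduced here; the sketch below only illustrates the generic idea of uncertainty-driven sample selection, using the entropy of a probe model's predicted class distribution as an assumed uncertainty score.

```python
# Generic uncertainty-based subset selection (an illustration, not COPS):
# keep the samples the probe model is least certain about.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
# Fit a cheap probe model on a small labeled slice.
probe = RandomForestClassifier(n_estimators=50, random_state=0).fit(X[:500], y[:500])

# Per-sample uncertainty: entropy of the predicted class distribution.
p = probe.predict_proba(X)
entropy = -(p * np.log(np.clip(p, 1e-12, None))).sum(axis=1)

budget = 1000
coreset = np.argsort(entropy)[-budget:]  # the most uncertain samples
print(f"selected {coreset.size} of {len(X)} samples; "
      f"mean entropy {entropy[coreset].mean():.3f} vs overall {entropy.mean():.3f}")
```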
- Variational Linearized Laplace Approximation for Bayesian Deep Learning [11.22428369342346]
We propose a new method for approximating the Linearized Laplace Approximation (LLA) using a variational sparse Gaussian Process (GP).
Our method is based on the dual RKHS formulation of GPs and retains, as the predictive mean, the output of the original DNN.
It allows for efficient optimization, which results in sub-linear training time in the size of the training dataset.
arXiv Detail & Related papers (2023-02-24T10:32:30Z)
- Sparse high-dimensional linear regression with a partitioned empirical Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are made through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z)
- HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
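The sketch below shows the generic iterative, column-wise imputation loop that frameworks like HyperImpute build on, via scikit-learn's IterativeImputer. HyperImpute's distinguishing feature, automatic per-column model selection, is not reproduced, and the data is synthetic.

```python
# Iterative, column-wise imputation: each round, every column with missing
# values is regressed on the other columns.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
X[:, 3] += X[:, 0] * 2.0                  # give the imputer structure to exploit
mask = rng.random(X.shape) < 0.15         # knock out ~15% of entries
X_missing = np.where(mask, np.nan, X)

imputer = IterativeImputer(max_iter=10, random_state=0)
X_filled = imputer.fit_transform(X_missing)
print("RMSE on imputed cells:",
      np.sqrt(np.mean((X_filled[mask] - X[mask]) ** 2)).round(3))
```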
- Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is the linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints.
A second motivation for the resulting bias-constrained estimator (BCE) arises in applications where multiple estimates of the same unknown are averaged for improved performance.
arXiv Detail & Related papers (2021-10-24T10:23:51Z)
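One plausible reading of "deep learning with bias constraints" is a penalty on systematic error added to the usual loss; the sketch below is our assumed formulation, not the paper's exact objective.

```python
# Assumed bias-constrained loss: MSE plus a penalty on the squared mean
# residual, which pushes the learned estimator toward zero bias.
import numpy as np

def bias_constrained_loss(y_pred, y_true, lam=1.0):
    """MSE + lam * (mean residual)^2; the second term penalizes systematic bias."""
    resid = y_pred - y_true
    return np.mean(resid ** 2) + lam * np.mean(resid) ** 2

# A biased predictor pays the penalty even when its plain MSE is comparable.
y = np.zeros(1000)
unbiased = np.random.default_rng(0).normal(0.0, 1.00, size=1000)  # mean ~ 0
biased = np.random.default_rng(0).normal(0.9, 0.45, size=1000)    # mean ~ 0.9
print("unbiased:", bias_constrained_loss(unbiased, y).round(3))
print("biased:  ", bias_constrained_loss(biased, y).round(3))
```

The zero-bias property matters when several estimates of the same unknown are averaged, as the summary above notes: averaging reduces the variance of unbiased estimates but cannot remove a shared bias.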
- Doubly Robust Semiparametric Difference-in-Differences Estimators with High-Dimensional Data [15.27393561231633]
We propose a doubly robust two-stage semiparametric difference-in-differences estimator for estimating heterogeneous treatment effects.
The first stage allows a general set of machine learning methods to be used to estimate the propensity score.
In the second stage, we derive the rates of convergence for both the parametric component and the unknown function.
arXiv Detail & Related papers (2020-09-07T15:14:29Z)
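The two-stage recipe above can be illustrated with a standard doubly robust ATT estimator for two-period panel data. This is a generic sketch with assumed nuisance models and synthetic data, not the paper's semiparametric estimator.

```python
# Doubly robust difference-in-differences: combine a propensity-score model
# and a control-group outcome model; the estimate is consistent if either
# nuisance model is correctly specified.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)
n = 4000
X = rng.normal(size=(n, 3))
d = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))                # treatment assignment
delta_y = X @ [0.5, -0.3, 0.2] + 2.0 * d + rng.normal(size=n)  # outcome change; true ATT = 2.0

# Stage 1: ML nuisance estimates (propensity score, control outcome model).
ps = LogisticRegression().fit(X, d).predict_proba(X)[:, 1]
mu0 = LinearRegression().fit(X[d == 0], delta_y[d == 0]).predict(X)

# Stage 2: doubly robust combination of reweighting and outcome regression.
w1 = d / d.mean()
w0 = ps * (1 - d) / (1 - ps)
w0 = w0 / w0.mean()
att = np.mean((w1 - w0) * (delta_y - mu0))
print(f"doubly robust ATT estimate: {att:.3f} (true effect: 2.0)")
```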
- AutoSimulate: (Quickly) Learning Synthetic Data Generation [70.82315853981838]
We propose an efficient alternative for optimal synthetic data generation based on a novel differentiable approximation of the objective.
We demonstrate that the proposed method finds the optimal data distribution faster (up to $50\times$), with significantly reduced training data generation (up to $30\times$) and better accuracy ($+8.7\%$) on real-world test datasets than previous methods.
arXiv Detail & Related papers (2020-08-16T11:36:11Z)
- The Right Tool for the Job: Matching Model and Instance Complexities [62.95183777679024]
As NLP models become larger, executing a trained model requires significant computational resources, incurring monetary and environmental costs.
We propose a modification to contextual representation fine-tuning which, during inference, allows for an early (and fast) "exit".
We test our proposed modification on five different datasets in two tasks: three text classification datasets and two natural language inference benchmarks.
arXiv Detail & Related papers (2020-04-16T04:28:08Z)
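The early-exit idea from the entry above can be sketched as a confidence threshold over per-layer classifier heads. Everything below, including the simulated head outputs, is an assumed stand-in for a fine-tuned transformer's intermediate classifiers.

```python
# Early exit: stop at the first layer whose classifier head is confident
# enough, instead of always running the full network.
import numpy as np

def early_exit_predict(layer_probs, threshold=0.9):
    """Return (prediction, layers_used) for a list of per-layer class
    probability vectors, exiting at the first sufficiently confident head."""
    for depth, probs in enumerate(layer_probs, start=1):
        if probs.max() >= threshold:
            return int(probs.argmax()), depth
    return int(layer_probs[-1].argmax()), len(layer_probs)

# Simulated heads: confidence grows quickly with depth for an easy instance.
easy = [np.array([0.55, 0.45]), np.array([0.93, 0.07]), np.array([0.99, 0.01])]
hard = [np.array([0.52, 0.48]), np.array([0.60, 0.40]), np.array([0.71, 0.29])]
for name, probs in [("easy", easy), ("hard", hard)]:
    pred, depth = early_exit_predict(probs)
    print(f"{name}: class {pred} after {depth}/{len(probs)} layers")
```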