Data splitting improves statistical performance in overparametrized regimes
- URL: http://arxiv.org/abs/2110.10956v1
- Date: Thu, 21 Oct 2021 08:10:56 GMT
- Title: Data splitting improves statistical performance in overparametrized regimes
- Authors: Nicole Mücke, Enrico Reiss, Jonas Rungenhagen, and Markus Klein
- Abstract summary: Distributed learning is a common strategy to reduce the overall training time by exploiting multiple computing devices.
We show that in this regime, data splitting has a regularizing effect, hence improving statistical performance and computational complexity.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While large training datasets generally offer improvement in model
performance, the training process becomes computationally expensive and time
consuming. Distributed learning is a common strategy to reduce the overall
training time by exploiting multiple computing devices. Recently, it has been
observed in the single machine setting that overparametrization is essential
for benign overfitting in ridgeless regression in Hilbert spaces. We show that
in this regime, data splitting has a regularizing effect, hence improving
statistical performance and computational complexity at the same time. We
further provide a unified framework that allows us to analyze both the finite- and
infinite-dimensional settings. We numerically demonstrate the effect of
different model parameters.
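The abstract leaves the estimator implicit; the following is a minimal numpy sketch of the divide-and-conquer setup it refers to, under the assumption that the data are split into m disjoint subsets, a minimum-norm (ridgeless) least-squares fit is computed on each, and the local estimators are averaged. Function names and the toy data are illustrative, not taken from the paper.

```python
import numpy as np

def ridgeless_fit(X, y):
    """Minimum-norm least-squares (ridgeless) estimator: w = X^+ y.
    In the overparametrized case (more features than samples) the
    Moore-Penrose pseudoinverse returns the minimum-norm interpolant."""
    return np.linalg.pinv(X) @ y

def split_and_average(X, y, n_splits, rng):
    """Fit a ridgeless estimator on each of m disjoint splits of the data
    and average the resulting weight vectors (equivalently, their predictions)."""
    idx = rng.permutation(X.shape[0])
    estimators = [ridgeless_fit(X[part], y[part])
                  for part in np.array_split(idx, n_splits)]
    return np.mean(estimators, axis=0)

# Toy overparametrized example: d >> n, linear target plus noise.
rng = np.random.default_rng(0)
n, d = 200, 1000
X = rng.standard_normal((n, d))
w_star = rng.standard_normal(d) / np.sqrt(d)
y = X @ w_star + 0.1 * rng.standard_normal(n)

X_test = rng.standard_normal((2000, d))
y_test = X_test @ w_star

for m in (1, 2, 5, 10):
    w_hat = split_and_average(X, y, m, rng)
    err = np.mean((X_test @ w_hat - y_test) ** 2)
    print(f"m = {m:2d} splits, test MSE = {err:.4f}")
```

Each split stays overparametrized here (20 to 200 samples against 1000 features), which is the regime in which the paper argues splitting acts as a regularizer.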
Related papers
- Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference.
Online VSMC is capable of performing efficiently, entirely on-the-fly, both parameter estimation and particle proposal adaptation.
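The summary above is terse; for orientation, here is a plain bootstrap particle filter for a toy linear-Gaussian state-space model, showing the log-marginal-likelihood estimate on which VSMC-style objectives are built. This is not the authors' online VSMC algorithm: the model, the grid search standing in for gradient-based parameter updates, and all names are illustrative assumptions.

```python
import numpy as np

def bootstrap_particle_filter(ys, n_particles, phi, sigma_x, sigma_y, rng):
    """Bootstrap particle filter for x_t = phi * x_{t-1} + N(0, sigma_x^2),
    y_t = x_t + N(0, sigma_y^2). Returns an estimate of log p(y_{1:T})."""
    x = rng.normal(0.0, sigma_x, size=n_particles)  # initial particles
    log_Z = 0.0
    for t in range(len(ys)):
        if t > 0:
            x = phi * x + rng.normal(0.0, sigma_x, size=n_particles)  # propagate
        # weight particles by the observation likelihood
        log_w = -0.5 * ((ys[t] - x) / sigma_y) ** 2 - np.log(sigma_y * np.sqrt(2 * np.pi))
        m = log_w.max()
        log_Z += np.log(np.mean(np.exp(log_w - m))) + m  # log-mean-exp update
        # multinomial resampling
        w = np.exp(log_w - m)
        x = x[rng.choice(n_particles, size=n_particles, p=w / w.sum())]
    return log_Z

rng = np.random.default_rng(0)
T, phi_true, sx, sy = 100, 0.9, 1.0, 0.5
xs = np.zeros(T)
for t in range(1, T):
    xs[t] = phi_true * xs[t - 1] + rng.normal(0, sx)
ys = xs + rng.normal(0, sy, size=T)

# crude grid over phi using the SMC likelihood estimate (stand-in for
# the gradient-based variational updates described in the paper)
for cand in (0.5, 0.7, 0.9):
    print(cand, bootstrap_particle_filter(ys, 500, cand, sx, sy, rng))
```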
arXiv Detail & Related papers (2023-12-19T21:45:38Z) - Towards Continually Learning Application Performance Models [1.2278517240988065]
Machine learning-based performance models are increasingly being used to make critical job scheduling and application optimization decisions.
Traditionally, these models assume that data distribution does not change as more samples are collected over time.
We develop continually learning performance models that account for the distribution drift, alleviate catastrophic forgetting, and improve generalizability.
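The summary names distribution drift and catastrophic forgetting but not the mechanism; the sketch below uses one common remedy, a bounded replay buffer mixed into each online update, as a stand-in. The class, the buffer policy, and the use of scikit-learn's SGDRegressor are assumptions, not the paper's method.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

class ReplayPerformanceModel:
    """Online performance model updated incrementally. Replaying a random
    sample of past data alongside each new batch is one standard way to
    reduce forgetting under distribution drift."""

    def __init__(self, buffer_size=512, seed=0):
        self.model = SGDRegressor(learning_rate="constant", eta0=1e-3)
        self.buffer_X, self.buffer_y = [], []
        self.buffer_size = buffer_size
        self.rng = np.random.default_rng(seed)

    def update(self, X_new, y_new):
        # mix the new batch with a random replay sample from the buffer
        if self.buffer_X:
            k = min(len(self.buffer_X), len(X_new))
            idx = self.rng.choice(len(self.buffer_X), size=k, replace=False)
            X_mix = np.vstack([X_new, np.asarray(self.buffer_X)[idx]])
            y_mix = np.concatenate([y_new, np.asarray(self.buffer_y)[idx]])
        else:
            X_mix, y_mix = X_new, y_new
        self.model.partial_fit(X_mix, y_mix)
        # keep the buffer bounded by replacing random entries once it is full
        for x_i, y_i in zip(X_new, y_new):
            if len(self.buffer_X) < self.buffer_size:
                self.buffer_X.append(x_i)
                self.buffer_y.append(y_i)
            else:
                j = self.rng.integers(self.buffer_size)
                self.buffer_X[j] = x_i
                self.buffer_y[j] = y_i

    def predict(self, X):
        return self.model.predict(X)
```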
arXiv Detail & Related papers (2023-10-25T20:48:46Z) - Adaptive Model Pruning and Personalization for Federated Learning over
Wireless Networks [72.59891661768177]
Federated learning (FL) enables distributed learning across edge devices while protecting data privacy.
We consider an FL framework with partial model pruning and personalization to overcome these challenges.
This framework splits the learning model into a global part with model pruning shared with all devices to learn data representations and a personalized part to be fine-tuned for a specific device.
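As a concrete reading of the split described above, here is a toy PyTorch sketch: a shared representation block that is pruned and exchanged with the server, and a personalized head that never leaves the device. The magnitude-pruning rule, the FedAvg-style averaging, and all names are illustrative assumptions rather than the paper's exact scheme.

```python
import copy
import torch
import torch.nn as nn

class SplitModel(nn.Module):
    """Model split into a shared (global) part and a personalized head."""
    def __init__(self, d_in=32, d_hidden=64, d_out=1):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
        self.personal = nn.Linear(d_hidden, d_out)

    def forward(self, x):
        return self.personal(self.shared(x))

def magnitude_prune_(module, sparsity=0.5):
    """In-place magnitude pruning of the shared part (a simple stand-in
    for whatever pruning criterion the paper actually uses)."""
    for p in module.parameters():
        k = int(p.numel() * sparsity)
        if k == 0:
            continue
        threshold = p.detach().abs().flatten().kthvalue(k).values
        p.data[p.detach().abs() <= threshold] = 0.0

def federated_round(global_shared_state, devices, local_steps=10, lr=1e-2):
    """One FedAvg-style round. `devices` is a list of (SplitModel, (X, y))
    pairs with X of shape (n, 32) and y of shape (n, 1). Only the pruned
    shared part is exchanged; personalized heads stay on the devices."""
    new_states = []
    for model, (X, y) in devices:
        model.shared.load_state_dict(global_shared_state)   # receive global part
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        for _ in range(local_steps):                         # local fine-tuning
            opt.zero_grad()
            loss = nn.functional.mse_loss(model(X), y)
            loss.backward()
            opt.step()
        magnitude_prune_(model.shared)                       # prune before upload
        new_states.append(copy.deepcopy(model.shared.state_dict()))
    # server: average the shared parameters across devices
    return {k: torch.stack([s[k] for s in new_states]).mean(0) for k in new_states[0]}
```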
arXiv Detail & Related papers (2023-09-04T21:10:45Z) - Efficient Augmentation for Imbalanced Deep Learning [8.38844520504124]
We study a convolutional neural network's internal representation of imbalanced image data.
We measure the generalization gap between a model's feature embeddings in the training and test sets, showing that the gap is wider for minority classes.
This insight enables us to design an efficient three-phase CNN training framework for imbalanced data.
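One way to make the measurement above concrete (the paper's exact metric is not given in the summary) is to compare per-class embedding statistics between training and test sets; the centroid distance below is a simple hypothetical proxy for such a per-class gap.

```python
import numpy as np

def classwise_embedding_gap(emb_train, y_train, emb_test, y_test):
    """For each class, measure how far the training-set feature centroid is
    from the test-set centroid. A larger distance for minority classes is the
    kind of effect described in the summary above."""
    gaps = {}
    for c in np.unique(y_train):
        mu_train = emb_train[y_train == c].mean(axis=0)
        mu_test = emb_test[y_test == c].mean(axis=0)
        gaps[int(c)] = float(np.linalg.norm(mu_train - mu_test))
    return gaps

# usage: pass the penultimate-layer embeddings of a trained CNN, e.g.
# gaps = classwise_embedding_gap(f(X_train), y_train, f(X_test), y_test)
```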
arXiv Detail & Related papers (2022-07-13T09:43:17Z) - An Accurate and Efficient Large-scale Regression Method through Best
Friend Clustering [10.273838113763192]
We propose a novel and simple data structure capturing the most important information among data samples.
We combine the clustering with regression techniques as a parallel library and utilize a hybrid structure of data and model parallelism to make predictions.
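The "best friend" data structure is not described in the summary, so the sketch below substitutes a generic cluster-then-regress pipeline (k-means plus one linear model per cluster) purely to illustrate how clustering and regression can be combined and parallelized; it is not the paper's method.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import Ridge

class ClusteredRegressor:
    """Partition the samples with k-means and fit an independent linear
    model per cluster; predictions are routed by nearest centroid."""

    def __init__(self, n_clusters=8, alpha=1.0, seed=0):
        self.kmeans = KMeans(n_clusters=n_clusters, random_state=seed, n_init=10)
        self.models = [Ridge(alpha=alpha) for _ in range(n_clusters)]

    def fit(self, X, y):
        labels = self.kmeans.fit_predict(X)
        for c, model in enumerate(self.models):
            mask = labels == c
            model.fit(X[mask], y[mask])   # each cluster is an independent task,
                                          # so these fits can run in parallel
        return self

    def predict(self, X):
        labels = self.kmeans.predict(X)
        out = np.empty(len(X))
        for c, model in enumerate(self.models):
            mask = labels == c
            if mask.any():
                out[mask] = model.predict(X[mask])
        return out
```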
arXiv Detail & Related papers (2021-04-22T01:34:29Z) - Distributed Learning of Finite Gaussian Mixtures [21.652015112462]
We study split-and-conquer approaches for the distributed learning of finite Gaussian mixtures.
The new estimator is shown to be consistent and to retain root-n consistency under some general conditions.
Experiments based on simulated and real-world data show that the proposed split-and-conquer approach has comparable statistical performance with the global estimator.
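A minimal sketch of the split-and-conquer idea, assuming scikit-learn mixtures on each split and a simple pooling of local components; the paper's actual aggregation additionally reduces the pooled mixture back to the target number of components, which is the delicate step and is omitted here.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def split_and_conquer_gmm(X, n_components, n_splits, seed=0):
    """Fit a Gaussian mixture on each data split, then pool the local
    components into one larger mixture with rescaled weights."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    weights, means, covs = [], [], []
    for part in np.array_split(idx, n_splits):
        gmm = GaussianMixture(n_components=n_components, random_state=seed)
        gmm.fit(X[part])
        weights.append(gmm.weights_ / n_splits)   # each split contributes equally
        means.append(gmm.means_)
        covs.append(gmm.covariances_)
    return np.concatenate(weights), np.vstack(means), np.concatenate(covs)
```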
arXiv Detail & Related papers (2020-10-20T16:17:47Z) - Real-Time Regression with Dividing Local Gaussian Processes [62.01822866877782]
Local Gaussian processes are a novel, computationally efficient modeling approach based on Gaussian process regression.
Due to an iterative, data-driven division of the input space, they achieve a sublinear computational complexity in the total number of training points in practice.
A numerical evaluation on real-world data sets shows their advantages over other state-of-the-art methods in terms of accuracy as well as prediction and update speed.
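To illustrate where the computational savings of local GPs come from, the sketch below partitions the inputs with a one-shot k-means split and fits an independent GP per region; the paper's iterative, data-driven division and its prediction aggregation are more refined, so treat every choice here as an assumption.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def fit_local_gps(X, y, n_regions=4, seed=0):
    """Partition the input space and fit one GP per region. Each GP sees
    roughly n / n_regions points, so the cubic GP training cost shrinks."""
    km = KMeans(n_clusters=n_regions, random_state=seed, n_init=10).fit(X)
    gps = []
    for c in range(n_regions):
        mask = km.labels_ == c
        gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
        gps.append(gp.fit(X[mask], y[mask]))
    return km, gps

def predict_local_gps(km, gps, X):
    """Route each query point to the GP responsible for its region."""
    labels = km.predict(X)
    out = np.empty(len(X))
    for c, gp in enumerate(gps):
        mask = labels == c
        if mask.any():
            out[mask] = gp.predict(X[mask])
    return out
```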
arXiv Detail & Related papers (2020-06-16T18:43:31Z) - Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
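The summary does not spell out the scheme; as a generic illustration of "extrapolation" in stochastic optimization, here is an extragradient-style step in PyTorch that evaluates the gradient at a trial point and applies it from the original iterate. This is a sketch under my own assumptions, not necessarily the exact update analyzed in the paper; `loss_fn` is expected to be a closure that recomputes the loss from the current parameter values (e.g. a forward pass on a fixed minibatch).

```python
import torch

def extrapolated_sgd_step(params, loss_fn, lr, extrapolation_lr):
    """One extrapolation step: trial step along the current gradient,
    re-evaluate the gradient at the extrapolated point, then apply that
    gradient starting from the original iterate."""
    # gradient at the current point
    grads = torch.autograd.grad(loss_fn(), params)
    originals = [p.detach().clone() for p in params]
    with torch.no_grad():
        for p, g in zip(params, grads):
            p.add_(g, alpha=-extrapolation_lr)   # extrapolate (trial step)
    # gradient at the extrapolated point
    grads_extra = torch.autograd.grad(loss_fn(), params)
    with torch.no_grad():
        for p, orig, g in zip(params, originals, grads_extra):
            p.copy_(orig)                        # restore the original iterate
            p.add_(g, alpha=-lr)                 # apply the extrapolated gradient
```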
arXiv Detail & Related papers (2020-06-10T08:22:41Z) - On the Benefits of Invariance in Neural Networks [56.362579457990094]
We show that training with data augmentation leads to better estimates of the risk and of its gradients, and we provide a PAC-Bayes generalization bound for models trained with data augmentation.
We also show that compared to data augmentation, feature averaging reduces generalization error when used with convex losses, and tightens PAC-Bayes bounds.
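Feature averaging, as referenced above, averages a representation over a set of input transformations rather than averaging predictions; a minimal PyTorch sketch follows, with the transformation set and the placement of the averaging left as assumptions.

```python
import torch

def feature_average(feature_extractor, x, transforms):
    """Average feature embeddings over a set of input transformations
    (e.g. flips or small rotations applied to each input batch)."""
    with torch.no_grad():
        feats = torch.stack([feature_extractor(t(x)) for t in transforms])
    return feats.mean(dim=0)

# example usage with horizontal flips on an image batch:
# transforms = [lambda x: x, lambda x: torch.flip(x, dims=[-1])]
# avg_feats = feature_average(model.backbone, images, transforms)
```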
arXiv Detail & Related papers (2020-05-01T02:08:58Z) - Understanding the Effects of Data Parallelism and Sparsity on Neural
Network Training [126.49572353148262]
We study two factors in neural network training: data parallelism and sparsity.
Despite their promising benefits, understanding of their effects on neural network training remains elusive.
arXiv Detail & Related papers (2020-03-25T10:49:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.