Mixed Semi-Supervised Generalized-Linear-Regression with applications to Deep-Learning and Interpolators
- URL: http://arxiv.org/abs/2302.09526v3
- Date: Tue, 28 May 2024 13:45:48 GMT
- Title: Mixed Semi-Supervised Generalized-Linear-Regression with applications to Deep-Learning and Interpolators
- Authors: Oren Yuval, Saharon Rosset,
- Abstract summary: We present a methodology for using unlabeled data to design semi supervised learning (SSL) methods.
We include in each of them a mixing parameter $alpha$, controlling the weight given to the unlabeled data.
We demonstrate the effectiveness of our methodology in delivering substantial improvement compared to the standard supervised models.
- Score: 6.537685198688539
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a methodology for using unlabeled data to design semi supervised learning (SSL) methods that improve the prediction performance of supervised learning for regression tasks. The main idea is to design different mechanisms for integrating the unlabeled data, and include in each of them a mixing parameter $\alpha$, controlling the weight given to the unlabeled data. Focusing on Generalized Linear Models (GLM) and linear interpolators classes of models, we analyze the characteristics of different mixing mechanisms, and prove that in all cases, it is invariably beneficial to integrate the unlabeled data with some nonzero mixing ratio $\alpha>0$, in terms of predictive performance. Moreover, we provide a rigorous framework to estimate the best mixing ratio $\alpha^*$ where mixed SSL delivers the best predictive performance, while using the labeled and unlabeled data on hand. The effectiveness of our methodology in delivering substantial improvement compared to the standard supervised models, in a variety of settings, is demonstrated empirically through extensive simulation, in a manner that supports the theoretical analysis. We also demonstrate the applicability of our methodology (with some intuitive modifications) to improve more complex models, such as deep neural networks, in real-world regression tasks.
Related papers
- Uncertainty Aware Learning for Language Model Alignment [97.36361196793929]
We propose uncertainty-aware learning (UAL) to improve the model alignment of different task scenarios.
We implement UAL in a simple fashion -- adaptively setting the label smoothing value of training according to the uncertainty of individual samples.
Experiments on widely used benchmarks demonstrate that our UAL significantly and consistently outperforms standard supervised fine-tuning.
arXiv Detail & Related papers (2024-06-07T11:37:45Z) - Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z) - HyperImpute: Generalized Iterative Imputation with Automatic Model
Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z) - Data-heterogeneity-aware Mixing for Decentralized Learning [63.83913592085953]
We characterize the dependence of convergence on the relationship between the mixing weights of the graph and the data heterogeneity across nodes.
We propose a metric that quantifies the ability of a graph to mix the current gradients.
Motivated by our analysis, we propose an approach that periodically and efficiently optimize the metric.
arXiv Detail & Related papers (2022-04-13T15:54:35Z) - Using Explainable Boosting Machine to Compare Idiographic and Nomothetic
Approaches for Ecological Momentary Assessment Data [2.0824228840987447]
This paper explores the use of non-linear interpretable machine learning (ML) models in classification problems.
Various ensembles of trees are compared to linear models using imbalanced synthetic and real-world datasets.
In one of the two real-world datasets, knowledge distillation method achieves improved AUC scores.
arXiv Detail & Related papers (2022-04-04T17:56:37Z) - Learning to Refit for Convex Learning Problems [11.464758257681197]
We propose a framework to learn to estimate optimized model parameters for different training sets using neural networks.
We rigorously characterize the power of neural networks to approximate convex problems.
arXiv Detail & Related papers (2021-11-24T15:28:50Z) - Nonparametric Functional Analysis of Generalized Linear Models Under
Nonlinear Constraints [0.0]
This article introduces a novel nonparametric methodology for Generalized Linear Models.
It combines the strengths of the binary regression and latent variable formulations for categorical data.
It extends recently published parametric versions of the methodology and generalizes it.
arXiv Detail & Related papers (2021-10-11T04:49:59Z) - A Variational Infinite Mixture for Probabilistic Inverse Dynamics
Learning [34.90240171916858]
We develop an efficient variational Bayes inference technique for infinite mixtures of probabilistic local models.
We highlight the model's power in combining data-driven adaptation, fast prediction and the ability to deal with discontinuous functions and heteroscedastic noise.
We use the learned models for online dynamics control of a Barrett-WAM manipulator, significantly improving the trajectory tracking performance.
arXiv Detail & Related papers (2020-11-10T16:15:13Z) - Semi-Supervised Empirical Risk Minimization: Using unlabeled data to
improve prediction [4.860671253873579]
We present a general methodology for using unlabeled data to design semi supervised learning (SSL) variants of the Empirical Risk Minimization (ERM) learning process.
We analyze of the effectiveness of our SSL approach in improving prediction performance.
arXiv Detail & Related papers (2020-09-01T17:55:51Z) - Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $varepsilon*$, which deviates substantially from the test error of worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z) - On the Benefits of Invariance in Neural Networks [56.362579457990094]
We show that training with data augmentation leads to better estimates of risk and thereof gradients, and we provide a PAC-Bayes generalization bound for models trained with data augmentation.
We also show that compared to data augmentation, feature averaging reduces generalization error when used with convex losses, and tightens PAC-Bayes bounds.
arXiv Detail & Related papers (2020-05-01T02:08:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.