On Learning Mixture of Linear Regressions in the Non-Realizable Setting
- URL: http://arxiv.org/abs/2205.13166v1
- Date: Thu, 26 May 2022 05:34:57 GMT
- Title: On Learning Mixture of Linear Regressions in the Non-Realizable Setting
- Authors: Avishek Ghosh, Arya Mazumdar, Soumyabrata Pal and Rajat Sen
- Abstract summary: We show that mixture of linear regressions (MLR) can be used for prediction where, instead of predicting a single label, the model predicts a list of values.
In this paper we show that a version of the popular alternating minimization (AM) algorithm finds the best fit lines in a dataset even when a realizable model is not assumed.
- Score: 44.307245411703704
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While mixture of linear regressions (MLR) is a well-studied topic, prior
works usually do not analyze such models for prediction error. In fact, {\em
prediction} and {\em loss} are not well-defined in the context of mixtures. In
this paper, we first show that MLR can be used for prediction where, instead of
predicting a single label, the model predicts a list of values (also known as
{\em list-decoding}). The list size is equal to the number of components in the
mixture, and the loss function is defined to be the minimum among the losses
incurred by the component models. We show that with this definition, a solution
to the empirical risk minimization (ERM) problem achieves a small probability
of prediction error. This calls for an algorithm to minimize the empirical risk
for MLR, which is known to be computationally hard. Prior algorithmic works in
MLR focus on the {\em realizable} setting, i.e., recovery of parameters when
the data is probabilistically generated by a mixed linear (noisy) model. In
this paper we show that a version of the popular alternating minimization (AM)
algorithm finds the best fit lines in a dataset even when a realizable model is
not assumed, under some regularity conditions on the dataset and the initial
points, and thereby provides a solution for the ERM. We further provide an
algorithm that runs in time polynomial in the number of datapoints and recovers
a good approximation of the best fit lines. The two algorithms are compared
experimentally.
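To make the list-decoding loss concrete: with $k$ component parameters $\theta_1, \dots, \theta_k$, a standard formalization of the ERM objective described above is
\[ \min_{\theta_1, \dots, \theta_k} \; \frac{1}{n} \sum_{i=1}^{n} \min_{j \in [k]} \big( y_i - \langle \theta_j, x_i \rangle \big)^2 . \]
The NumPy sketch below implements this objective together with a basic alternating-minimization loop (assign each point to the line that currently fits it best, then refit each line by least squares). The function names, random initialization, and fixed iteration count are illustrative assumptions; this is a minimal sketch of the general AM idea, not the paper's exact algorithm or initialization scheme.

```python
import numpy as np

def min_loss(X, y, thetas):
    """List-decoding loss: for each point, take the minimum squared error
    over the k component predictions, then average over the dataset."""
    residuals = (y[:, None] - X @ thetas.T) ** 2   # shape (n, k)
    return residuals.min(axis=1).mean()

def am_best_fit_lines(X, y, k, n_iters=50, seed=0):
    """Alternating-minimization heuristic for k best-fit lines:
    (1) assign every point to the component with the smallest residual,
    (2) refit each component by ordinary least squares on its points."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    thetas = rng.normal(size=(k, d))               # illustrative random init
    for _ in range(n_iters):
        labels = ((y[:, None] - X @ thetas.T) ** 2).argmin(axis=1)
        for j in range(k):
            mask = labels == j
            if mask.sum() >= d:                    # enough points to refit
                thetas[j], *_ = np.linalg.lstsq(X[mask], y[mask], rcond=None)
    return thetas
```

Each assignment step can only decrease the empirical min-loss, and each refit step minimizes it exactly for the current assignment, so the objective is monotonically non-increasing; the paper's contribution is characterizing when such iterations reach (an approximation of) the ERM solution without assuming the data was generated by a mixed linear model.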
Related papers
- Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
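As background for the gradient-free setting, here is a minimal sketch of the classical two-point zeroth-order gradient estimator driving a plain SGD loop; it illustrates the general mechanism only, not the accelerated algorithm or the noise models analyzed in this paper, and all names and step sizes are illustrative.

```python
import numpy as np

def two_point_grad_estimate(f, x, mu=1e-4, rng=None):
    """Classical two-point zeroth-order estimate: sample a random unit
    direction u and take a symmetric finite difference along it; its
    expectation approximates grad f(x) for a small smoothing radius mu."""
    rng = rng or np.random.default_rng()
    u = rng.normal(size=x.shape)
    u /= np.linalg.norm(u)                 # uniform direction on the sphere
    return x.size * (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u

def zeroth_order_sgd(f, x0, lr=0.1, n_steps=1000):
    """Plain (unaccelerated) SGD driven by the two-point estimate."""
    x = x0.astype(float).copy()
    for _ in range(n_steps):
        x -= lr * two_point_grad_estimate(f, x)
    return x
```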
arXiv Detail & Related papers (2024-11-21T10:26:17Z) - Attribute-to-Delete: Machine Unlearning via Datamodel Matching [65.13151619119782]
Machine unlearning -- efficiently removing the influence of a small "forget set" of training data from a pre-trained machine learning model -- has recently attracted interest.
Recent research shows that machine unlearning techniques do not hold up in such a challenging setting.
arXiv Detail & Related papers (2024-10-30T17:20:10Z) - Agnostic Learning of Mixed Linear Regressions with EM and AM Algorithms [22.79595679373698]
Mixed linear regression is a well-studied problem in statistics and machine learning.
In this paper, we consider the more general problem of agnostic learning of mixed linear regression from samples.
We show that the AM and EM algorithms lead to agnostic learning of mixed linear regression by converging to the population loss minimizers.
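For contrast with hard-assignment AM, below is a textbook soft-assignment EM sketch for mixed linear regression; it assumes Gaussian noise with a fixed, known scale and uniform mixing weights purely for brevity, and is not necessarily the exact algorithm analyzed in the referenced paper.

```python
import numpy as np

def em_mixed_linear_regression(X, y, k, sigma=1.0, n_iters=100, seed=0):
    """Textbook EM for k-component mixed linear regression with fixed
    noise scale sigma and uniform mixing weights (illustrative choices)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    thetas = rng.normal(size=(k, d))
    for _ in range(n_iters):
        # E-step: responsibilities proportional to per-component likelihoods.
        logw = -((y[:, None] - X @ thetas.T) ** 2) / (2.0 * sigma ** 2)
        w = np.exp(logw - logw.max(axis=1, keepdims=True))  # stable softmax
        w /= w.sum(axis=1, keepdims=True)
        # M-step: weighted least squares per component (tiny ridge for safety).
        for j in range(k):
            Wx = w[:, j][:, None] * X
            thetas[j] = np.linalg.solve(X.T @ Wx + 1e-8 * np.eye(d), Wx.T @ y)
    return thetas
```

Where AM makes a hard choice of component for every point, EM reweights all components softly; both alternate between (re)assigning points and refitting the component parameters.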
arXiv Detail & Related papers (2024-06-03T09:43:24Z) - On the Performance of Empirical Risk Minimization with Smoothed Data [59.3428024282545]
We show that Empirical Risk Minimization (ERM) is able to achieve sublinear error whenever a class is learnable with iid data.
arXiv Detail & Related papers (2024-02-22T21:55:41Z) - Who Should Predict? Exact Algorithms For Learning to Defer to Humans [40.22768241509553]
We show that prior approaches can fail to find a human-AI system with low misclassification error.
We give a mixed-integer-linear-programming (MILP) formulation that can optimally solve the problem in the linear setting.
We provide a novel surrogate loss function that is realizable-consistent and performs well empirically.
arXiv Detail & Related papers (2023-01-15T21:57:36Z) - Minimax rate of consistency for linear models with missing values [0.0]
Missing values arise in most real-world data sets due to the aggregation of multiple sources and intrinsically missing information (sensor failure, unanswered questions in surveys...).
In this paper, we focus on the extensively studied linear models, but in the presence of missing values, which turns out to be quite a challenging task.
This eventually requires solving a number of learning tasks that is exponential in the number of input features, which makes predictions impossible for current real-world datasets.
arXiv Detail & Related papers (2022-02-03T08:45:34Z) - Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the importance-guided stochastic gradient descent (IGSGD) method to train models to perform inference from inputs containing missing values, without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z) - Piecewise linear regression and classification [0.20305676256390928]
This paper proposes a method for solving multivariate regression and classification problems using piecewise linear predictors.
A Python implementation of the algorithm described in this paper is available at http://cse.lab.imtlucca.it/bemporad/parc.
arXiv Detail & Related papers (2021-03-10T17:07:57Z) - Matrix Completion with Quantified Uncertainty through Low Rank Gaussian Copula [30.84155327760468]
This paper proposes a framework for missing value imputation with quantified uncertainty.
The time required to fit the model scales linearly with the number of rows and the number of columns in the dataset.
Empirical results show the method yields state-of-the-art imputation accuracy across a wide range of data types.
arXiv Detail & Related papers (2020-06-18T19:51:42Z) - Robust Compressed Sensing using Generative Models [98.64228459705859]
In this paper we propose an algorithm inspired by the Median-of-Means (MOM) estimator.
Our algorithm guarantees recovery for heavy-tailed data, even in the presence of outliers.
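For readers unfamiliar with the technique, a minimal sketch of the classical Median-of-Means estimator (for mean estimation) follows; the paper adapts the MOM idea to compressed sensing with generative models, which this sketch does not attempt.

```python
import numpy as np

def median_of_means(samples, n_blocks=10, rng=None):
    """Classical Median-of-Means: split the data into blocks, average each
    block, and return the median of the block means. A few heavy-tailed
    or outlying blocks cannot move the median far, hence the robustness."""
    rng = rng or np.random.default_rng()
    samples = np.asarray(samples, dtype=float)
    perm = rng.permutation(len(samples))           # random block assignment
    blocks = np.array_split(samples[perm], n_blocks)
    return np.median([b.mean() for b in blocks])
```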
arXiv Detail & Related papers (2020-06-16T19:07:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.