On Learning Mixture of Linear Regressions in the Non-Realizable Setting
- URL: http://arxiv.org/abs/2205.13166v1
- Date: Thu, 26 May 2022 05:34:57 GMT
- Title: On Learning Mixture of Linear Regressions in the Non-Realizable Setting
- Authors: Avishek Ghosh, Arya Mazumdar, Soumyabrata Pal and Rajat Sen
- Abstract summary: We show that mixture of linear regressions (MLR) can be used for prediction where, instead of predicting a single label, the model predicts a list of values.
In this paper we show that a version of the popular alternating minimization (AM) algorithm finds the best fit lines in a dataset even when a realizable model is not assumed.
- Score: 44.307245411703704
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While mixture of linear regressions (MLR) is a well-studied topic, prior
works usually do not analyze such models for prediction error. In fact, {\em
prediction} and {\em loss} are not well-defined in the context of mixtures. In
this paper, we first show that MLR can be used for prediction where, instead of
predicting a single label, the model predicts a list of values (also known as
{\em list-decoding}). The list size is equal to the number of components in the
mixture, and the loss function is defined to be the minimum among the losses
incurred by the component models. We show that with this definition, a solution
to the empirical risk minimization (ERM) problem achieves a small probability
of prediction error. This calls for an algorithm to minimize the empirical risk
for MLR, which is known to be computationally hard. Prior algorithmic works in
MLR focus on the {\em realizable} setting, i.e., recovery of parameters when
the data is probabilistically generated by a mixed linear (noisy) model. In
this paper we show that a version of the popular alternating minimization (AM)
algorithm finds the best fit lines in a dataset even when a realizable model is
not assumed, under some regularity conditions on the dataset and the initial
points, and thereby provides a solution for the ERM. We further provide an
algorithm that runs in time polynomial in the number of datapoints and recovers
a good approximation of the best fit lines. The two algorithms are compared
experimentally.
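To make the list-decoding loss concrete: with $k$ component parameters $\theta_1, \dots, \theta_k$, a standard formalization of the ERM objective described above is
\[ \min_{\theta_1, \dots, \theta_k} \; \frac{1}{n} \sum_{i=1}^{n} \min_{j \in [k]} \big( y_i - \langle \theta_j, x_i \rangle \big)^2 . \]
The NumPy sketch below implements this objective together with a basic alternating-minimization loop (assign each point to the line that currently fits it best, then refit each line by least squares). The function names, random initialization, and fixed iteration count are illustrative assumptions; this is a minimal sketch of the general AM idea, not the paper's exact algorithm or initialization scheme.

```python
import numpy as np

def min_loss(X, y, thetas):
    """List-decoding loss: for each point, take the minimum squared error
    over the k component predictions, then average over the dataset."""
    residuals = (y[:, None] - X @ thetas.T) ** 2   # shape (n, k)
    return residuals.min(axis=1).mean()

def am_best_fit_lines(X, y, k, n_iters=50, seed=0):
    """Alternating-minimization heuristic for k best-fit lines:
    (1) assign every point to the component with the smallest residual,
    (2) refit each component by ordinary least squares on its points."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    thetas = rng.normal(size=(k, d))               # illustrative random init
    for _ in range(n_iters):
        labels = ((y[:, None] - X @ thetas.T) ** 2).argmin(axis=1)
        for j in range(k):
            mask = labels == j
            if mask.sum() >= d:                    # enough points to refit
                thetas[j], *_ = np.linalg.lstsq(X[mask], y[mask], rcond=None)
    return thetas
```

Each assignment step can only decrease the empirical min-loss, and each refit step minimizes it exactly for the current assignment, so the objective is monotonically non-increasing; the paper's contribution is characterizing when such iterations reach (an approximation of) the ERM solution without assuming the data was generated by a mixed linear model.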
Related papers
- Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
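As background for the gradient-free setting, here is a minimal sketch of the classical two-point zeroth-order gradient estimator driving a plain SGD loop; it illustrates the general mechanism only, not the accelerated algorithm or the noise models analyzed in this paper, and all names and step sizes are illustrative.

```python
import numpy as np

def two_point_grad_estimate(f, x, mu=1e-4, rng=None):
    """Classical two-point zeroth-order estimate: sample a random unit
    direction u and take a symmetric finite difference along it; its
    expectation approximates grad f(x) for a small smoothing radius mu."""
    rng = rng or np.random.default_rng()
    u = rng.normal(size=x.shape)
    u /= np.linalg.norm(u)                 # uniform direction on the sphere
    return x.size * (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u

def zeroth_order_sgd(f, x0, lr=0.1, n_steps=1000):
    """Plain (unaccelerated) SGD driven by the two-point estimate."""
    x = x0.astype(float).copy()
    for _ in range(n_steps):
        x -= lr * two_point_grad_estimate(f, x)
    return x
```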
arXiv Detail & Related papers (2024-11-21T10:26:17Z) - Attribute-to-Delete: Machine Unlearning via Datamodel Matching [65.13151619119782]
Machine unlearning -- efficiently removing the influence of a small "forget set" of training data from a pre-trained machine learning model -- has recently attracted interest.
Recent research shows that machine unlearning techniques do not hold up in such a challenging setting.
arXiv Detail & Related papers (2024-10-30T17:20:10Z) - Agnostic Learning of Mixed Linear Regressions with EM and AM Algorithms [22.79595679373698]
Mixed linear regression is a well-studied problem in statistics and machine learning.
In this paper, we consider the more general problem of agnostic learning of mixed linear regression from samples.
We show that the AM and EM algorithms lead to agnostic learning of mixed linear regression by converging to the population loss minimizers.
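For contrast with hard-assignment AM, below is a textbook soft-assignment EM sketch for mixed linear regression; it assumes Gaussian noise with a fixed, known scale and uniform mixing weights purely for brevity, and is not necessarily the exact algorithm analyzed in the referenced paper.

```python
import numpy as np

def em_mixed_linear_regression(X, y, k, sigma=1.0, n_iters=100, seed=0):
    """Textbook EM for k-component mixed linear regression with fixed
    noise scale sigma and uniform mixing weights (illustrative choices)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    thetas = rng.normal(size=(k, d))
    for _ in range(n_iters):
        # E-step: responsibilities proportional to per-component likelihoods.
        logw = -((y[:, None] - X @ thetas.T) ** 2) / (2.0 * sigma ** 2)
        w = np.exp(logw - logw.max(axis=1, keepdims=True))  # stable softmax
        w /= w.sum(axis=1, keepdims=True)
        # M-step: weighted least squares per component (tiny ridge for safety).
        for j in range(k):
            Wx = w[:, j][:, None] * X
            thetas[j] = np.linalg.solve(X.T @ Wx + 1e-8 * np.eye(d), Wx.T @ y)
    return thetas
```

Where AM makes a hard choice of component for every point, EM reweights all components softly; both alternate between (re)assigning points and refitting the component parameters.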
arXiv Detail & Related papers (2024-06-03T09:43:24Z) - On the Performance of Empirical Risk Minimization with Smoothed Data [59.3428024282545]
We show that Empirical Risk Minimization (ERM) is able to achieve sublinear error whenever a class is learnable with iid data.
arXiv Detail & Related papers (2024-02-22T21:55:41Z) - Who Should Predict? Exact Algorithms For Learning to Defer to Humans [40.22768241509553]
We show that prior approaches can fail to find a human-AI system with low misclassification error.
We give a mixed-integer-linear-programming (MILP) formulation that can optimally solve the problem in the linear setting.
We provide a novel surrogate loss function that is realizable-consistent and performs well empirically.
arXiv Detail & Related papers (2023-01-15T21:57:36Z) - Minimax rate of consistency for linear models with missing values [0.0]
Missing values arise in most real-world data sets due to the aggregation of multiple sources and intrinsically missing information (sensor failure, unanswered questions in surveys...).
In this paper, we focus on the extensively studied linear models, but in the presence of missing values, which turns out to be quite a challenging task.
This eventually requires solving a number of learning tasks that is exponential in the number of input features, which makes predictions impossible for current real-world datasets.
arXiv Detail & Related papers (2022-02-03T08:45:34Z) - Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the importance-guided stochastic gradient descent (IGSGD) method to train models to perform inference from inputs containing missing values, without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z) - Piecewise linear regression and classification [0.20305676256390928]
This paper proposes a method for solving multivariate regression and classification problems using piecewise linear predictors.
A Python implementation of the algorithm described in this paper is available at http://cse.lab.imtlucca.it/bemporad/parc.
arXiv Detail & Related papers (2021-03-10T17:07:57Z) - Matrix Completion with Quantified Uncertainty through Low Rank Gaussian Copula [30.84155327760468]
This paper proposes a framework for missing value imputation with quantified uncertainty.
The time required to fit the model scales linearly with the number of rows and the number of columns in the dataset.
Empirical results show the method yields state-of-the-art imputation accuracy across a wide range of data types.
arXiv Detail & Related papers (2020-06-18T19:51:42Z) - Robust Compressed Sensing using Generative Models [98.64228459705859]
In this paper we propose an algorithm inspired by the Median-of-Means (MOM) estimator.
Our algorithm guarantees recovery for heavy-tailed data, even in the presence of outliers.
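For readers unfamiliar with the technique, a minimal sketch of the classical Median-of-Means estimator (for mean estimation) follows; the paper adapts the MOM idea to compressed sensing with generative models, which this sketch does not attempt.

```python
import numpy as np

def median_of_means(samples, n_blocks=10, rng=None):
    """Classical Median-of-Means: split the data into blocks, average each
    block, and return the median of the block means. A few heavy-tailed
    or outlying blocks cannot move the median far, hence the robustness."""
    rng = rng or np.random.default_rng()
    samples = np.asarray(samples, dtype=float)
    perm = rng.permutation(len(samples))           # random block assignment
    blocks = np.array_split(samples[perm], n_blocks)
    return np.median([b.mean() for b in blocks])
```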
arXiv Detail & Related papers (2020-06-16T19:07:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.