Max-Linear Regression by Convex Programming
- URL: http://arxiv.org/abs/2103.07020v2
- Date: Sat, 24 Feb 2024 01:28:20 GMT
- Title: Max-Linear Regression by Convex Programming
- Authors: Seonho Kim, Sohail Bahmani, and Kiryung Lee
- Abstract summary: We formulate and analyze a scalable convex program given by anchored regression (AR) as the estimator for the max-linear regression problem.
Our result shows that the number of noise-free observations sufficient for exact recovery scales as $k^{4}p$ up to a logarithmic factor.
- Score: 5.366354612549172
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the multivariate max-linear regression problem where the model
parameters
$\boldsymbol{\beta}_{1},\dotsc,\boldsymbol{\beta}_{k}\in\mathbb{R}^{p}$ need to
be estimated from $n$ independent samples of the (noisy) observations $y =
\max_{1\leq j \leq k} \boldsymbol{\beta}_{j}^{\mathsf{T}} \boldsymbol{x} +
\mathrm{noise}$. The max-linear model vastly generalizes the conventional
linear model, and it can approximate any convex function to an arbitrary
accuracy when the number of linear models $k$ is large enough. However, the
inherent nonlinearity of the max-linear model renders the estimation of the
regression parameters computationally challenging. Particularly, no estimator
based on convex programming is known in the literature. We formulate and
analyze a scalable convex program given by anchored regression (AR) as the
estimator for the max-linear regression problem. Under the standard Gaussian
observation setting, we present a non-asymptotic performance guarantee showing
that the convex program recovers the parameters with high probability. When the
$k$ linear components are equally likely to achieve the maximum, our result
shows that the number of noise-free observations sufficient for exact recovery scales
as $k^{4}p$ up to a logarithmic factor. This sample complexity coincides
with that of alternating minimization (Ghosh et al., 2021). Moreover, the
same sample complexity applies when the observations are corrupted with
arbitrary deterministic noise. We provide empirical results showing that our
method performs as our theoretical result predicts and is competitive with the
alternating minimization algorithm, particularly in the presence of multiplicative
Bernoulli noise. Furthermore, we also show empirically that a recursive
application of AR can significantly improve the estimation accuracy.
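To make the model and the convex-programming idea concrete, the following is a minimal sketch (not the authors' code) of the general anchored-regression recipe applied to noise-free max-linear data: maximize alignment with a set of anchor directions subject to the convex constraint that every linear component stays below each observed maximum. The anchor matrix `A`, the problem sizes, and the noisy-copy anchor construction are illustrative assumptions rather than the paper's exact formulation; numpy and cvxpy are assumed to be available.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, p, k = 2000, 10, 3                      # samples, dimension, number of linear components

# Ground-truth parameters and standard Gaussian covariates (the setting in the abstract).
B_true = rng.normal(size=(k, p))
X = rng.normal(size=(n, p))
y = (X @ B_true.T).max(axis=1)             # noise-free max-linear observations

# Hypothetical anchors: noisy copies of the truth, used only to illustrate the program;
# in practice the anchor directions would come from some pilot estimate.
A = B_true + 0.5 * rng.normal(size=(k, p))

# Anchored-regression-style convex program: maximize anchor alignment subject to
# <beta_j, x_i> <= y_i for every component j and sample i (a convex feasible set).
B = cp.Variable((k, p))
objective = cp.Maximize(cp.sum(cp.multiply(A, B)))
constraints = [X @ B.T <= np.outer(y, np.ones(k))]
cp.Problem(objective, constraints).solve()

print("relative error:", np.linalg.norm(B.value - B_true) / np.linalg.norm(B_true))
```

Because the maximum of linear functions is convex in the parameters, the constraint set is an intersection of $nk$ linear inequalities and the program above is a linear program, which is what makes this kind of estimator scalable; how the anchors should be chosen and how many samples suffice is what the paper's analysis addresses.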
Related papers
- Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z) - Max-affine regression via first-order methods [7.12511675782289]
The max-affine model ubiquitously arises in applications in signal processing and statistics.
We present a non-asymptotic convergence analysis of gradient descent (GD) and mini-batch stochastic gradient descent (SGD) for max-affine regression.
arXiv Detail & Related papers (2023-08-15T23:46:44Z) - Probabilistic Unrolling: Scalable, Inverse-Free Maximum Likelihood
Estimation for Latent Gaussian Models [69.22568644711113]
We introduce probabilistic unrolling, a method that combines Monte Carlo sampling with iterative linear solvers to circumvent matrix inversions.
Our theoretical analyses reveal that unrolling and backpropagation through the iterations of the solver can accelerate gradient estimation for maximum likelihood estimation.
In experiments on simulated and real data, we demonstrate that probabilistic unrolling learns latent Gaussian models up to an order of magnitude faster than gradient EM, with minimal losses in model performance.
arXiv Detail & Related papers (2023-06-05T21:08:34Z) - Mixed Regression via Approximate Message Passing [16.91276351457051]
We study the problem of regression in a generalized linear model (GLM) with multiple signals and latent variables.
In mixed linear regression, each observation comes from one of $L$ signal vectors (regressors), but we do not know which one.
In max-affine regression, each observation comes from the maximum of $L$ affine functions, each defined via a different signal vector.
arXiv Detail & Related papers (2023-04-05T04:59:59Z) - Optimal Online Generalized Linear Regression with Stochastic Noise and
Its Application to Heteroscedastic Bandits [88.6139446295537]
We study the problem of online generalized linear regression in the setting of a generalized linear model with possibly unbounded additive noise.
We provide a sharp analysis of the classical follow-the-regularized-leader (FTRL) algorithm to cope with the label noise.
We propose an algorithm based on FTRL to achieve the first variance-aware regret bound.
arXiv Detail & Related papers (2022-02-28T08:25:26Z) - Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$ samples.
We design a clipped gradient descent and provide an improved analysis under a more nuanced condition on the noise of gradients.
arXiv Detail & Related papers (2021-08-25T21:30:27Z) - A Precise Performance Analysis of Support Vector Regression [105.94855998235232]
We study the hard and soft support vector regression techniques applied to a set of $n$ linear measurements.
Our results are then used to optimally tune the parameters intervening in the design of hard and soft support vector regression algorithms.
arXiv Detail & Related papers (2021-05-21T14:26:28Z) - Robust High Dimensional Expectation Maximization Algorithm via Trimmed
Hard Thresholding [24.184520829631587]
We study the problem of estimating latent variable models with arbitrarily corrupted samples in high dimensional space.
We propose a method called Trimmed (Gradient) Expectation Maximization which adds a trimming gradient step.
We show that the algorithm is corruption-proofing and converges to the (near) optimal statistical rate geometrically.
arXiv Detail & Related papers (2020-10-19T15:00:35Z) - Estimating Stochastic Linear Combination of Non-linear Regressions
Efficiently and Scalably [23.372021234032363]
We show that when the sub-sample sizes are large enough, the estimation errors will not be sacrificed by too much.
To the best of our knowledge, this is the first work that provides theoretical guarantees for the stochastic linear combination of non-linear regressions model.
arXiv Detail & Related papers (2020-10-19T07:15:38Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^{*} \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is $L_4$-$L_2$ hypercontractive, $\mathbb{E}[XX^{\top}]$ has bounded condition number, and $\epsilon$ has bounded variance, and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z)