Robustly Learning Monotone Single-Index Models
- URL: http://arxiv.org/abs/2508.04670v1
- Date: Wed, 06 Aug 2025 17:37:06 GMT
- Title: Robustly Learning Monotone Single-Index Models
- Authors: Puqian Wang, Nikos Zarifis, Ilias Diakonikolas, Jelena Diakonikolas
- Abstract summary: We consider the basic problem of learning Single-Index Models with respect to the square loss under the Gaussian distribution. Our main contribution is the first computationally efficient algorithm for this learning task, achieving a constant factor approximation.
- Score: 37.42736399673992
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the basic problem of learning Single-Index Models with respect to the square loss under the Gaussian distribution in the presence of adversarial label noise. Our main contribution is the first computationally efficient algorithm for this learning task, achieving a constant factor approximation, that succeeds for the class of all monotone activations with bounded moment of order $2 + \zeta$, for $\zeta > 0$. This class in particular includes all monotone Lipschitz functions and even discontinuous functions like (possibly biased) halfspaces. Prior work for the case of unknown activation either does not attain a constant factor approximation or succeeds for a substantially smaller family of activations. The main conceptual novelty of our approach lies in developing an optimization framework that steps outside the boundaries of usual gradient methods and instead identifies a useful vector field to guide the algorithm updates by directly leveraging the problem structure, properties of Gaussian spaces, and regularity of monotone functions.
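The abstract contrasts the paper's vector-field updates with usual gradient methods. As a concrete reference point, the sketch below shows the classical GLMtron-style update for a single-index model, here with a known monotone activation and clean labels; it is the baseline the paper departs from, not the paper's algorithm, and all parameter choices (`sigma`, `eta`, the step count) are illustrative.

```python
import numpy as np

def glmtron_step(w, X, y, sigma, eta=0.1):
    """One GLMtron-style update (Kakade et al., 2011): move w along the
    average residual direction (y_i - sigma(w . x_i)) * x_i.  The paper
    replaces this gradient-like field with a structure-aware vector field;
    this is only the classical baseline."""
    residuals = y - sigma(X @ w)            # per-sample prediction errors
    return w + eta * (X.T @ residuals) / len(y)

# Illustrative run: Gaussian inputs, a known monotone activation, clean labels.
rng = np.random.default_rng(0)
d, n = 5, 2000
w_star = rng.normal(size=d)
w_star /= np.linalg.norm(w_star)
X = rng.normal(size=(n, d))                 # x ~ N(0, I_d)
sigma = np.tanh                             # a monotone Lipschitz activation
y = sigma(X @ w_star)

w = np.zeros(d)
for _ in range(200):
    w = glmtron_step(w, X, y, sigma)
print("square loss:", np.mean((sigma(X @ w) - y) ** 2))
```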
Related papers
- Robustly Learning Monotone Generalized Linear Models via Data Augmentation [37.42736399673992]
We give the first algorithm that achieves a constant-factor approximation for any monotone Lipschitz activation. Our work resolves a well-known open problem by developing a robust counterpart to the classical GLMtron algorithm.
arXiv Detail & Related papers (2025-02-12T17:59:21Z)
- Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity [59.75300530380427]
We consider the problem of optimizing second-order smooth and strongly convex functions where the algorithm has access only to noisy evaluations of the objective function it queries.
We provide the first tight characterization of the minimax simple regret rate by developing matching upper and lower bounds.
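For context on the oracle model, the sketch below shows the standard two-point zeroth-order gradient estimator built from noisy function evaluations; it is a generic textbook construction, not the paper's minimax-optimal procedure, and all parameters (`delta`, the step size) are illustrative.

```python
import numpy as np

def two_point_grad(f, x, delta=1e-2, rng=None):
    """Standard two-point zeroth-order gradient estimate: for u ~ N(0, I),
    (f(x + delta*u) - f(x - delta*u)) / (2*delta) * u approximates the
    gradient of a smoothed version of f using only function evaluations."""
    if rng is None:
        rng = np.random.default_rng()
    u = rng.normal(size=x.shape)
    return (f(x + delta * u) - f(x - delta * u)) / (2 * delta) * u

# Illustrative run on a strongly convex quadratic with noisy evaluations.
rng = np.random.default_rng(1)
f = lambda x: 0.5 * x @ x + rng.normal(scale=1e-3)  # noisy zeroth-order oracle
x = np.ones(4)
for _ in range(500):
    x -= 0.05 * two_point_grad(f, x, rng=rng)
print("distance to optimum:", np.linalg.norm(x))
```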
arXiv Detail & Related papers (2024-06-28T02:56:22Z)
- Inexact subgradient methods for semialgebraic functions [18.293072574300798]
Motivated by the extensive application of approximate gradients in machine learning, we investigate inexact subgradient methods subject to persistent additive errors. Our analysis addresses both the vanishing and the constant step-size regimes.
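To make the error model concrete, here is a generic inexact subgradient iteration with a persistent (non-vanishing) additive oracle error, run on a simple semialgebraic function; this illustrates the setting, not the paper's analysis, and the error magnitude and step sizes are assumptions.

```python
import numpy as np

def inexact_subgradient(subgrad, x0, steps, eps, step, rng):
    """Subgradient method whose oracle carries a persistent additive error:
    each query returns subgrad(x) + e with ||e||_inf <= eps, and the error
    does not vanish along the iterations."""
    x = np.asarray(x0, dtype=float)
    for k in range(steps):
        e = eps * rng.uniform(-1.0, 1.0, size=x.shape)  # persistent error
        x = x - step(k) * (subgrad(x) + e)
    return x

# Illustrative run on the nonsmooth semialgebraic function f(x) = ||x||_1.
rng = np.random.default_rng(2)
subgrad_l1 = lambda x: np.sign(x)           # a subgradient of the l1 norm
x = inexact_subgradient(subgrad_l1, x0=np.full(3, 2.0), steps=300, eps=0.1,
                        step=lambda k: 1.0 / (k + 5), rng=rng)  # vanishing steps
print("final point:", x)
```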
arXiv Detail & Related papers (2024-04-30T12:47:42Z)
- Robustly Learning Single-Index Models via Alignment Sharpness [40.886706402941435]
We study the problem of learning Single-Index Models under the $L_2^2$ loss in the agnostic model.
We give an efficient learning algorithm, achieving a constant factor approximation to the optimal loss.
arXiv Detail & Related papers (2024-02-27T18:48:07Z)
- Efficient Model-Free Exploration in Low-Rank MDPs [76.87340323826945]
Low-Rank Markov Decision Processes offer a simple, yet expressive framework for RL with function approximation.
Existing algorithms are either (1) computationally intractable, or (2) reliant upon restrictive statistical assumptions.
We propose the first provably sample-efficient algorithm for exploration in Low-Rank MDPs.
arXiv Detail & Related papers (2023-07-08T15:41:48Z)
- Stochastic Submodular Maximization via Polynomial Estimators [13.498923494159312]
We focus on maximizing submodular functions that are defined as expectations over a class of submodular functions with an unknown distribution.
We show that for monotone functions of this form, the continuous greedy algorithm attains an approximation ratio (in expectation) arbitrarily close to $(1-1/e) \approx 63\%$ using a polynomial estimation of the gradient.
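For intuition about the $(1-1/e)$ guarantee, the sketch below runs the textbook greedy algorithm for monotone submodular maximization under a cardinality constraint on a small coverage function; the paper's contribution, a polynomial estimator inside the stochastic continuous greedy algorithm, is not implemented here.

```python
# Textbook greedy for max f(S) s.t. |S| <= k with f monotone submodular;
# it guarantees f(S) >= (1 - 1/e) * OPT (Nemhauser, Wolsey, Fisher, 1978).

def greedy(f, ground_set, k):
    S = set()
    for _ in range(k):
        # add the element with the largest marginal gain f(S + e) - f(S)
        best = max((e for e in ground_set if e not in S),
                   key=lambda e: f(S | {e}) - f(S))
        S.add(best)
    return S

# Coverage functions are monotone submodular: f(S) = #items covered by S.
sets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6}, "d": {1, 6}}
coverage = lambda S: len(set().union(*(sets[e] for e in S)))
print(greedy(coverage, list(sets), k=2))  # picks {'a', 'c'}, covering all 6 items
```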
arXiv Detail & Related papers (2023-03-17T13:32:33Z)
- Gradient flows and randomised thresholding: sparse inversion and classification [0.0]
Sparse inversion and classification problems are ubiquitous in modern data science and imaging.
In classification, we consider, e.g., the sum of a data fidelity term and a non-smooth Ginzburg-Landau energy.
Standard (sub)gradient descent methods have been shown to be inefficient when approaching such problems.
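As a point of comparison for the (sub)gradient methods the entry calls inefficient, here is the classical proximal-gradient / iterative soft thresholding (ISTA) baseline for sparse inversion, $\min_x \tfrac{1}{2}\|Ax-b\|^2 + \lambda\|x\|_1$; this is the standard baseline, not the paper's randomized-thresholding gradient flow, and the problem sizes are illustrative.

```python
import numpy as np

def ista(A, b, lam, steps=1000):
    """Iterative soft thresholding for min_x 0.5*||Ax - b||^2 + lam*||x||_1:
    a gradient step on the smooth fidelity term, then the proximal map of
    lam*||.||_1 (coordinatewise soft thresholding)."""
    L = np.linalg.norm(A, 2) ** 2           # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        z = x - A.T @ (A @ x - b) / L       # gradient step on the fidelity term
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return x

# Illustrative sparse inversion: recover a 3-sparse signal in 100 dimensions
# from 50 Gaussian measurements.
rng = np.random.default_rng(3)
A = rng.normal(size=(50, 100))
x_true = np.zeros(100)
x_true[[5, 17, 42]] = [3.0, -2.0, 1.5]
b = A @ x_true
x_hat = ista(A, b, lam=0.1)
print("recovered support:", np.nonzero(np.abs(x_hat) > 0.5)[0])
```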
arXiv Detail & Related papers (2022-03-22T09:21:14Z)
- Agnostic Proper Learning of Halfspaces under Gaussian Marginals [56.01192577666607]
We study the problem of agnostically learning halfspaces under the Gaussian distribution.
Our main result is the first proper learning algorithm for this problem.
arXiv Detail & Related papers (2021-02-10T18:40:44Z)
- A Polynomial Time Algorithm for Learning Halfspaces with Tsybakov Noise [55.45544638410858]
We study the problem of PAC learning homogeneous halfspaces in the presence of Tsybakov noise.
Our algorithm learns the true halfspace within any accuracy $\epsilon$.
arXiv Detail & Related papers (2020-10-04T22:19:06Z)
- Learning Halfspaces with Tsybakov Noise [50.659479930171585]
We study the learnability of halfspaces in the presence of Tsybakov noise.
We give an algorithm that achieves misclassification error $\epsilon$ with respect to the true halfspace.
arXiv Detail & Related papers (2020-06-11T14:25:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.