Learning and Testing Convex Functions
- URL: http://arxiv.org/abs/2511.11498v1
- Date: Fri, 14 Nov 2025 17:19:44 GMT
- Title: Learning and Testing Convex Functions
- Authors: Renato Ferreira Pinto, Cassandra Marcussen, Elchanan Mossel, Shivam Nadimpalli
- Abstract summary: We consider the problems of \emph{learning} and \emph{testing} real-valued convex functions over Gaussian space. Despite the extensive study of function convexity across mathematics, its learnability and testability have largely been examined only in discrete or restricted settings.
- Score: 18.95992615547965
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We consider the problems of \emph{learning} and \emph{testing} real-valued convex functions over Gaussian space. Despite the extensive study of function convexity across mathematics, statistics, and computer science, its learnability and testability have largely been examined only in discrete or restricted settings -- typically with respect to the Hamming distance, which is ill-suited for real-valued functions. In contrast, we study these problems in high dimensions under the standard Gaussian measure, assuming sample access to the function and a mild smoothness condition, namely Lipschitzness. A smoothness assumption is natural and, in fact, necessary even in one dimension: without it, convexity cannot be inferred from finitely many samples. As our main results, we give: - Learning Convex Functions: An agnostic proper learning algorithm for Lipschitz convex functions that achieves error $\varepsilon$ using $n^{O(1/\varepsilon^2)}$ samples, together with a complementary lower bound of $n^{\mathrm{poly}(1/\varepsilon)}$ samples in the \emph{correlational statistical query (CSQ)} model. - Testing Convex Functions: A tolerant (two-sided) tester for convexity of Lipschitz functions with the same sample complexity (as a corollary of our learning result), and a one-sided tester (which never rejects convex functions) using $O(\sqrt{n}/\varepsilon)^n$ samples.
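To make the one-sided testing guarantee concrete, here is a minimal one-dimensional sketch (our own illustration, not the paper's high-dimensional algorithm). Samples with distinct inputs are consistent with some convex function exactly when the secant slopes between consecutive sorted points are nondecreasing, so a decreasing pair of slopes certifies non-convexity, and the check never rejects a convex function. The function name and tolerance parameter are ours.

```python
import numpy as np

def certifies_nonconvexity_1d(xs, ys, tol=1e-9):
    """One-sided convexity check on 1-D samples (xs[i], ys[i] = f(xs[i])).

    Sorted samples with distinct inputs are consistent with some convex
    function iff the secant slopes are nondecreasing; a strict decrease
    is a certificate of non-convexity, so a convex f is never rejected.
    """
    order = np.argsort(xs)
    x = np.asarray(xs, dtype=float)[order]
    y = np.asarray(ys, dtype=float)[order]
    # Assumes distinct xs, which holds almost surely for Gaussian samples.
    slopes = np.diff(y) / np.diff(x)
    return bool(np.any(slopes[1:] < slopes[:-1] - tol))

# Example: x^2 is convex (never rejected); |x| - x^2 is not.
x = np.random.default_rng(0).normal(size=200)
assert not certifies_nonconvexity_1d(x, x**2)
assert certifies_nonconvexity_1d(x, np.abs(x) - x**2)
```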
Related papers
- Sample and Computationally Efficient Robust Learning of Gaussian Single-Index Models [37.42736399673992]
A single-index model (SIM) is a function of the form $\sigma(\mathbf{w}^{\ast} \cdot \mathbf{x})$, where $\sigma: \mathbb{R} \to \mathbb{R}$ is a known link function and $\mathbf{w}^{\ast}$ is a hidden unit vector.
We show that a proper learner attains $L_2$-error of $O(\mathrm{OPT})+\epsilon$.
arXiv Detail & Related papers (2024-11-08T17:10:38Z) - Single-Loop Stochastic Algorithms for Difference of Max-Structured Weakly Convex Functions [41.43895948769255]
We study a class of non-smooth, non-convex problems of the form $\min_x[\max_{y \in Y}\phi(x, y) - \max_{z \in Z}\psi(x, z)]$.
We propose a stochastic Moreau envelope approximate gradient method (SMAG), the first method for solving these problems, and provide a state-of-the-art non-asymptotic convergence rate.
arXiv Detail & Related papers (2024-05-28T20:52:46Z) - Agnostically Learning Multi-index Models with Queries [54.290489524576756]
We study the power of query access for the task of agnostic learning under the Gaussian distribution.
We show that query access gives significant runtime improvements over random examples for agnostically learning MIMs.
arXiv Detail & Related papers (2023-12-27T15:50:47Z) - The Sample Complexity Of ERMs In Stochastic Convex Optimization [13.896417716930687]
We show that in fact $\tilde{O}(\frac{d}{\epsilon}+\frac{1}{\epsilon^2})$ data points are also sufficient.
We further generalize the result and show that a similar upper bound holds for all convex bodies.
arXiv Detail & Related papers (2023-11-09T14:29:25Z) - Near-Optimal Bounds for Learning Gaussian Halfspaces with Random Classification Noise [50.64137465792738]
We show that any efficient SQ algorithm for the problem requires sample complexity at least $\Omega(d^{1/2}/(\max\{p, \epsilon\})^2)$.
Our lower bound suggests that this quadratic dependence on $1/epsilon$ is inherent for efficient algorithms.
arXiv Detail & Related papers (2023-07-13T18:59:28Z) - A Law of Robustness beyond Isoperimetry [84.33752026418045]
We prove a Lipschitzness lower bound of $\Omega(\sqrt{n/p})$ on the robustness of interpolating neural network parameters on arbitrary distributions.
We then show the potential benefit of overparametrization for smooth data when $n=\mathrm{poly}(d)$.
We disprove the potential existence of an $O(1)$-Lipschitz robust interpolating function when $n=\exp(\omega(d))$.
arXiv Detail & Related papers (2022-02-23T16:10:23Z) - The Sample Complexity of Robust Covariance Testing [56.98280399449707]
We are given i.i.d. samples from a distribution of the form $Z = (1-\epsilon) X + \epsilon B$, where $X$ is a zero-mean Gaussian $\mathcal{N}(0, \Sigma)$ with unknown covariance.
In the absence of contamination, prior work gave a simple tester for this hypothesis testing task that uses $O(d)$ samples.
We prove a sample complexity lower bound of $\Omega(d^2)$ for $\epsilon$ an arbitrarily small constant.
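For intuition, here is a short sketch (our own code, with illustrative names) of drawing from the contamination model $Z = (1-\epsilon) X + \epsilon B$ above: each sample is Gaussian with probability $1-\epsilon$ and adversarial with probability $\epsilon$.

```python
import numpy as np

def sample_contaminated(n, Sigma, eps, adversary, seed=None):
    """Draw n samples from Z = (1 - eps) X + eps B: with probability
    1 - eps a sample comes from the Gaussian N(0, Sigma), and with
    probability eps from an arbitrary adversarial distribution B."""
    rng = np.random.default_rng(seed)
    d = Sigma.shape[0]
    clean = rng.multivariate_normal(np.zeros(d), Sigma, size=n)
    dirty = adversary(n)                # must return an (n, d) array
    mask = rng.random(n) < eps          # True marks contaminated rows
    return np.where(mask[:, None], dirty, clean)

# Example: 5% of samples replaced by a distant point mass.
Z = sample_contaminated(1000, np.eye(3), 0.05,
                        adversary=lambda n: np.full((n, 3), 10.0), seed=1)
```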
arXiv Detail & Related papers (2020-12-31T18:24:41Z) - Optimal Testing of Discrete Distributions with High Probability [49.19942805582874]
We study the problem of testing discrete distributions with a focus on the high probability regime.
We provide the first algorithms for closeness and independence testing that are sample-optimal, within constant factors.
arXiv Detail & Related papers (2020-09-14T16:09:17Z) - Downsampling for Testing and Learning in Product Distributions [24.48103093661132]
We study distribution-free property testing and learning problems where the unknown probability distribution is a product distribution over $\mathbb{R}^d$.
For many important classes of functions, such as intersections of halfspaces, threshold functions, convex sets, and $k$-alternating functions, known algorithms have complexity that depends on the support size of the distribution.
We introduce a general method, which we call downsampling, that resolves these issues.
arXiv Detail & Related papers (2020-07-15T02:46:44Z) - Convex Regression in Multidimensions: Suboptimality of Least Squares Estimators [4.758195657080579]
Least Squares Estimators are shown to be suboptimal for estimating a $d$-dimensional convex function in squared error loss when the dimension $d$ is 5 or larger.
The specific function classes considered include: (i) bounded convex functions supported on a polytope (in random design), (ii) Lipschitz convex functions supported on any convex domain (in random design), and (iii) convex functions supported on a polytope (in fixed design).
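To illustrate the estimator under discussion, here is the textbook convex least squares formulation as a small cvxpy sketch (our own code, not from the paper): fit values $f_i$ and subgradients $g_i$ at the sample points under the first-order convexity constraints. The $O(n^2)$ constraints limit this to modest sample sizes.

```python
import cvxpy as cp
import numpy as np

def convex_lse(X, y):
    """Least squares estimator over convex functions: choose fitted
    values f_i and subgradients g_i at the sample points, subject to
    f_j >= f_i + g_i . (x_j - x_i) for all i != j (first-order convexity)."""
    n, d = X.shape
    f = cp.Variable(n)
    g = cp.Variable((n, d))
    constraints = [f[j] >= f[i] + g[i] @ (X[j] - X[i])
                   for i in range(n) for j in range(n) if i != j]
    cp.Problem(cp.Minimize(cp.sum_squares(f - y)), constraints).solve()
    return f.value, g.value

# Example: noisy samples of the convex function ||x||^2.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
y = (X**2).sum(axis=1) + 0.1 * rng.normal(size=30)
f_hat, g_hat = convex_lse(X, y)
```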
arXiv Detail & Related papers (2020-06-03T04:57:05Z) - Agnostic Learning of a Single Neuron with Gradient Descent [92.7662890047311]
We consider the problem of learning the best-fitting single neuron as measured by the expected square loss.
For the ReLU activation, our population risk guarantee is $O(\mathsf{OPT}^{1/2})+\epsilon$.
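As a minimal sketch of the setting in this entry (our own toy code, not the paper's analysis), here is gradient descent on the empirical square loss of a single ReLU neuron $x \mapsto \max(0, w \cdot x)$:

```python
import numpy as np

def fit_relu_neuron(X, y, lr=0.1, steps=1000, seed=None):
    """Gradient descent on the empirical square loss
    (1/n) * sum_i (max(0, w . x_i) - y_i)^2 of a single ReLU neuron."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = rng.normal(size=d) / np.sqrt(d)   # small random initialization
    for _ in range(steps):
        z = X @ w
        residual = np.maximum(z, 0.0) - y
        # Chain rule: relu'(z) = 1[z > 0] (subgradient 0 taken at z = 0).
        grad = (2.0 / n) * (X.T @ (residual * (z > 0)))
        w -= lr * grad
    return w

# Example: recover a planted neuron from noiseless Gaussian samples.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
w_star = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
w_hat = fit_relu_neuron(X, np.maximum(X @ w_star, 0.0), seed=0)
```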
arXiv Detail & Related papers (2020-05-29T07:20:35Z)