Convex Regression with a Penalty
- URL: http://arxiv.org/abs/2509.19788v1
- Date: Wed, 24 Sep 2025 06:19:21 GMT
- Title: Convex Regression with a Penalty
- Authors: Eunji Lim
- Abstract summary: A common way to estimate an unknown convex regression function $f_0: \Omega \subset \mathbb{R}^d \rightarrow \mathbb{R}$ from a set of $n$ noisy observations is to fit a convex function that minimizes the sum of squared errors. We introduce a new estimator of $f_0$ that avoids the boundary overfitting of this approach by minimizing a penalty on the subgradient while enforcing an upper bound $s_n$ on the sum of squared errors.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: A common way to estimate an unknown convex regression function $f_0: \Omega \subset \mathbb{R}^d \rightarrow \mathbb{R}$ from a set of $n$ noisy observations is to fit a convex function that minimizes the sum of squared errors. However, this estimator is known for its tendency to overfit near the boundary of $\Omega$, posing significant challenges in real-world applications. In this paper, we introduce a new estimator of $f_0$ that avoids this overfitting by minimizing a penalty on the subgradient while enforcing an upper bound $s_n$ on the sum of squared errors. The key advantage of this method is that $s_n$ can be directly estimated from the data. We establish the uniform almost sure consistency of the proposed estimator and its subgradient over $\Omega$ as $n \rightarrow \infty$ and derive convergence rates. The effectiveness of our estimator is illustrated through its application to estimating waiting times in a single-server queue.
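To make the construction concrete, here is a minimal sketch of the estimator as a convex program, assuming a squared-norm penalty on the subgradients and a hand-picked $s_n$; the paper's exact penalty and its data-driven rule for choosing $s_n$ may differ, and the variable names, test function, and use of cvxpy are illustrative choices rather than the authors' code.

```python
# Minimal sketch of the penalized convex regression estimator described in the
# abstract: minimize a penalty on the subgradients subject to an upper bound
# s_n on the sum of squared errors. The squared-norm form of the penalty and
# the hard-coded s_n are assumptions; the paper estimates s_n from the data.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, d = 50, 2
X = rng.uniform(-1.0, 1.0, size=(n, d))                   # design points in Omega
Y = np.sum(X**2, axis=1) + 0.1 * rng.standard_normal(n)   # noisy values of f_0(x) = ||x||^2

s_n = n * 0.1**2          # assumed noise budget; the paper derives s_n from data

g = cp.Variable(n)        # fitted values g_i = f_hat(X_i)
xi = cp.Variable((n, d))  # subgradients xi_i of f_hat at X_i

# The squared-error bound replaces the usual least-squares objective.
constraints = [cp.sum_squares(Y - g) <= s_n]
# Convexity: g_j >= g_i + xi_i^T (X_j - X_i) for every pair (i, j).
for i in range(n):
    constraints.append(g >= g[i] + (X - X[i]) @ xi[i])

# Penalize the subgradients instead of the squared errors.
problem = cp.Problem(cp.Minimize(cp.sum_squares(xi)), constraints)
problem.solve()
print("fitted values at the first five design points:", g.value[:5])
```

The pairwise constraints are the standard finite-dimensional characterization of convexity used in convex regression; relative to the least-squares estimator, the structural change is that the squared-error term moves from the objective into a constraint bounded by $s_n$, while the subgradient penalty becomes the objective.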
Related papers
- Information-Computation Tradeoffs for Noiseless Linear Regression with Oblivious Contamination [65.37519531362157]
We show that any efficient Statistical Query algorithm for this task requires VSTAT complexity at least $\tilde{\Omega}(d^{1/2}/\alpha^2)$.
arXiv Detail & Related papers (2025-10-12T15:42:44Z) - Outlier-robust Mean Estimation near the Breakdown Point via Sum-of-Squares [4.335413713700667]
We give a new analysis of the canonical sum-of-squares program introduced by Kothari et al. (2018).
We show that this program efficiently achieves optimal error rate for all $\varepsilon \in [0, \frac{1}{2})$.
arXiv Detail & Related papers (2024-11-21T16:57:05Z) - Improved Regret Bound for Safe Reinforcement Learning via Tighter Cost Pessimism and Reward Optimism [1.4999444543328293]
We propose a model-based algorithm based on novel cost and reward function estimators.
Our algorithm achieves a regret upper bound of $\widetilde{\mathcal{O}}((\bar{C} - \bar{C}_b)^{-1} H^{2.5} S \sqrt{AK})$, where $\bar{C}$ is the cost budget for an episode.
arXiv Detail & Related papers (2024-10-14T04:51:06Z) - Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic Shortest Path [80.60592344361073]
We study the Stochastic Shortest Path (SSP) problem with a linear mixture transition kernel.
An agent repeatedly interacts with an environment and seeks to reach a certain goal state while minimizing the cumulative cost.
Existing works often assume a strictly positive lower bound on the cost function or an upper bound on the expected length of the optimal policy.
arXiv Detail & Related papers (2024-02-14T07:52:00Z) - Robust Nonparametric Regression under Poisoning Attack [13.470899588917716]
An adversarial attacker can modify the values of up to $q$ samples from a training dataset of size $N$.
Our initial solution is an M-estimator based on Huber loss minimization; a generic sketch of this building block appears after this list.
The final estimate is nearly minimax optimal for arbitrary $q$, up to a $\ln N$ factor.
arXiv Detail & Related papers (2023-05-26T09:33:17Z) - Estimating the minimizer and the minimum value of a regression function under passive design [72.85024381807466]
We propose a new method for estimating the minimizer $\boldsymbol{x}^*$ and the minimum value $f^*$ of a smooth and strongly convex regression function $f$.
We derive non-asymptotic upper bounds for the quadratic risk and optimization error of $\boldsymbol{z}_n$, and for the risk of estimating $f^*$.
arXiv Detail & Related papers (2022-11-29T18:38:40Z) - A Law of Robustness beyond Isoperimetry [84.33752026418045]
We prove a Lipschitzness lower bound of $\Omega(\sqrt{n/p})$ for robust interpolation by neural networks with $p$ parameters on arbitrary distributions.
We then show the potential benefit of overparametrization for smooth data when $n = \mathrm{poly}(d)$.
We disprove the potential existence of an $O(1)$-Lipschitz robust interpolating function when $n = \exp(\omega(d))$.
arXiv Detail & Related papers (2022-02-23T16:10:23Z) - AI without networks [0.0]
We develop a network-free framework for AI incorporating generative modeling.
We demonstrate this framework with examples from three different disciplines - ethology, control theory, and mathematics.
We also propose an easily computed method of credit assignment based on this framework, to help address ethical-legal challenges raised by generative AI.
arXiv Detail & Related papers (2021-06-07T05:50:02Z) - Optimal Mean Estimation without a Variance [103.26777953032537]
We study the problem of heavy-tailed mean estimation in settings where the variance of the data-generating distribution does not exist.
We design an estimator which attains the smallest possible confidence interval as a function of $n$, $d$, and $\delta$.
arXiv Detail & Related papers (2020-11-24T22:39:21Z) - Out-of-sample error estimate for robust M-estimators with convex penalty [5.33024001730262]
A generic out-of-sample error estimate is proposed for robust $M$-estimators regularized with a convex penalty.
General differentiable loss functions $\rho$ are allowed, provided that $\psi = \rho'$ is 1-Lipschitz.
arXiv Detail & Related papers (2020-08-26T21:50:41Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is $L_4$-$L_2$ hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number, and $\epsilon$ has bounded variance; and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is...
arXiv Detail & Related papers (2020-07-16T06:44:44Z) - Agnostic Learning of a Single Neuron with Gradient Descent [92.7662890047311]
We consider the problem of learning the best-fitting single neuron as measured by the expected square loss.
For the ReLU activation, our population risk guarantee is $O(\mathsf{OPT}^{1/2}) + \epsilon$.
arXiv Detail & Related papers (2020-05-29T07:20:35Z)
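As a generic illustration of the Huber-loss M-estimation mentioned in the poisoning-attack entry above (not that paper's actual procedure), the following sketch estimates a location parameter under contamination; the threshold `delta`, the one-dimensional setting, and the contamination pattern are all illustrative assumptions.

```python
# Generic Huber-loss M-estimation of a location parameter; an illustrative
# sketch only, not the procedure from the poisoning-attack paper.
import numpy as np
from scipy.optimize import minimize_scalar

def huber(r, delta=1.0):
    """Huber loss: quadratic for |r| <= delta, linear beyond."""
    a = np.abs(r)
    return np.where(a <= delta, 0.5 * r**2, delta * (a - 0.5 * delta))

rng = np.random.default_rng(1)
clean = rng.normal(0.0, 1.0, 95)      # true location is 0
poisoned = np.full(5, 50.0)           # q = 5 adversarially modified samples
samples = np.concatenate([clean, poisoned])

# M-estimate: the location m minimizing the summed Huber loss of the residuals.
result = minimize_scalar(lambda m: huber(samples - m).sum())
print(f"Huber M-estimate: {result.x:.3f}   poisoned mean: {samples.mean():.3f}")
```

Because the Huber loss grows only linearly in the tails, the few modified samples pull the M-estimate far less than they pull the raw mean.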