Convex Regression with a Penalty
- URL: http://arxiv.org/abs/2509.19788v1
- Date: Wed, 24 Sep 2025 06:19:21 GMT
- Title: Convex Regression with a Penalty
- Authors: Eunji Lim
- Abstract summary: A common way to estimate an unknown convex regression function $f_0: \Omega \subset \mathbb{R}^d \rightarrow \mathbb{R}$ from a set of $n$ noisy observations is to fit a convex function that minimizes the sum of squared errors. We introduce a new estimator of $f_0$ that avoids the boundary overfitting of this approach by minimizing a penalty on the subgradient while enforcing an upper bound $s_n$ on the sum of squared errors.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: A common way to estimate an unknown convex regression function $f_0: \Omega \subset \mathbb{R}^d \rightarrow \mathbb{R}$ from a set of $n$ noisy observations is to fit a convex function that minimizes the sum of squared errors. However, this estimator is known for its tendency to overfit near the boundary of $\Omega$, posing significant challenges in real-world applications. In this paper, we introduce a new estimator of $f_0$ that avoids this overfitting by minimizing a penalty on the subgradient while enforcing an upper bound $s_n$ on the sum of squared errors. The key advantage of this method is that $s_n$ can be directly estimated from the data. We establish the uniform almost sure consistency of the proposed estimator and its subgradient over $\Omega$ as $n \rightarrow \infty$ and derive convergence rates. The effectiveness of our estimator is illustrated through its application to estimating waiting times in a single-server queue.
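To make the construction concrete, here is a minimal sketch of the estimator as a convex program, assuming a squared-norm penalty on the subgradients and a hand-picked $s_n$; the paper's exact penalty and its data-driven rule for choosing $s_n$ may differ, and the variable names, test function, and use of cvxpy are illustrative choices rather than the authors' code.

```python
# Minimal sketch of the penalized convex regression estimator described in the
# abstract: minimize a penalty on the subgradients subject to an upper bound
# s_n on the sum of squared errors. The squared-norm form of the penalty and
# the hard-coded s_n are assumptions; the paper estimates s_n from the data.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, d = 50, 2
X = rng.uniform(-1.0, 1.0, size=(n, d))                   # design points in Omega
Y = np.sum(X**2, axis=1) + 0.1 * rng.standard_normal(n)   # noisy values of f_0(x) = ||x||^2

s_n = n * 0.1**2          # assumed noise budget; the paper derives s_n from data

g = cp.Variable(n)        # fitted values g_i = f_hat(X_i)
xi = cp.Variable((n, d))  # subgradients xi_i of f_hat at X_i

# The squared-error bound replaces the usual least-squares objective.
constraints = [cp.sum_squares(Y - g) <= s_n]
# Convexity: g_j >= g_i + xi_i^T (X_j - X_i) for every pair (i, j).
for i in range(n):
    constraints.append(g >= g[i] + (X - X[i]) @ xi[i])

# Penalize the subgradients instead of the squared errors.
problem = cp.Problem(cp.Minimize(cp.sum_squares(xi)), constraints)
problem.solve()
print("fitted values at the first five design points:", g.value[:5])
```

The pairwise constraints are the standard finite-dimensional characterization of convexity used in convex regression; relative to the least-squares estimator, the structural change is that the squared-error term moves from the objective into a constraint bounded by $s_n$, while the subgradient penalty becomes the objective.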
Related papers
- Information-Computation Tradeoffs for Noiseless Linear Regression with Oblivious Contamination [65.37519531362157]
We show that any efficient Statistical Query algorithm for this task requires VSTAT complexity at least $\tilde{\Omega}(d^{1/2}/\alpha^2)$.
arXiv Detail & Related papers (2025-10-12T15:42:44Z) - Outlier-robust Mean Estimation near the Breakdown Point via Sum-of-Squares [4.335413713700667]
We give a new analysis of the canonical sum-of-squares program introduced by Kothari et al. (2018).
We show that this program efficiently achieves optimal error rate for all $\varepsilon \in [0, \frac{1}{2})$.
arXiv Detail & Related papers (2024-11-21T16:57:05Z) - Improved Regret Bound for Safe Reinforcement Learning via Tighter Cost Pessimism and Reward Optimism [1.4999444543328293]
We propose a model-based algorithm based on novel cost and reward function estimators.
Our algorithm achieves a regret upper bound of $\widetilde{\mathcal{O}}((\bar{C} - \bar{C}_b)^{-1} H^{2.5} S \sqrt{AK})$, where $\bar{C}$ is the cost budget for an episode.
arXiv Detail & Related papers (2024-10-14T04:51:06Z) - Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic Shortest Path [80.60592344361073]
We study the Stochastic Shortest Path (SSP) problem with a linear mixture transition kernel.
An agent repeatedly interacts with an environment and seeks to reach a certain goal state while minimizing the cumulative cost.
Existing works often assume a strictly positive lower bound on the cost function or an upper bound on the expected length of the optimal policy.
arXiv Detail & Related papers (2024-02-14T07:52:00Z) - Robust Nonparametric Regression under Poisoning Attack [13.470899588917716]
An adversarial attacker can modify the values of up to $q$ samples from a training dataset of size $N$.
Our initial solution is an M-estimator based on Huber loss minimization; a generic sketch of this building block appears after this list.
The final estimate is nearly minimax optimal for arbitrary $q$, up to a $\ln N$ factor.
arXiv Detail & Related papers (2023-05-26T09:33:17Z) - Estimating the minimizer and the minimum value of a regression function under passive design [72.85024381807466]
We propose a new method for estimating the minimizer $\boldsymbol{x}^*$ and the minimum value $f^*$ of a smooth and strongly convex regression function $f$.
We derive non-asymptotic upper bounds for the quadratic risk and optimization error of $\boldsymbol{z}_n$, and for the risk of estimating $f^*$.
arXiv Detail & Related papers (2022-11-29T18:38:40Z) - A Law of Robustness beyond Isoperimetry [84.33752026418045]
We prove a Lipschitzness lower bound of $\Omega(\sqrt{n/p})$ for robust interpolation by neural networks with $p$ parameters on arbitrary distributions.
We then show the potential benefit of overparametrization for smooth data when $n = \mathrm{poly}(d)$.
We disprove the potential existence of an $O(1)$-Lipschitz robust interpolating function when $n = \exp(\omega(d))$.
arXiv Detail & Related papers (2022-02-23T16:10:23Z) - AI without networks [0.0]
We develop a network-free framework for AI incorporating generative modeling.
We demonstrate this framework with examples from three different disciplines - ethology, control theory, and mathematics.
We also propose an easily computed method of credit assignment based on this framework, to help address ethical-legal challenges raised by generative AI.
arXiv Detail & Related papers (2021-06-07T05:50:02Z) - Optimal Mean Estimation without a Variance [103.26777953032537]
We study the problem of heavy-tailed mean estimation in settings where the variance of the data-generating distribution does not exist.
We design an estimator which attains the smallest possible confidence interval as a function of $n$, $d$, and $\delta$.
arXiv Detail & Related papers (2020-11-24T22:39:21Z) - Out-of-sample error estimate for robust M-estimators with convex penalty [5.33024001730262]
A generic out-of-sample error estimate is proposed for robust $M$-estimators regularized with a convex penalty.
General differentiable loss functions $\rho$ are allowed, provided that $\psi = \rho'$ is 1-Lipschitz.
arXiv Detail & Related papers (2020-08-26T21:50:41Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is $L_4$-$L_2$ hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number, and $\epsilon$ has bounded variance; and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is...
arXiv Detail & Related papers (2020-07-16T06:44:44Z) - Agnostic Learning of a Single Neuron with Gradient Descent [92.7662890047311]
We consider the problem of learning the best-fitting single neuron as measured by the expected square loss.
For the ReLU activation, our population risk guarantee is $O(\mathsf{OPT}^{1/2}) + \epsilon$.
arXiv Detail & Related papers (2020-05-29T07:20:35Z)
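As a generic illustration of the Huber-loss M-estimation mentioned in the poisoning-attack entry above (not that paper's actual procedure), the following sketch estimates a location parameter under contamination; the threshold `delta`, the one-dimensional setting, and the contamination pattern are all illustrative assumptions.

```python
# Generic Huber-loss M-estimation of a location parameter; an illustrative
# sketch only, not the procedure from the poisoning-attack paper.
import numpy as np
from scipy.optimize import minimize_scalar

def huber(r, delta=1.0):
    """Huber loss: quadratic for |r| <= delta, linear beyond."""
    a = np.abs(r)
    return np.where(a <= delta, 0.5 * r**2, delta * (a - 0.5 * delta))

rng = np.random.default_rng(1)
clean = rng.normal(0.0, 1.0, 95)      # true location is 0
poisoned = np.full(5, 50.0)           # q = 5 adversarially modified samples
samples = np.concatenate([clean, poisoned])

# M-estimate: the location m minimizing the summed Huber loss of the residuals.
result = minimize_scalar(lambda m: huber(samples - m).sum())
print(f"Huber M-estimate: {result.x:.3f}   poisoned mean: {samples.mean():.3f}")
```

Because the Huber loss grows only linearly in the tails, the few modified samples pull the M-estimate far less than they pull the raw mean.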