For interpolating kernel machines, minimizing the norm of the ERM
solution minimizes stability
- URL: http://arxiv.org/abs/2006.15522v2
- Date: Sun, 11 Oct 2020 22:58:31 GMT
- Title: For interpolating kernel machines, minimizing the norm of the ERM
solution minimizes stability
- Authors: Akshay Rangamani, Lorenzo Rosasco, Tomaso Poggio
- Abstract summary: We show that the interpolating solution with minimum norm minimizes a bound on $\mbox{CV}_{loo}$ stability.
Under the assumption of random kernel matrices, the corresponding test error should be expected to follow a double descent curve.
- Score: 20.775719987269003
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the average $\mbox{CV}_{loo}$ stability of kernel ridge-less
regression and derive corresponding risk bounds. We show that the interpolating
solution with minimum norm minimizes a bound on $\mbox{CV}_{loo}$ stability,
which in turn is controlled by the condition number of the empirical kernel
matrix. The latter can be characterized in the asymptotic regime where both the
dimension and cardinality of the data go to infinity. Under the assumption of
random kernel matrices, the corresponding test error should be expected to
follow a double descent curve.
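As a concrete illustration of these quantities, here is a minimal numpy sketch (not the authors' code; the Gaussian kernel and synthetic data are arbitrary choices): it forms the minimum-norm interpolating solution via the pseudoinverse and reports the condition number of the empirical kernel matrix that controls the stability bound.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_kernel(X, Z, sigma=1.0):
    # k(x, z) = exp(-||x - z||^2 / (2 * sigma^2))
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma**2))

n, d = 50, 30
X = rng.standard_normal((n, d)) / np.sqrt(d)   # synthetic inputs
y = rng.standard_normal(n)                     # synthetic labels

K = gaussian_kernel(X, X)
# Minimum-norm interpolant: c = K^+ y. The pseudoinverse picks the least-norm
# coefficient vector; it coincides with K^{-1} y when K is invertible.
c = np.linalg.pinv(K) @ y

print("max interpolation error:", np.abs(K @ c - y).max())
print("squared RKHS norm of solution:", c @ K @ c)
# Per the abstract, the CV_loo stability bound is controlled by cond(K).
print("cond(K):", np.linalg.cond(K))
```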
Related papers
- Inertial Quadratic Majorization Minimization with Application to Kernel Regularized Learning [1.0282274843007797]
We introduce the Quadratic Majorization Minimization with Extrapolation (QMME) framework and establish its sequential convergence properties.
To demonstrate practical advantages, we apply QMME to large-scale kernel regularized learning problems.
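As a rough illustration of the majorization-minimization-with-extrapolation template (a generic sketch under an L-smoothness assumption, not the QMME algorithm itself): the quadratic upper bound is minimized in closed form and iterates are extrapolated between steps.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 20))
b = rng.standard_normal(100)

f = lambda x: 0.5 * np.sum((A @ x - b) ** 2)   # smooth loss
grad = lambda x: A.T @ (A @ x - b)
L = np.linalg.norm(A, 2) ** 2                   # smoothness constant of f

x_prev = x = np.zeros(20)
for k in range(300):
    z = x + (k / (k + 3)) * (x - x_prev)        # extrapolation point
    # Closed-form minimizer of the quadratic majorizer
    # f(z) + <grad f(z), x - z> + (L/2)||x - z||^2:
    x_prev, x = x, z - grad(z) / L
print("final loss:", f(x))
```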
arXiv Detail & Related papers (2025-07-06T05:17:28Z)
- Black-Box Uniform Stability for Non-Euclidean Empirical Risk Minimization [15.334392442475115]
We study first-order algorithms that are uniformly stable for empirical risk minimization problems.
We propose a black-box reduction method that turns an optimization algorithm for smooth convex losses into a uniformly stable learning algorithm.
arXiv Detail & Related papers (2024-12-20T14:50:47Z)
- Toward Efficient Kernel-Based Solvers for Nonlinear PDEs [19.975293084297014]
This paper introduces a novel kernel learning framework toward efficiently solving nonlinear partial differential equations (PDEs).
In contrast to the state-of-the-art kernel solver that embeds differential operators within kernels, our approach eliminates these operators from the kernel.
We model the solution using a standard kernel form and differentiate the interpolant to compute the derivatives.
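A minimal one-dimensional sketch of this idea, under assumed choices (Gaussian kernel, solution values given at collocation points): fit a standard kernel interpolant, then differentiate it analytically rather than embedding the differential operator in the kernel.

```python
import numpy as np

sigma = 0.3
k  = lambda x, z: np.exp(-(x - z) ** 2 / (2 * sigma**2))
dk = lambda x, z: -(x - z) / sigma**2 * k(x, z)     # d/dx of the kernel

xs = np.linspace(0, np.pi, 30)                      # collocation points
u  = np.sin(xs)                                     # sampled "solution" values

K = k(xs[:, None], xs[None, :])
c = np.linalg.solve(K + 1e-10 * np.eye(len(xs)), u) # interpolation coefficients

xt = np.linspace(0.1, np.pi - 0.1, 7)
du = dk(xt[:, None], xs[None, :]) @ c               # derivative of the interpolant
print("max error vs cos(x):", np.abs(du - np.cos(xt)).max())
```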
arXiv Detail & Related papers (2024-10-15T01:00:43Z)
- Trust-Region Sequential Quadratic Programming for Stochastic Optimization with Random Models [57.52124921268249]
We propose a Trust-Region Sequential Quadratic Programming method to find both first- and second-order stationary points.
To converge to first-order stationary points, our method computes a gradient step in each iteration, defined by minimizing a quadratic approximation of the objective subject to a trust-region constraint.
To converge to second-order stationary points, our method additionally computes an eigen step to explore the negative curvature of the reduced Hessian matrix.
arXiv Detail & Related papers (2024-09-24T04:39:47Z)
- Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z)
- Breaking the Heavy-Tailed Noise Barrier in Stochastic Optimization Problems [56.86067111855056]
We consider clipped optimization problems with heavy-tailed noise with structured density.
We show that it is possible to get faster rates of convergence than $\mathcal{O}(K^{-(\alpha-1)/\alpha})$ when the gradients have finite moments of order $\alpha$.
We prove that the resulting estimates have negligible bias and controllable variance.
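A hedged sketch of the basic clipping mechanism referenced here (a generic clipped-SGD loop on a toy quadratic, not the paper's method): clipping the stochastic gradient at a fixed level keeps the update's variance controllable even when the raw noise has infinite variance.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.ones(10) * 5.0          # iterate; optimum of f(x) = ||x||^2 / 2 is 0
lam, step = 10.0, 0.05

for k in range(2000):
    # Stochastic gradient with heavy-tailed noise: Student-t with df=1.5
    # has finite mean but infinite variance.
    g = x + rng.standard_t(df=1.5, size=x.shape)
    norm = np.linalg.norm(g)
    if norm > lam:             # clip: g <- g * min(1, lam / ||g||)
        g = g * (lam / norm)
    x -= step * g
print("distance to optimum:", np.linalg.norm(x))
```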
arXiv Detail & Related papers (2023-11-07T17:39:17Z)
- Fully Stochastic Trust-Region Sequential Quadratic Programming for Equality-Constrained Optimization Problems [62.83783246648714]
We propose a trust-region stochastic sequential quadratic programming algorithm (TR-StoSQP) to solve nonlinear optimization problems with stochastic objectives and deterministic equality constraints.
The algorithm adaptively selects the trust-region radius and, compared to the existing line-search StoSQP schemes, allows us to utilize indefinite Hessian matrices.
arXiv Detail & Related papers (2022-11-29T05:52:17Z)
- Stability and Generalization for Markov Chain Stochastic Gradient Methods [49.981789906200035]
We provide a comprehensive generalization analysis of Markov chain stochastic gradient methods (MC-SGMs) for both minimization and minimax problems.
We establish the optimal excess population risk bounds for both smooth and non-smooth cases.
We develop the first nearly optimal convergence rates for convex-concave problems both in expectation and with high probability.
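A toy illustration of the MC-SGM setting (assuming a lazy random walk over example indices in place of i.i.d. sampling; this is not the paper's analysis, just the sampling scheme it studies):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 10))
b = A @ rng.standard_normal(10) + 0.1 * rng.standard_normal(200)

x, i, step = np.zeros(10), 0, 0.01
for t in range(5000):
    # Lazy random walk on {0, ..., n-1}: a simple ergodic Markov chain
    # over data indices, instead of drawing i uniformly at random.
    i = (i + rng.choice([-1, 0, 1])) % len(b)
    g = (A[i] @ x - b[i]) * A[i]    # stochastic gradient at example i
    x -= step * g
print("relative residual:", np.linalg.norm(A @ x - b) / np.linalg.norm(b))
```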
arXiv Detail & Related papers (2022-09-16T15:42:51Z)
- A Local Convergence Theory for the Stochastic Gradient Descent Method in Non-Convex Optimization With Non-isolated Local Minima [0.0]
Non-isolated minima present a unique challenge that has remained under-explored.
In this paper, we study the local convergence of the stochastic gradient descent method to non-isolated global minima.
arXiv Detail & Related papers (2022-03-21T13:33:37Z)
- Optimal policy evaluation using kernel-based temporal difference methods [78.83926562536791]
We use reproducing kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process.
We derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator.
We prove minimax lower bounds over sub-classes of MRPs.
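A small sketch in the spirit of kernel-based temporal difference methods (linear TD(0) with kernel features over a fixed dictionary of states, an assumed simplification rather than the paper's estimator): the value estimate lives in the span of k(., s_j).

```python
import numpy as np

rng = np.random.default_rng(0)
gamma, alpha, sigma = 0.9, 0.05, 0.5
dict_states = np.linspace(0, 1, 15)                 # dictionary spanning the RKHS subspace
feat = lambda s: np.exp(-(s - dict_states) ** 2 / (2 * sigma**2))

c = np.zeros_like(dict_states)                      # V(s) = <c, feat(s)>
s = 0.5
for t in range(20000):
    s_next = np.clip(s + 0.1 * rng.standard_normal(), 0.0, 1.0)  # toy MRP dynamics
    r = s                                            # state-dependent reward
    delta = r + gamma * feat(s_next) @ c - feat(s) @ c  # TD error
    c += alpha * delta * feat(s)
    s = s_next
print("V(0.1), V(0.9):", feat(0.1) @ c, feat(0.9) @ c)
```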
arXiv Detail & Related papers (2021-09-24T14:48:20Z)
- Gaussian Process-based Min-norm Stabilizing Controller for Control-Affine Systems with Uncertain Input Effects and Dynamics [90.81186513537777]
We propose a novel compound kernel that captures the control-affine nature of the problem.
We show that the resulting optimization problem is convex, and we call it the Gaussian Process-based Control Lyapunov Function Second-Order Cone Program (GP-CLF-SOCP).
arXiv Detail & Related papers (2020-11-14T01:27:32Z)
- Towards a Unified Quadrature Framework for Large-Scale Kernel Machines [35.32894170512829]
We develop a quadrature framework for large-scale kernel machines via a numerical integration representation.
We leverage fully symmetric interpolatory rules to efficiently compute quadrature nodes and associated weights for kernel approximation.
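A one-dimensional sketch of the quadrature view of kernel approximation (a Gauss-Hermite rule for the Gaussian kernel; the paper's fully symmetric interpolatory rules target higher dimensions): since k(x, y) = E_{w~N(0,1)}[cos(w(x - y))], quadrature nodes and weights yield explicit trigonometric features.

```python
import numpy as np

nodes, weights = np.polynomial.hermite.hermgauss(10)
w = np.sqrt(2.0) * nodes            # quadrature nodes rescaled for N(0, 1)
p = weights / np.sqrt(np.pi)        # normalized quadrature weights

def features(x):
    # Feature map phi(x) with <phi(x), phi(y)> ~= exp(-(x - y)^2 / 2).
    return np.concatenate([np.sqrt(p) * np.cos(w * x),
                           np.sqrt(p) * np.sin(w * x)])

x, y = 0.3, 1.1
approx = features(x) @ features(y)
exact = np.exp(-((x - y) ** 2) / 2)
print(approx, exact)                # the two should agree to high accuracy
```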
arXiv Detail & Related papers (2020-11-03T12:48:07Z)
- Learning Stabilizing Controllers for Unstable Linear Quadratic Regulators from a Single Trajectory [85.29718245299341]
We study linear controllers under quadratic cost models, also known as linear quadratic regulators (LQR).
We present two different semi-definite programs (SDPs), each of which results in a controller that stabilizes all systems within an ellipsoidal uncertainty set.
We propose an efficient data-dependent algorithm -- eXploration -- that with high probability quickly identifies a stabilizing controller.
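For flavor, a textbook Lyapunov-LMI version of the SDP route to a stabilizing controller (a known continuous-time system solved via cvxpy, not the paper's ellipsoidal-uncertainty programs; the system matrices are a hypothetical example): find K = Y Q^{-1} such that A + BK is stable.

```python
import cvxpy as cp
import numpy as np

A = np.array([[0.0, 1.0], [2.0, -1.0]])   # unstable open-loop system (hypothetical)
B = np.array([[0.0], [1.0]])

Q = cp.Variable((2, 2), symmetric=True)
Y = cp.Variable((1, 2))
eps = 1e-2
constraints = [Q >> eps * np.eye(2),
               # Lyapunov LMI in the variables (Q, Y), with K = Y Q^{-1}:
               A @ Q + Q @ A.T + B @ Y + Y.T @ B.T << -eps * np.eye(2)]
cp.Problem(cp.Minimize(0), constraints).solve()

K = Y.value @ np.linalg.inv(Q.value)
print("closed-loop eigenvalues:", np.linalg.eigvals(A + B @ K))  # negative real parts
```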
arXiv Detail & Related papers (2020-06-19T08:58:57Z)
- Kernel Selection for Modal Linear Regression: Optimal Kernel and IRLS Algorithm [8.571896191090744]
We first show that a Biweight kernel is optimal in the sense of minimizing the mean squared error of the resulting MLR parameter estimate.
Secondly, we provide a kernel class for which the iteratively reweighted least-squares (IRLS) algorithm is guaranteed to converge.
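A hedged sketch of such an IRLS scheme (biweight-profile weights and an arbitrary bandwidth; not necessarily the paper's exact algorithm): each iteration solves a weighted least-squares problem in which examples far from the current fit get weight zero, steering the fit toward the conditional mode.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300
X = np.column_stack([np.ones(n), rng.uniform(-1, 1, n)])
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + 0.1 * rng.standard_normal(n)
y[:60] += 3.0                                   # minority component away from the mode

h = 1.0                                         # kernel bandwidth (arbitrary choice)
beta = np.linalg.lstsq(X, y, rcond=None)[0]     # ordinary least-squares start
for _ in range(50):
    u = (y - X @ beta) / h
    w = np.clip(1.0 - u**2, 0.0, None) ** 2     # biweight-profile weights; zero for |u| >= 1
    Xw = w[:, None] * X
    beta = np.linalg.solve(X.T @ Xw, X.T @ (w * y))  # weighted least-squares update
print("IRLS estimate:", beta, "true:", beta_true)
```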
arXiv Detail & Related papers (2020-01-30T03:57:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.