Related papers: Weighted Averaged Stochastic Gradient Descent: Asymptotic Normality and Optimality

Related papers

Stochastic interior-point methods for smooth conic optimization with applications [3.294420397461204]
We introduce an interior-point method for general conic optimization, along with four novel SIPM variants. Under underdeveloped assumptions, we establish the global convergence rates of our proposed SIPMs. Experiments on robust linear regression, multi-task relationship learning, and clustering data streams demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-12-17T15:06:44Z)
Asymptotically Optimal Change Detection for Unnormalized Pre- and Post-Change Distributions [65.38208224389027]
This paper addresses the problem of detecting changes when only unnormalized pre- and post-change distributions are accessible. Our approach is based on the estimation of the Cumulative Sum statistics, which is known to produce optimal performance.
arXiv Detail & Related papers (2024-10-18T17:13:29Z)
A Stochastic Approach to Bi-Level Optimization for Hyperparameter Optimization and Meta Learning [74.80956524812714]
We tackle the general differentiable meta learning problem that is ubiquitous in modern deep learning. These problems are often formalized as Bi-Level optimizations (BLO) We introduce a novel perspective by turning a given BLO problem into a ii optimization, where the inner loss function becomes a smooth distribution, and the outer loss becomes an expected loss over the inner distribution.
arXiv Detail & Related papers (2024-10-14T12:10:06Z)
Differentially Private Optimization with Sparse Gradients [60.853074897282625]
We study differentially private (DP) optimization problems under sparsity of individual gradients. Building on this, we obtain pure- and approximate-DP algorithms with almost optimal rates for convex optimization with sparse gradients.
arXiv Detail & Related papers (2024-04-16T20:01:10Z)
Federated Conditional Stochastic Optimization [110.513884892319]
Conditional optimization has found in a wide range of machine learning tasks, such as in-variant learning tasks, AUPRC, andAML. This paper proposes algorithms for distributed federated learning.
arXiv Detail & Related papers (2023-10-04T01:47:37Z)
Adaptive Linear Estimating Equations [5.985204759362746]
In this paper, we propose a general method for constructing debiased estimator. It makes use of the idea of adaptive linear estimating equations, and we establish theoretical guarantees of normality. A salient feature of our estimator is that in the context of multi-armed bandits, our estimator retains the non-asymptotic performance.
arXiv Detail & Related papers (2023-07-14T12:55:47Z)
Accelerated stochastic approximation with state-dependent noise [7.4648480208501455]
We consider a class of smooth convex optimization problems under general assumptions on the quadratic noise in the gradient observation. Such problems naturally arise in a variety of applications, in particular, in the well-known generalized linear regression problem in statistics. We show that both SAGD and SGE, under appropriate conditions, achieve the optimal convergence rate.
arXiv Detail & Related papers (2023-07-04T06:06:10Z)
Online Statistical Inference for Contextual Bandits via Stochastic Gradient Descent [10.108468796986074]
We study the online statistical inference of model parameters in a contextual bandit framework of decision-making. We propose a general framework for online and adaptive data collection environment that can update decision rules via weighted gradient descent.
arXiv Detail & Related papers (2022-12-30T18:57:08Z)
Optimization of Annealed Importance Sampling Hyperparameters [77.34726150561087]
Annealed Importance Sampling (AIS) is a popular algorithm used to estimates the intractable marginal likelihood of deep generative models. We present a parameteric AIS process with flexible intermediary distributions and optimize the bridging distributions to use fewer number of steps for sampling. We assess the performance of our optimized AIS for marginal likelihood estimation of deep generative models and compare it to other estimators.
arXiv Detail & Related papers (2022-09-27T07:58:25Z)
Tuning Stochastic Gradient Algorithms for Statistical Inference via Large-Sample Asymptotics [18.93569692490218]
tuning of gradient algorithms often based on trial-and-error rather than generalizable theory. We show that averaging with a large fixed step size is robust to the choice of tuning parameters. We lay the foundation for a systematic analysis of other gradient Monte Carlo algorithms.
arXiv Detail & Related papers (2022-07-25T17:58:09Z)
Stochastic Approximation with Decision-Dependent Distributions: Asymptotic Normality and Optimality [8.771678221101368]
We analyze an approximation for decision-dependent problems, wherein the data distribution used by the algorithm evolves along the iterate sequence. We show that under mild assumptions, the deviation between the iterate of the algorithm and its solution isally normal. We also show that the performance of the algorithm with averaging is locally minimax optimal.
arXiv Detail & Related papers (2022-07-09T01:44:17Z)
Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
inductive biases are central in preventing overfitting empirically. This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression. We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
arXiv Detail & Related papers (2021-03-23T17:15:53Z)
Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation [99.92568326314667]
We propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation. Our algorithm builds on the conditional normalized maximum likelihood (CNML) coding scheme, which has minimax optimal properties according to the minimum description length principle. We demonstrate that ACNML compares favorably to a number of prior techniques for uncertainty estimation in terms of calibration on out-of-distribution inputs.
arXiv Detail & Related papers (2020-11-05T08:04:34Z)
ROOT-SGD: Sharp Nonasymptotics and Near-Optimal Asymptotics in a Single Algorithm [71.13558000599839]
We study the problem of solving strongly convex and smooth unconstrained optimization problems using first-order algorithms. We devise a novel, referred to as Recursive One-Over-T SGD, based on an easily implementable, averaging of past gradients. We prove that it simultaneously achieves state-of-the-art performance in both a finite-sample, nonasymptotic sense and an sense.
arXiv Detail & Related papers (2020-08-28T14:46:56Z)
Statistical Inference for Model Parameters in Stochastic Gradient Descent [45.29532403359099]
gradient descent coefficients (SGD) has been widely used in statistical estimation for large-scale data due to its computational and memory efficiency. We investigate the problem of statistical inference of true model parameters based on SGD when the population loss function is strongly convex and satisfies certain conditions.
arXiv Detail & Related papers (2016-10-27T07:04:21Z)

This list is automatically generated from the titles and abstracts of the papers in this site.