Private optimization in the interpolation regime: faster rates and
hardness results
- URL: http://arxiv.org/abs/2210.17070v1
- Date: Mon, 31 Oct 2022 05:18:27 GMT
- Title: Private optimization in the interpolation regime: faster rates and
hardness results
- Authors: Hilal Asi, Karan Chadha, Gary Cheng, John Duchi
- Abstract summary: In non-private convex optimization, gradient methods converge much faster on interpolation problems -- problems where there exists a solution that simultaneously minimizes all of the sample losses -- than on non-interpolating ones.
We show (near) exponential improvements in the private sample complexity.
- Score: 9.405458160620533
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In non-private stochastic convex optimization, stochastic gradient methods
converge much faster on interpolation problems -- problems where there exists a
solution that simultaneously minimizes all of the sample losses -- than on
non-interpolating ones; we show that generally similar improvements are
impossible in the private setting. However, when the functions exhibit
quadratic growth around the optimum, we show (near) exponential improvements in
the private sample complexity. In particular, we propose an adaptive algorithm
that improves the sample complexity to achieve expected error $\alpha$ from
$\frac{d}{\varepsilon \sqrt{\alpha}}$ to $\frac{1}{\alpha^\rho} +
\frac{d}{\varepsilon} \log\left(\frac{1}{\alpha}\right)$ for any fixed $\rho
>0$, while retaining the standard minimax-optimal sample complexity for
non-interpolation problems. We prove a lower bound that shows the
dimension-dependent term is tight. Furthermore, we provide a superefficiency
result which demonstrates the necessity of the polynomial term for adaptive
algorithms: any algorithm that has a polylogarithmic sample complexity for
interpolation problems cannot achieve the minimax-optimal rates for the family
of non-interpolation problems.
Related papers
- Accelerated Stochastic Min-Max Optimization Based on Bias-corrected Momentum [30.01198677588252]
First-order algorithms require at least $\mathcal{O}(\varepsilon^{-4})$ complexity to find an $\varepsilon$-stationary point.
We introduce novel bias-corrected momentum algorithms with improved efficiency.
The effectiveness of the method is validated through robust logistic regression using real-world datasets.
arXiv Detail & Related papers (2024-06-18T20:14:52Z) - An Accelerated Gradient Method for Convex Smooth Simple Bilevel Optimization [16.709026203727007]
We present a novel bilevel optimization method that locally approximates the solution set of the lower-level problem.
We measure the performance of our method in terms of suboptimality and infeasibility errors.
arXiv Detail & Related papers (2024-02-12T22:34:53Z) - Accelerated Gradient Algorithms with Adaptive Subspace Search for
Instance-Faster Optimization [6.896308995955336]
Gradient-based minimax optimal algorithms have promoted the development of continuous optimization and machine learning.
In this paper, we open up a new way to design and analyze gradient-based algorithms with direct applications in machine learning.
arXiv Detail & Related papers (2023-12-06T01:16:10Z) - Optimal Guarantees for Algorithmic Reproducibility and Gradient
Complexity in Convex Optimization [55.115992622028685]
Previous work suggests that first-order methods would need to trade off convergence rate (gradient complexity) for better reproducibility.
We demonstrate that both optimal complexity and near-optimal convergence guarantees can be achieved for smooth convex minimization and smooth convex-concave minimax problems.
arXiv Detail & Related papers (2023-10-26T19:56:52Z) - Accelerated First-Order Optimization under Nonlinear Constraints [73.2273449996098]
We exploit connections between first-order algorithms for constrained optimization and non-smooth dynamical systems to design a new class of accelerated first-order algorithms.
An important property of these algorithms is that constraints are expressed in terms of velocities instead of positions.
arXiv Detail & Related papers (2023-02-01T08:50:48Z) - Optimal Algorithms for Stochastic Complementary Composite Minimization [55.26935605535377]
Inspired by regularization techniques in statistics and machine learning, we study complementary composite minimization.
We provide novel excess risk bounds, both in expectation and with high probability.
Our algorithms are nearly optimal, which we prove via novel lower complexity bounds for this class of problems.
arXiv Detail & Related papers (2022-11-03T12:40:24Z) - Optimal Algorithms for Stochastic Multi-Level Compositional Optimization [46.77664277596764]
We solve the problem of multi-level compositional optimization, where the objective function is a composition of multiple smooth but possibly non-convex functions.
Our adaptive variant of the Stochastic Multi-level Variance Reduction method (SMVR), which makes use of adaptive learning rates, achieves the same optimal complexities but converges faster in practice.
arXiv Detail & Related papers (2022-02-15T16:02:32Z) - Distributed Stochastic Consensus Optimization with Momentum for
Nonconvex Nonsmooth Problems [45.88640334610308]
This paper presents a new distributed optimization algorithm for non-smooth problems.
We show that the proposed algorithm achieves a provable bound on the communication complexity.
Experiments are presented to illustrate the effectiveness of the proposed algorithms.
arXiv Detail & Related papers (2020-11-10T13:12:21Z) - SGD with shuffling: optimal rates without component convexity and large
epoch requirements [60.65928290219793]
We consider the RandomShuffle (shuffle at the beginning of each epoch) and SingleShuffle (shuffle only once) schemes; see the code sketch at the end of this page.
We establish minimax optimal convergence rates of these algorithms up to poly-log factor gaps.
We further sharpen the convergence results for RandomShuffle by removing drawbacks common to all prior work.
arXiv Detail & Related papers (2020-06-12T05:00:44Z) - Convergence of adaptive algorithms for weakly convex constrained
optimization [59.36386973876765]
We prove the $\widetilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope.
Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly unbounded optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z) - Private Stochastic Convex Optimization: Optimal Rates in Linear Time [74.47681868973598]
We study the problem of minimizing the population loss given i.i.d. samples from a distribution over convex loss functions.
A recent work of Bassily et al. has established the optimal bound on the excess population loss achievable given $n$ samples.
We describe two new techniques for deriving convex optimization algorithms both achieving the optimal bound on excess loss and using $O(\min\{n, n^2/d\})$ gradient computations.
arXiv Detail & Related papers (2020-05-10T19:52:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
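Since the two shuffling schemes in the SGD-with-shuffling entry above are defined only in parentheses, here is a minimal runnable sketch contrasting RandomShuffle (reshuffle every epoch) and SingleShuffle (shuffle once). The function name, signature, and hyperparameters are illustrative and not taken from any of the papers listed.

```python
import numpy as np

def shuffled_sgd(grads, x0, lr=0.1, n_epochs=10, scheme="RandomShuffle", seed=0):
    """Run SGD over the component gradients in `grads` with the given shuffling scheme.

    scheme="RandomShuffle": reshuffle the component order at the start of every epoch.
    scheme="SingleShuffle": shuffle once before the first epoch and reuse that order.
    (Name, signature, and defaults are illustrative, not from the paper.)
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    order = rng.permutation(len(grads))  # SingleShuffle: this order is reused for all epochs
    for _ in range(n_epochs):
        if scheme == "RandomShuffle":
            order = rng.permutation(len(grads))  # reshuffle at the beginning of each epoch
        for i in order:  # one full pass over the components, without replacement
            x = x - lr * grads[i](x)
    return x

# Toy usage: minimize the average of simple quadratics f_i(x) = 0.5 * (x - a_i)^2.
a = np.linspace(-1.0, 1.0, 5)
grads = [lambda x, ai=ai: x - ai for ai in a]
print(shuffled_sgd(grads, x0=np.zeros(1), scheme="SingleShuffle"))
```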