Adaptive and Oblivious Randomized Subspace Methods for High-Dimensional
Optimization: Sharp Analysis and Lower Bounds
- URL: http://arxiv.org/abs/2012.07054v1
- Date: Sun, 13 Dec 2020 13:02:31 GMT
- Title: Adaptive and Oblivious Randomized Subspace Methods for High-Dimensional
Optimization: Sharp Analysis and Lower Bounds
- Authors: Jonathan Lacotte, Mert Pilanci
- Abstract summary: A suitable adaptive subspace can be generated by sampling a correlated random matrix whose second-order statistics mirror the input data.
We show that the relative error of the randomized approximations can be tightly characterized in terms of the spectrum of the data matrix.
Experimental results show that the proposed approach enables significant speed ups in a wide variety of machine learning and optimization problems.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose novel randomized optimization methods for high-dimensional convex
problems based on restrictions of variables to random subspaces. We consider
oblivious and data-adaptive subspaces and study their approximation properties
via convex duality and Fenchel conjugates. A suitable adaptive subspace can be
generated by sampling a correlated random matrix whose second-order statistics
mirror the input data. We illustrate that the adaptive strategy can
significantly outperform the standard oblivious sampling method, which is
widely used in the recent literature. We show that the relative error of the
randomized approximations can be tightly characterized in terms of the spectrum
of the data matrix and the Gaussian width of the dual tangent cone at the optimum. We
develop lower bounds for both optimization and statistical error measures based
on concentration of measure and Fano's inequality. We then present the
consequences of our theory with data matrices of varying spectral decay
profiles. Experimental results show that the proposed approach enables
significant speedups in a wide variety of machine learning and optimization
problems including logistic regression, kernel classification with random
convolution layers and shallow neural networks with rectified linear units.
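
To make the two sampling strategies concrete, here is a minimal NumPy sketch (our illustration, not code from the paper): it restricts a ridge-regularized least-squares problem to a random m-dimensional subspace, once with an oblivious Gaussian matrix and once with an adaptive matrix drawn as S = Aᵀ G, whose second-order statistics E[S Sᵀ] ∝ AᵀA mirror the input data. The function name subspace_solve and all dimensions are illustrative assumptions.

```python
import numpy as np

def subspace_solve(A, b, S, lam=1e-3):
    """Solve min_alpha ||A S alpha - b||^2 + lam ||S alpha||^2; return x = S alpha."""
    AS = A @ S
    alpha = np.linalg.solve(AS.T @ AS + lam * (S.T @ S), AS.T @ b)
    return S @ alpha

rng = np.random.default_rng(0)
n, d, m = 500, 2000, 50                                # subspace dimension m << d
A = rng.standard_normal((n, d)) * (np.arange(1, d + 1) ** -1.0)  # decaying spectrum
b = A @ (rng.standard_normal(d) / np.sqrt(d)) + 0.01 * rng.standard_normal(n)

# Oblivious subspace: iid Gaussian directions, drawn independently of the data.
S_obl = rng.standard_normal((d, m)) / np.sqrt(m)

# Adaptive subspace: S = A^T G, so E[S S^T] is proportional to A^T A --
# its second-order statistics mirror the input data.
S_ada = A.T @ rng.standard_normal((n, m)) / np.sqrt(m)

for name, S in [("oblivious", S_obl), ("adaptive", S_ada)]:
    x = subspace_solve(A, b, S)
    print(f"{name:9s} relative residual: "
          f"{np.linalg.norm(A @ x - b) / np.linalg.norm(b):.3f}")
```

On matrices with fast spectral decay, the adaptive subspace concentrates on the dominant directions of A, so one would expect a noticeably smaller residual at the same subspace dimension, consistent with the abstract's claim.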
Related papers
- A Bayesian Approach Toward Robust Multidimensional Ellipsoid-Specific Fitting
This work presents a novel and effective method for fitting multidimensional ellipsoids to scattered data contaminated by noise and outliers.
We incorporate a uniform prior distribution to constrain the search for primitive parameters within an ellipsoidal domain.
We apply it to a wide range of practical applications such as microscopy cell counting, 3D reconstruction, geometric shape approximation, and magnetometer calibration tasks.
arXiv Detail & Related papers (2024-07-27T14:31:51Z)
- Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z)
- Multistage Stochastic Optimization via Kernels
We develop a non-parametric, data-driven, tractable approach for solving multistage optimization problems.
We show that the proposed method produces decision rules with near-optimal average performance.
arXiv Detail & Related papers (2023-03-11T23:19:32Z)
- Stochastic Mirror Descent for Large-Scale Sparse Recovery
We discuss an application of stochastic approximation to statistical estimation of high-dimensional sparse parameters.
We show that the proposed algorithm attains the optimal convergence of the estimation error under weak assumptions on the regressor distribution.
arXiv Detail & Related papers (2022-10-23T23:23:23Z)
- Distributed Sketching for Randomized Optimization: Exact Characterization, Concentration and Lower Bounds
We consider distributed optimization methods for problems where forming the Hessian is computationally challenging.
We leverage randomized sketches to reduce the problem dimensions, preserve privacy, and improve straggler resilience in asynchronous distributed systems.
arXiv Detail & Related papers (2022-03-18T05:49:13Z)
- A similarity-based Bayesian mixture-of-experts model
We present a new non-parametric mixture-of-experts model for multivariate regression problems.
Using a conditionally specified model, predictions for out-of-sample inputs are based on similarities to each observed data point.
Posterior inference is performed on the parameters of the mixture as well as the distance metric.
arXiv Detail & Related papers (2020-12-03T18:08:30Z)
- Random extrapolation for primal-dual coordinate descent
We introduce a randomly extrapolated primal-dual coordinate descent method that adapts to sparsity of the data matrix and the favorable structures of the objective function.
We show almost sure convergence of the sequence and optimal sublinear convergence rates for the primal-dual gap and objective values, in the general convex-concave case.
arXiv Detail & Related papers (2020-07-13T17:39:35Z)
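
For background on the method family in the entry above, the following is a hedged baseline sketch of deterministic, full-coordinate PDHG applied to the lasso; it omits the paper's random extrapolation and coordinate sampling, and the name pdhg_lasso, the step sizes, and the problem sizes are our assumptions.

```python
import numpy as np

def pdhg_lasso(A, b, lam, iters=2000):
    """Deterministic PDHG for min_x 0.5*||Ax - b||^2 + lam*||x||_1."""
    n, d = A.shape
    L = np.linalg.norm(A, 2)             # operator norm; need sigma*tau*L^2 <= 1
    sigma = tau = 0.99 / L
    x = np.zeros(d); x_bar = x.copy(); y = np.zeros(n)
    for _ in range(iters):
        # Dual step: prox of sigma*f* with f(z) = 0.5*||z - b||^2.
        y = (y + sigma * (A @ x_bar - b)) / (1.0 + sigma)
        # Primal step: soft-thresholding = prox of tau*lam*||.||_1.
        x_new = x - tau * (A.T @ y)
        x_new = np.sign(x_new) * np.maximum(np.abs(x_new) - tau * lam, 0.0)
        x_bar = 2 * x_new - x              # over-relaxation step of PDHG
        x = x_new
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 500))
x_true = np.zeros(500); x_true[:10] = 1.0
b = A @ x_true + 0.01 * rng.standard_normal(200)
x = pdhg_lasso(A, b, lam=0.1 * np.max(np.abs(A.T @ b)))
print("nonzeros:", np.sum(np.abs(x) > 1e-6),
      "residual:", np.linalg.norm(A @ x - b))
```

The paper's contribution is a randomly extrapolated, coordinate-wise variant of this scheme that adapts to the sparsity of A; the sketch above only fixes the deterministic baseline it builds on.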
- To Each Optimizer a Norm, To Each Norm its Generalization
We study the implicit regularization of optimization methods for linear models interpolating the training data in the under-parametrized and over-parametrized regimes.
We argue that analyzing convergence to the standard maximum l2-margin is arbitrary and show that minimizing the norm induced by the data results in better generalization.
arXiv Detail & Related papers (2020-06-11T21:07:38Z)
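
As a toy illustration of the norm-dependence discussed in the entry above (our sketch, not the paper's construction): in the over-parametrized regime infinitely many interpolators fit the data, and different optimizers implicitly select different norms. The code compares the minimum-l2-norm interpolator, which gradient descent from zero converges to, against a minimum-l1-norm interpolator computed by linear programming.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, d = 20, 100                                # over-parametrized: d > n
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

# Minimum-l2-norm interpolator: the pseudoinverse solution.
x_l2 = np.linalg.pinv(A) @ b

# Minimum-l1-norm interpolator: min ||x||_1 s.t. Ax = b, via the standard
# split x = u - v with u, v >= 0 and objective 1^T (u + v).
c = np.ones(2 * d)
res = linprog(c, A_eq=np.hstack([A, -A]), b_eq=b, bounds=(0, None))
x_l1 = res.x[:d] - res.x[d:]

for name, x in [("min l2", x_l2), ("min l1", x_l1)]:
    print(f"{name}: ||Ax-b|| = {np.linalg.norm(A @ x - b):.1e}, "
          f"l1 = {np.abs(x).sum():.2f}, l2 = {np.linalg.norm(x):.2f}")
```

Both solutions interpolate exactly, yet they are very different vectors; which one generalizes better depends on the data, which is the point the entry makes.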
- Effective Dimension Adaptive Sketching Methods for Faster Regularized Least-Squares Optimization
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching.
We consider two of the most popular random embeddings, namely Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT).
arXiv Detail & Related papers (2020-06-10T15:00:09Z)
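
Both embeddings named in the entry above are standard; the sketch below (ours, with illustrative dimensions) builds a Gaussian embedding and an SRHT, S = sqrt(n/m) P H D, using a textbook fast Walsh-Hadamard transform and assuming the row count is a power of two (pad otherwise).

```python
import numpy as np

def fwht(x):
    """In-place fast Walsh-Hadamard transform along axis 0 (length must be 2^k)."""
    h, n = 1, x.shape[0]
    while h < n:
        for i in range(0, n, 2 * h):
            a = x[i:i + h].copy()
            b = x[i + h:i + 2 * h]
            x[i:i + h] = a + b
            x[i + h:i + 2 * h] = a - b
        h *= 2
    return x

def gaussian_embed(A, m, rng):
    return rng.standard_normal((m, A.shape[0])) @ A / np.sqrt(m)

def srht_embed(A, m, rng):
    n = A.shape[0]                           # assumed a power of two
    D = rng.choice([-1.0, 1.0], size=n)      # random sign flips
    HDA = fwht(D[:, None] * A) / np.sqrt(n)  # normalized Hadamard transform
    rows = rng.choice(n, size=m, replace=False)
    return np.sqrt(n / m) * HDA[rows]        # uniform row subsampling

rng = np.random.default_rng(0)
A = rng.standard_normal((1024, 20))
for name, SA in [("gaussian", gaussian_embed(A, 128, rng)),
                 ("srht", srht_embed(A, 128, rng))]:
    # Embedding quality: singular values of SA should be close to those of A.
    gap = np.linalg.svd(SA, compute_uv=False) - np.linalg.svd(A, compute_uv=False)
    print(name, np.linalg.norm(gap, np.inf))
```

The SRHT applies in O(n log n) per column versus O(mn) for the Gaussian embedding, which is why it is the practical choice at scale.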
- Asymptotic Analysis of an Ensemble of Randomly Projected Linear Discriminants
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
- Distributed Averaging Methods for Randomized Second Order Optimization
We consider distributed optimization problems where forming the Hessian is computationally challenging and communication is a bottleneck.
We develop unbiased parameter averaging methods for randomized second order optimization that employ sampling and sketching of the Hessian.
We also extend the framework of second order averaging methods to introduce an unbiased distributed optimization framework for heterogeneous computing systems.
arXiv Detail & Related papers (2020-02-16T09:01:18Z)
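
To illustrate the averaging idea in the last entry (a hedged toy, not the authors' algorithm): each of q simulated workers solves an independently Gaussian-sketched least-squares problem and the driver averages the solutions. For Gaussian sketches the per-worker sketch-and-solve estimate is unbiased for the full least-squares solution, so the averaged error should shrink as q grows; worker_solve and all sizes are our assumptions.

```python
import numpy as np

def worker_solve(A, b, m, seed):
    """One worker: solve the Gaussian-sketched least-squares problem."""
    rng = np.random.default_rng(seed)
    S = rng.standard_normal((m, A.shape[0])) / np.sqrt(m)
    return np.linalg.lstsq(S @ A, S @ b, rcond=None)[0]

rng = np.random.default_rng(42)
n, d, m, q = 4096, 50, 200, 32              # q workers, sketch size m
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)
x_full = np.linalg.lstsq(A, b, rcond=None)[0]

x_avg = np.mean([worker_solve(A, b, m, seed) for seed in range(q)], axis=0)
for label, x in [("single worker", worker_solve(A, b, m, 0)),
                 ("average of q ", x_avg)]:
    err = np.linalg.norm(x - x_full) / np.linalg.norm(x_full)
    print(f"{label}: relative error {err:.4f}")
```

Only the m-dimensional sketched problems are solved by the workers; no worker ever forms the full Hessian, which is the communication and computation bottleneck the entry targets.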
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.