Sublinear Time Algorithms for Several Geometric Optimization (With
Outliers) Problems In Machine Learning
- URL: http://arxiv.org/abs/2301.02870v1
- Date: Sat, 7 Jan 2023 15:03:45 GMT
- Title: Sublinear Time Algorithms for Several Geometric Optimization (With
Outliers) Problems In Machine Learning
- Authors: Hu Ding
- Abstract summary: We revisit the Minimum Enclosing Ball (MEB) problem in Euclidean space $\mathbb{R}^d$.
We introduce the notion of stability for MEB, which is natural and easy to understand.
Under the stability assumption, we present two sampling algorithms for computing radius-approximate MEB.
- Score: 8.794896028549323
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we study several important geometric optimization problems
arising in machine learning. First, we revisit the Minimum Enclosing Ball (MEB)
problem in Euclidean space $\mathbb{R}^d$. The problem has been extensively
studied before, but real-world machine learning tasks often involve datasets so
large that even linear time algorithms are unaffordable.
Motivated by the recent studies on {\em beyond worst-case analysis}, we
introduce the notion of stability for MEB, which is natural and easy to
understand. Roughly speaking, an instance of MEB is stable if the radius of
the resulting ball cannot be significantly reduced by removing a small fraction
of the input points. Under the stability assumption, we present two sampling
algorithms for computing radius-approximate MEB with sample complexities
independent of the number of input points $n$. In particular, the second
algorithm has a sample complexity that is independent even of the dimensionality $d$.
We also consider the general case without the stability assumption. We present
a hybrid algorithm that can output either a radius-approximate MEB or a
covering-approximate MEB. Our algorithm improves the running time and the
number of passes for the previous sublinear MEB algorithms. Our method relies
on two novel techniques, the Uniform-Adaptive Sampling method and Sandwich
Lemma. Furthermore, we observe that these two techniques can be generalized to
design sublinear time algorithms for a broader range of geometric optimization
problems with outliers in high dimensions, including MEB with outliers,
one-class and two-class linear SVMs with outliers, $k$-center clustering with
outliers, and flat fitting with outliers. Our proposed algorithms also apply
to the kernelized versions of these problems.
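To make the sampling idea concrete, here is a minimal sketch in Python. It is not the paper's Uniform-Adaptive Sampling method: it simply draws a uniform sample of fixed size (independent of $n$), uses the sample mean as a cheap surrogate center, and reports the covering radius of the sample. The function name and parameters are hypothetical.

```python
import numpy as np

def approx_meb_by_sampling(points, sample_size, rng=None):
    """Estimate a minimum enclosing ball from a uniform sample.

    Illustrative sketch only (not the paper's algorithm): the center is
    the sample mean, a cheap surrogate for the true MEB center, and the
    radius is the distance to the farthest sample point from that center.
    """
    rng = rng or np.random.default_rng(0)
    idx = rng.choice(len(points), size=min(sample_size, len(points)), replace=False)
    sample = points[idx]
    center = sample.mean(axis=0)
    radius = np.linalg.norm(sample - center, axis=1).max()
    return center, radius

# The sample size is a fixed constant, independent of n, mirroring the
# sample-complexity guarantees stated in the abstract.
pts = np.random.default_rng(1).normal(size=(100_000, 50))
center, radius = approx_meb_by_sampling(pts, sample_size=512)
```

The intuition matching the stability assumption: if removing a small fraction of points cannot shrink the ball much, then a small uniform sample already pins down the radius up to a controlled error, which is what makes a sublinear scheme plausible.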
Related papers
- Closing the Computational-Query Depth Gap in Parallel Stochastic Convex Optimization [26.36906884097317]
We develop a new parallel algorithm for minimizing Lipschitz, convex functions with a subgradient oracle.
Our result closes a gap between the best-known query depth and the best-known computational depth of parallel algorithms.
arXiv Detail & Related papers (2024-06-11T15:41:48Z)
- A GPU-Accelerated Bi-linear ADMM Algorithm for Distributed Sparse Machine Learning [4.258375398293221]
Bi-cADMM is aimed at solving large-scale regularized Sparse Machine Learning problems defined over a network of computational nodes.
Bi-cADMM is implemented within an open-source Python package called Parallel Sparse Fitting Toolbox.
arXiv Detail & Related papers (2024-05-25T15:11:34Z)
- Randomized Algorithms for Symmetric Nonnegative Matrix Factorization [2.1753766244387402]
Symmetric Nonnegative Matrix Factorization (SymNMF) is a technique in data analysis and machine learning.
We develop two randomized algorithms for its computation.
We show that our methods approximately maintain solution quality and achieve significant speed ups for both large dense and large sparse problems.
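For context, SymNMF seeks a nonnegative $H$ with $A \approx HH^T$. The snippet below is a plain projected-gradient baseline for that objective, included only to make the problem concrete; it is not one of the paper's randomized algorithms, and the step size is an illustrative choice.

```python
import numpy as np

def symnmf_projected_gradient(A, rank, steps=200, lr=1e-3, seed=0):
    """Minimize ||A - H H^T||_F^2 over H >= 0 by projected gradient descent.

    Plain baseline for Symmetric NMF (illustration only); the cited paper
    speeds this computation up with randomization.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((A.shape[0], rank))
    for _ in range(steps):
        grad = 4.0 * (H @ H.T - A) @ H        # gradient of the Frobenius objective
        H = np.maximum(H - lr * grad, 0.0)    # project onto the nonnegative orthant
    return H
```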
arXiv Detail & Related papers (2024-02-13T00:02:05Z)
- Efficient Algorithms for Sparse Moment Problems without Separation [6.732901486505047]
We consider the sparse moment problem of learning a $k$-spike mixture in high-dimensional space from noisy moment information.
Previous algorithms either rely on separation assumptions, require more moments for recovery, or run in (super-)exponential time.
Our algorithm for the high-dimensional case determines the coordinates of each spike by aligning a 1d projection of the mixture to a random vector and a set of 2d projections of the mixture.
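The geometric idea behind the projection step can be pictured as follows: distinct spike locations remain distinct under a random 1d projection (with probability 1), which is what lets 1d recoveries be aligned back into full coordinates. A toy sketch, with the moment-recovery machinery omitted and all names hypothetical:

```python
import numpy as np

def project_spikes(spikes, direction):
    """Project k spike locations in R^d onto a direction, yielding a 1d instance.

    For a random direction, distinct spikes map to distinct 1d positions
    almost surely, so 1d solutions can later be matched across a set of
    2d projections to recover full coordinates.
    """
    direction = direction / np.linalg.norm(direction)
    return spikes @ direction

rng = np.random.default_rng(0)
spikes = rng.normal(size=(4, 10))   # k = 4 hypothetical spikes in R^10
u = rng.normal(size=10)             # random projection direction
print(np.sort(project_spikes(spikes, u)))
```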
arXiv Detail & Related papers (2022-07-26T16:17:32Z)
- On a class of geodesically convex optimization problems solved via Euclidean MM methods [50.428784381385164]
We show how a class of geodesically convex problems arising in statistics and machine learning can be written as a difference of Euclidean convex functions, enabling their solution via Euclidean MM methods.
Ultimately, this broadens the class of problems to which such methods apply.
arXiv Detail & Related papers (2022-06-22T23:57:40Z)
- High-Dimensional Sparse Bayesian Learning without Covariance Matrices [66.60078365202867]
We introduce a new inference scheme that avoids explicit construction of the covariance matrix.
Our approach couples a little-known diagonal estimation result from numerical linear algebra with the conjugate gradient algorithm.
On several simulations, our method scales better than existing approaches in computation time and memory.
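The diagonal estimation result referred to is in the spirit of stochastic probing estimators (Bekas-style), which need only matrix-vector products, exactly what a conjugate gradient routine can supply without forming the covariance matrix. A generic sketch of that estimator, not the paper's exact scheme:

```python
import numpy as np

def estimate_diagonal(matvec, dim, num_probes=64, seed=0):
    """Estimate diag(M) using only matrix-vector products v -> M v.

    For Rademacher probes v, E[v * (M v)] = diag(M). In a covariance-free
    scheme, matvec would itself be realized with conjugate gradients, so
    M is never built explicitly. Generic sketch, not the paper's method.
    """
    rng = np.random.default_rng(seed)
    acc = np.zeros(dim)
    for _ in range(num_probes):
        v = rng.choice([-1.0, 1.0], size=dim)  # Rademacher probe vector
        acc += v * matvec(v)                   # elementwise v * (M v)
    return acc / num_probes

# Sanity check against an explicit matrix:
M = np.diag([1.0, 2.0, 3.0]) + 0.01
print(estimate_diagonal(lambda v: M @ v, dim=3))
```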
arXiv Detail & Related papers (2022-02-25T16:35:26Z)
- Covariance-Free Sparse Bayesian Learning [62.24008859844098]
We introduce a new SBL inference algorithm that avoids explicit inversions of the covariance matrix.
Our method can be up to thousands of times faster than existing baselines.
We showcase how our new algorithm enables SBL to tractably tackle high-dimensional signal recovery problems.
arXiv Detail & Related papers (2021-05-21T16:20:07Z)
- Hybrid Trilinear and Bilinear Programming for Aligning Partially Overlapping Point Sets [85.71360365315128]
In many applications, we need algorithms that can align partially overlapping point sets and are invariant to the corresponding transformations, as in the robust point matching (RPM) formulation.
We first show that the objective is a cubic polynomial function. We then utilize the convex envelopes of trilinear and bilinear monomials to derive its lower bound.
We next develop a branch-and-bound (BnB) algorithm which only branches over the transformation variables and runs efficiently.
arXiv Detail & Related papers (2021-01-19T04:24:23Z)
- Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point each step.
Our results are expressed in the form of simultaneous convergence on both the primal and dual sides.
arXiv Detail & Related papers (2020-08-23T20:36:49Z)
- Effective Dimension Adaptive Sketching Methods for Faster Regularized Least-Squares Optimization [56.05635751529922]
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching.
We consider two of the most popular random embeddings, namely Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT).
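A minimal sketch-and-solve example with a Gaussian embedding follows (the SRHT variant would replace the dense S below with a subsampled randomized Hadamard transform). The fixed sketch size here is an illustrative assumption; the paper's point is to adapt it to the problem's effective dimension.

```python
import numpy as np

def sketched_ridge(A, b, lam, sketch_size, seed=0):
    """Solve min_x ||S A x - S b||^2 + lam ||x||^2 with a Gaussian sketch S.

    Compresses the n x d problem to sketch_size x d (sketch_size << n),
    then solves the small regularized normal equations. Illustration only.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    S = rng.normal(size=(sketch_size, n)) / np.sqrt(sketch_size)  # Gaussian embedding
    SA, Sb = S @ A, S @ b
    return np.linalg.solve(SA.T @ SA + lam * np.eye(d), SA.T @ Sb)

rng = np.random.default_rng(1)
A = rng.normal(size=(10_000, 50))
b = A @ rng.normal(size=50) + 0.1 * rng.normal(size=10_000)
x_hat = sketched_ridge(A, b, lam=1.0, sketch_size=500)
```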
arXiv Detail & Related papers (2020-06-10T15:00:09Z)