k-means++: few more steps yield constant approximation
- URL: http://arxiv.org/abs/2002.07784v1
- Date: Tue, 18 Feb 2020 18:28:25 GMT
- Title: k-means++: few more steps yield constant approximation
- Authors: Davin Choo, Christoph Grunau, Julian Portmann, Václav Rozhoň
- Abstract summary: The k-means++ algorithm of Arthur and Vassilvitskii (SODA 2007) is a state-of-the-art algorithm for solving the k-means clustering problem.
Recently, Lattanzi and Sohler (ICML 2019) proposed augmenting k-means++ with O(k log log k) local search steps to yield a constant approximation (in expectation) to the k-means clustering problem.
- Score: 3.7468898363447654
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The k-means++ algorithm of Arthur and Vassilvitskii (SODA 2007) is a
state-of-the-art algorithm for solving the k-means clustering problem and is
known to give an O(log k)-approximation in expectation. Recently, Lattanzi and
Sohler (ICML 2019) proposed augmenting k-means++ with O(k log log k) local
search steps to yield a constant approximation (in expectation) to the k-means
clustering problem. In this paper, we improve their analysis to show that, for
any arbitrarily small constant $\varepsilon > 0$, with only $\varepsilon k$ additional local
search steps, one can achieve a constant approximation guarantee (with high
probability in k), resolving an open problem in their paper.
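For intuition, the seeding-plus-local-search pipeline studied here can be sketched in pure Python. This is a minimal illustration, not the authors' implementation: function names are ours, and the local-search step shown is the simple single-swap variant (D²-sample a candidate, apply the best swap if it lowers the cost).

```python
import random

def d2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest_d2(p, centers):
    """Squared distance from p to its nearest center."""
    return min(d2(p, c) for c in centers)

def cost(X, centers):
    """k-means cost: total squared distance to the nearest centers."""
    return sum(nearest_d2(p, centers) for p in X)

def d2_sample(X, centers, rng):
    """Sample a point with probability proportional to its squared
    distance to the current centers (the D^2 distribution)."""
    weights = [nearest_d2(p, centers) for p in X]
    return rng.choices(X, weights=weights, k=1)[0]

def kmeans_pp(X, k, rng):
    """k-means++ seeding: first center uniform, the rest D^2-sampled."""
    centers = [rng.choice(X)]
    while len(centers) < k:
        centers.append(d2_sample(X, centers, rng))
    return centers

def local_search(X, centers, steps, rng):
    """Single-swap local search: D^2-sample a candidate center and
    perform the best swap whenever it strictly lowers the cost."""
    for _ in range(steps):
        cand = d2_sample(X, centers, rng)
        best_cost, best_j = cost(X, centers), None
        for j in range(len(centers)):
            c = cost(X, centers[:j] + [cand] + centers[j + 1:])
            if c < best_cost:
                best_cost, best_j = c, j
        if best_j is not None:
            centers[best_j] = cand
    return centers
```

The paper's result says that running `local_search` for only εk steps after seeding (rather than O(k log log k)) already suffices for a constant-factor guarantee with high probability in k.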
Related papers
- Multi-Swap $k$-Means++ [30.967186562175893]
The $k$-means++ algorithm of Arthur and Vassilvitskii (SODA 2007) is often the practitioners' choice algorithm for optimizing the popular $k$-means clustering objective.
Lattanzi and Sohler (ICML 2019) proposed augmenting $k$-means++ with $O(k \log \log k)$ local search steps to yield a $c$-approximation to the $k$-means clustering problem.
arXiv Detail & Related papers (2023-09-28T12:31:35Z) - Rethinking k-means from manifold learning perspective [122.38667613245151]
We present a new clustering algorithm which directly detects clusters of data without mean estimation.
Specifically, we construct a distance matrix between data points using a Butterworth filter.
To well exploit the complementary information embedded in different views, we leverage the tensor Schatten p-norm regularization.
arXiv Detail & Related papers (2023-05-12T03:01:41Z) - Sketch-and-solve approaches to k-means clustering by semidefinite
programming [14.930208990741132]
We introduce a sketch-and-solve approach to speed up the Peng-Wei semidefinite relaxation of k-means clustering.
If the data is appropriately separated, our approach identifies the optimal k-means clustering.
Otherwise, our approach provides a high-confidence lower bound on the optimal k-means value.
arXiv Detail & Related papers (2022-11-28T19:51:30Z) - Hessian Averaging in Stochastic Newton Methods Achieves Superlinear
Convergence [69.65563161962245]
We consider minimizing a smooth and strongly convex objective function using a stochastic Newton method.
We show that there exists a universal weighted averaging scheme that transitions to local convergence at an optimal stage.
arXiv Detail & Related papers (2022-04-20T07:14:21Z) - Solving correlation clustering with QAOA and a Rydberg qudit system: a
full-stack approach [94.37521840642141]
We study the correlation clustering problem using the quantum approximate optimization algorithm (QAOA) and qudits.
Specifically, we consider a neutral atom quantum computer and propose a full stack approach for correlation clustering.
We show the qudit implementation is superior to the qubit encoding as quantified by the gate count.
arXiv Detail & Related papers (2021-06-22T11:07:38Z) - Fuzzy Clustering with Similarity Queries [56.96625809888241]
The fuzzy, or soft, objective is a popular generalization of the well-known $k$-means problem.
We show that by making a small number of similarity queries, the problem becomes easier to solve.
arXiv Detail & Related papers (2021-06-04T02:32:26Z) - Improved Guarantees for k-means++ and k-means++ Parallel [18.26254785549146]
k-means++ and k-means++ parallel are popular algorithms for the classic k-means clustering problem.
We show improved approximation and bi-criteria approximation guarantees for k-means++ and k-means++ parallel.
We propose a new variant of k-means++ parallel algorithm (Exponential Race k-means++) that has the same approximation guarantees as k-means++.
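The "exponential race" idea named in that variant can be illustrated with a standard sampling fact: drawing independent exponentials $E_i \sim \mathrm{Exp}(\text{rate}=w_i)$ and taking the argmin selects index $i$ with probability $w_i / \sum_j w_j$. The toy check below demonstrates only this building block, with illustrative weights; the paper's actual algorithm is more involved.

```python
import random

# Exponential-race sampling trick: if E_i ~ Exp(rate=w_i) independently,
# then argmin_i E_i equals i with probability w_i / sum(w).
# Weights here are illustrative.
rng = random.Random(0)
w = [1.0, 2.0, 3.0]
trials = 30000
counts = [0] * len(w)
for _ in range(trials):
    races = [rng.expovariate(wi) for wi in w]
    counts[races.index(min(races))] += 1
freqs = [c / trials for c in counts]
# freqs should be close to w / sum(w) = [1/6, 1/3, 1/2]
```

The appeal of the trick in a clustering context is that races can be run independently per point and merged with a single min, which plays well with parallel and streaming settings.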
arXiv Detail & Related papers (2020-10-27T17:46:00Z) - Differentially Private Clustering: Tight Approximation Ratios [57.89473217052714]
We give efficient differentially private algorithms for basic clustering problems.
Our results imply an improved algorithm for the Sample and Aggregate privacy framework.
One of the tools used in our 1-Cluster algorithm can be employed to get a faster quantum algorithm for ClosestPair in a moderate number of dimensions.
arXiv Detail & Related papers (2020-08-18T16:22:06Z) - Breathing K-Means: Superior K-Means Solutions through Dynamic K-Values [0.0]
We introduce the breathing k-means algorithm, which on average significantly improves solutions obtained by the widely-known greedy k-means++ algorithm.
The improvements are achieved through a novel "breathing" technique that cyclically increases and decreases the number of centroids based on local error and utility measures.
arXiv Detail & Related papers (2020-06-28T17:49:37Z) - k-sums: another side of k-means [34.317086730703124]
In this paper, the decades-old clustering method k-means is revisited.
The original distortion minimization model of k-means is addressed by a pure minimization procedure.
A new target function that minimizes pairwise distances within clusters is presented.
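The pairwise target relates to the usual k-means objective through a standard identity: within a cluster $C$ with mean $\mu$, the sum of squared distances over ordered pairs equals $2|C|$ times the within-cluster sum of squares. A quick numerical check on toy 1-D data (our own sketch, with illustrative values):

```python
import random

# Numerical check of the identity linking pairwise distances to the
# k-means (within-cluster sum of squares) objective:
#   sum over ordered pairs (i, j) in C of (x_i - x_j)^2
#     = 2 * |C| * sum over i in C of (x_i - mu)^2
rng = random.Random(1)
C = [rng.uniform(0.0, 10.0) for _ in range(7)]
mu = sum(C) / len(C)
pairwise = sum((a - b) ** 2 for a in C for b in C)
within = sum((a - mu) ** 2 for a in C)
assert abs(pairwise - 2 * len(C) * within) < 1e-6
```

This identity is why minimizing pairwise distances within clusters can serve as "another side" of the mean-based k-means formulation without estimating cluster means explicitly.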
arXiv Detail & Related papers (2020-05-19T14:36:12Z) - Ball k-means [53.89505717006118]
The Ball k-means algorithm uses a ball to describe a cluster, focusing on reducing the point-centroid distance computation.
Its fast speed, lack of extra parameters, and simple design make Ball k-means an all-around replacement for the naive k-means algorithm.
arXiv Detail & Related papers (2020-05-02T10:39:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.