On Generalization Bounds for Projective Clustering
- URL: http://arxiv.org/abs/2310.09127v1
- Date: Fri, 13 Oct 2023 14:15:54 GMT
- Title: On Generalization Bounds for Projective Clustering
- Authors: Maria Sofia Bucarelli, Matilde Fjeldsø Larsen, Chris Schwiegelshohn, Mads Bech Toftrup
- Abstract summary: Given a set of points, clustering consists of finding a partition of the point set into $k$ clusters such that the center to which a point is assigned is as close as possible.
For center-based objectives, we show a convergence rate of $\tilde{O}\left(\sqrt{k/n}\right)$.
For subspace clustering with $j$-dimensional subspaces, we show a convergence rate of $\tilde{O}\left(\sqrt{\frac{kj^2}{n}}\right)$.
- Score: 3.4490841399255823
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Given a set of points, clustering consists of finding a partition of a point
set into $k$ clusters such that the center to which a point is assigned is as
close as possible. Most commonly, centers are points themselves, which leads to
the famous $k$-median and $k$-means objectives. One may also choose centers to
be $j$ dimensional subspaces, which gives rise to subspace clustering. In this
paper, we consider learning bounds for these problems. That is, given a set of
$n$ samples $P$ drawn independently from some unknown, but fixed distribution
$\mathcal{D}$, how quickly does a solution computed on $P$ converge to the
optimal clustering of $\mathcal{D}$? We give several near optimal results. In
particular,
For center-based objectives, we show a convergence rate of
$\tilde{O}\left(\sqrt{{k}/{n}}\right)$. This matches the known optimal bounds
of [Fefferman, Mitter, and Narayanan, Journal of the American Mathematical Society 2016]
and [Bartlett, Linder, and Lugosi, IEEE Trans. Inf. Theory 1998] for $k$-means
and extends them to other important objectives such as $k$-median.
For subspace clustering with $j$-dimensional subspaces, we show a convergence
rate of $\tilde{O}\left(\sqrt{\frac{kj^2}{n}}\right)$. These are the first
provable bounds for most of these problems. For the specific case of projective
clustering, which generalizes $k$-means, we show a convergence rate of
$\Omega\left(\sqrt{\frac{kj}{n}}\right)$ is necessary, thereby proving that the
bounds from [Fefferman, Mitter, and Narayanan, Journal of the American Mathematical
Society 2016] are essentially optimal.
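To make the learning setting in the abstract concrete, the sketch below fits $k$-means centers on $n$ i.i.d. samples from a fixed distribution $\mathcal{D}$ and compares their empirical cost with their cost on a large fresh sample standing in for $\mathcal{D}$. This is not code from the paper: the Gaussian-mixture $\mathcal{D}$, the choice of $k$, and the sample sizes are illustrative assumptions, and the printed gap is meant only as a qualitative counterpart to the $\tilde{O}(\sqrt{k/n})$ rate stated above.

```python
# Minimal sketch (not from the paper): empirically probing how fast the cost
# of k-means centers fit on n samples approaches their cost under the
# sampling distribution D. The mixture D, k, and sample sizes are assumptions.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
d, k = 5, 10
MEANS = 5.0 * rng.normal(size=(4, d))     # fixed component means defining D

def sample_D(m):
    """Draw m i.i.d. points from D, a 4-component Gaussian mixture in R^d."""
    labels = rng.integers(0, len(MEANS), size=m)
    return MEANS[labels] + rng.normal(size=(m, d))

def kmeans_cost(points, centers):
    """Average squared distance from each point to its nearest center."""
    sq_dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return sq_dists.min(axis=1).mean()

test = sample_D(50_000)                   # large fresh sample as a proxy for D
for n in (100, 400, 1600, 6400):
    P = sample_D(n)
    centers = KMeans(n_clusters=k, n_init=10, random_state=0).fit(P).cluster_centers_
    gap = kmeans_cost(test, centers) - kmeans_cost(P, centers)
    print(f"n={n:5d}  population-vs-empirical gap={gap:+.4f}  sqrt(k/n)={np.sqrt(k/n):.4f}")
```

Running the loop for increasing $n$ should show the gap shrinking roughly like $\sqrt{k/n}$, up to constants and logarithmic factors hidden by the $\tilde{O}$ notation.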
Related papers
- Clustering to Minimize Cluster-Aware Norm Objectives [0.3481985817302898]
We seek to partition a given set $P$ of data points into $k$ clusters by finding a set $X$ of $k$ centers.
The cost of a cluster, represented by a center $x \in X$, is a monotone, symmetric norm $f$ (inner norm) of the vector of distances of points assigned to $x$.
The goal is to minimize a norm $g$ (outer norm) of the vector of cluster costs; a small illustrative sketch of this composed objective appears after the list below.
arXiv Detail & Related papers (2024-10-31T16:33:40Z)
- Revisiting Step-Size Assumptions in Stochastic Approximation [1.3654846342364308]
The paper revisits step-size selection in a general Markovian setting.
A major conclusion is that the choice of $\rho = 0$ or even $\rho < 1/2$ is justified only in select settings.
arXiv Detail & Related papers (2024-05-28T05:11:05Z)
- Nearly Optimal Regret for Decentralized Online Convex Optimization [53.433398074919]
Decentralized online convex optimization (D-OCO) aims to minimize a sequence of global loss functions using only local computations and communications.
We develop novel D-OCO algorithms that can respectively reduce the regret bounds for convex and strongly convex functions.
Our algorithms are nearly optimal in terms of $T$, $n$, and $\rho$.
arXiv Detail & Related papers (2024-02-14T13:44:16Z)
- A Unified Framework for Gradient-based Clustering of Distributed Data [51.904327888475606]
We develop a family of distributed clustering algorithms that work over networks of users.
DGC-$\mathcal{F}_\rho$ is specialized to popular clustering losses like $K$-means and Huber loss.
We show that consensus fixed points of DGC-$\mathcal{F}_\rho$ are equivalent to fixed points of gradient clustering over the full data.
arXiv Detail & Related papers (2024-02-02T10:44:42Z)
- Do you know what q-means? [50.045011844765185]
Clustering is one of the most important tools for analysis of large datasets.
We present an improved version of the "$q$-means" algorithm for clustering.
We also present a "dequantized" algorithm for $\varepsilon$-$k$-means which runs in $O\big(\frac{k^2}{\varepsilon^2}(\sqrt{k}d + \log(Nd))\big)$.
arXiv Detail & Related papers (2023-08-18T17:52:12Z)
- Parameterized Approximation for Robust Clustering in Discrete Geometric Spaces [2.687607197645453]
We show that even the special case of $k$-Center in dimension $\Theta(\log n)$ is $(\sqrt{3/2} - o(1))$-hard to approximate for FPT algorithms.
arXiv Detail & Related papers (2023-05-12T08:43:28Z)
- Improved Learning-augmented Algorithms for k-means and k-medians Clustering [8.04779839951237]
We consider the problem of clustering in the learning-augmented setting, where we are given a data set in $d$-dimensional Euclidean space.
We propose a deterministic $k$-means algorithm that produces centers with improved bound on clustering cost.
Our algorithm works even when the predictions are not very accurate, i.e., our bound holds for $\alpha$ up to $1/2$, an improvement over $\alpha$ being at most $1/7$ in the previous work.
arXiv Detail & Related papers (2022-10-31T03:00:11Z)
- Near-optimal fitting of ellipsoids to random points [68.12685213894112]
A basic problem of fitting an ellipsoid to random points has connections to low-rank matrix decompositions, independent component analysis, and principal component analysis.
We resolve this conjecture up to logarithmic factors by constructing a fitting ellipsoid for some $n = \Omega(\,d^2/\mathrm{polylog}(d)\,)$.
Our proof demonstrates feasibility of the least squares construction of Saunderson et al. using a convenient decomposition of a certain non-standard random matrix.
arXiv Detail & Related papers (2022-08-19T18:00:34Z)
- FriendlyCore: Practical Differentially Private Aggregation [67.04951703461657]
We propose a simple and practical tool $\mathsf{FriendlyCore}$ that takes a set of points $\mathcal{D}$ from an unrestricted (pseudo) metric space as input.
When $\mathcal{D}$ has effective diameter $r$, $\mathsf{FriendlyCore}$ returns a "stable" subset $\mathcal{D}_G \subseteq \mathcal{D}$ that includes all points.
$\mathsf{FriendlyCore}$ can be used to preprocess the input before privately aggregating it, potentially simplifying the aggregation or boosting its accuracy.
arXiv Detail & Related papers (2021-10-19T17:43:50Z)
- Spectral properties of sample covariance matrices arising from random matrices with independent non identically distributed columns [50.053491972003656]
It was previously shown that the functionals $\mathrm{tr}(AR(z))$, for $R(z) = (\frac{1}{n}XX^T - zI_p)^{-1}$ and $A \in \mathcal{M}_p$ deterministic, have a standard deviation of order $O(\|A\|_* / \sqrt{n})$.
Here, we show that $\|\mathbb{E}[R(z)] - \tilde{R}(z)\|_F$
arXiv Detail & Related papers (2021-09-06T14:21:43Z)
- Local Correlation Clustering with Asymmetric Classification Errors [12.277755088736864]
In the Correlation Clustering problem, we are given a complete weighted graph $G$ with its edges labeled as "similar" and "dissimilar".
For a clustering $\mathcal{C}$ of graph $G$, a similar edge is in disagreement with $\mathcal{C}$ if its endpoints belong to distinct clusters, and a dissimilar edge is in disagreement with $\mathcal{C}$ if its endpoints belong to the same cluster.
We produce a clustering that minimizes the $\ell_p$ norm of the disagreements vector for $p \geq 1$; a small illustrative sketch of this objective appears after the list below.
arXiv Detail & Related papers (2021-08-11T12:31:48Z)
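As referenced in the entry on "Clustering to Minimize Cluster-Aware Norm Objectives" above, the composed inner/outer norm objective can be evaluated directly once points are assigned to their nearest centers. The sketch below is an illustrative assumption, not that paper's code: it uses $f = \ell_2$ as the inner norm and $g = \ell_\infty$ as the outer norm on synthetic data.

```python
# Illustrative sketch of the (f, g) cluster-aware norm objective: each point
# is assigned to its nearest center in X, the inner norm f aggregates the
# distance vector within each cluster, and the outer norm g aggregates the
# vector of cluster costs. The norm choices and data are assumptions.
import numpy as np

def cluster_aware_cost(P, X,
                       inner=lambda v: np.linalg.norm(v, 2),
                       outer=lambda v: np.linalg.norm(v, np.inf)):
    dists = np.linalg.norm(P[:, None, :] - X[None, :, :], axis=-1)
    assign = dists.argmin(axis=1)                   # nearest-center assignment
    cluster_costs = np.array([
        inner(dists[assign == c, c]) if np.any(assign == c) else 0.0
        for c in range(len(X))
    ])
    return outer(cluster_costs)                     # outer norm over cluster costs

rng = np.random.default_rng(1)
P = rng.normal(size=(200, 2))                       # toy point set
X = rng.normal(size=(3, 2))                         # k = 3 candidate centers
print(cluster_aware_cost(P, X))
```

With $f = g = \ell_1$ the same routine computes the $k$-median cost, and with $f = g = \ell_\infty$ it computes the $k$-center cost, which is why the inner/outer norm view generalizes these center-based objectives.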
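Similarly, for the "Local Correlation Clustering with Asymmetric Classification Errors" entry, the disagreements vector can be computed directly from an edge-labeled graph and a candidate clustering. The sketch below is illustrative only; in particular, it takes the disagreements vector to be indexed by vertices (each vertex counting its incident disagreeing edges), an assumption consistent with the local correlation clustering setting but not spelled out in the summary above.

```python
# Illustrative sketch (an assumption, not that paper's code) of an l_p
# disagreement objective for correlation clustering. A "similar" edge
# disagrees with the clustering if its endpoints are split across clusters;
# a "dissimilar" edge disagrees if its endpoints share a cluster.
import numpy as np

def lp_disagreement(edges, clustering, p=1):
    """edges: iterable of (u, v, label); clustering: dict vertex -> cluster id."""
    disagree = {v: 0 for v in clustering}
    for u, v, label in edges:
        same_cluster = clustering[u] == clustering[v]
        if (label == "similar" and not same_cluster) or \
           (label == "dissimilar" and same_cluster):
            disagree[u] += 1          # each endpoint records the disagreement
            disagree[v] += 1
    vec = np.array([disagree[v] for v in sorted(disagree)], dtype=float)
    return np.linalg.norm(vec, p)

# Toy instance: a triangle with two "similar" edges and one "dissimilar" edge.
edges = [(0, 1, "similar"), (1, 2, "dissimilar"), (0, 2, "similar")]
clustering = {0: "A", 1: "A", 2: "B"}
print(lp_disagreement(edges, clustering, p=1))   # disagreements counted per endpoint
print(lp_disagreement(edges, clustering, p=2))   # l_2 of the per-vertex disagreement vector
```

For $p = 1$ every disagreeing edge is counted once at each endpoint, so minimizing the $\ell_1$ norm is equivalent to minimizing the total number of disagreements; larger $p$ penalizes vertices with many incident disagreements more heavily.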