Computational Hardness of Private Coreset
- URL: http://arxiv.org/abs/2602.17488v1
- Date: Thu, 19 Feb 2026 15:58:49 GMT
- Title: Computational Hardness of Private Coreset
- Authors: Badih Ghazi, Cristóbal Guzmán, Pritish Kamath, Alexander Knop, Ravi Kumar, Pasin Manurangsi,
- Abstract summary: For a given input set of points, a coreset is another set of points such that the $k$-means objective for any candidate solution is preserved up to a multiplicative $(, 1/n(1))$ factor (and some additive factor)<n>We show that no-time $(, 1/n(1))$-DP algorithm can compute a coreset for $k$-means in the $ell_infty$-metric for some constant $> 0$ (and some constant additive factor)<n>For $k$-means in the
- Score: 84.99100741615423
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of differentially private (DP) computation of coreset for the $k$-means objective. For a given input set of points, a coreset is another set of points such that the $k$-means objective for any candidate solution is preserved up to a multiplicative $(1 \pm α)$ factor (and some additive factor). We prove the first computational lower bounds for this problem. Specifically, assuming the existence of one-way functions, we show that no polynomial-time $(ε, 1/n^{ω(1)})$-DP algorithm can compute a coreset for $k$-means in the $\ell_\infty$-metric for some constant $α> 0$ (and some constant additive factor), even for $k=3$. For $k$-means in the Euclidean metric, we show a similar result but only for $α= Θ\left(1/d^2\right)$, where $d$ is the dimension.
Related papers
- Linear time small coresets for k-mean clustering of segments with applications [4.759823735082844]
We study the $k$-means problem for a set $mathcalS subseteq mathbbRd$ of $n$ segments.<n>For any $varepsilon > 0$, an $varepsilon$-coreset is a weighted subset $C subseteq mathbbRd$ that approximates $D(mathcalS,X)$ within a factor of $1 pm varepsilon$ for any set of $k$ centers.
arXiv Detail & Related papers (2025-11-16T11:48:55Z) - Guessing Efficiently for Constrained Subspace Approximation [49.83981776254246]
We introduce a general framework for constrained subspace approximation.<n>We show it provides new algorithms for partition-constrained subspace approximation with applications to $k$-means clustering, and projected non-negative matrix factorization.
arXiv Detail & Related papers (2025-04-29T15:56:48Z) - Ridge Leverage Score Sampling for $\ell_p$ Subspace Approximation [47.790126028106734]
A popular approach to cope with the NP-hardness is to compute a strong coreset.<n>We obtain an algorithm for constructing a strong coreset for $ell_p$ subspace approximation of size $tilde O(kepsilon-4/p)$ for $p2$ and $tilde O(kp/2epsilon-p)$ for $p>2$.
arXiv Detail & Related papers (2024-07-03T16:49:28Z) - Do you know what q-means? [42.96240569413475]
We present a classical $varepsilon$-$k$-means algorithm that performs an approximate version of one iteration of Lloyd's algorithm with time complexity.<n>We also propose an improved $q$-means quantum algorithm with time complexity.
arXiv Detail & Related papers (2023-08-18T17:52:12Z) - Differentially Private Clustering in Data Streams [56.26040303056582]
We provide the first differentially private algorithms for $k$-means and $k$-median clustering of $d$-dimensional Euclidean data points over a stream with length at most $T$.<n>Our main technical contribution is a differentially private clustering framework for data streams which only requires an offline DP coreset or clustering algorithm as a blackbox.
arXiv Detail & Related papers (2023-07-14T16:11:22Z) - Parameterized Approximation for Robust Clustering in Discrete Geometric Spaces [2.687607197645453]
We show that even the special case of $k$-Center in dimension $Theta(log n)$ is $(sqrt3/2- o(1))$hard to approximate for FPT algorithms.
We also show that even the special case of $k$-Center in dimension $Theta(log n)$ is $(sqrt3/2- o(1))$hard to approximate for FPT algorithms.
arXiv Detail & Related papers (2023-05-12T08:43:28Z) - Towards Optimal Lower Bounds for k-median and k-means Coresets [25.713987341159918]
Given a set of points in a metric space, the $(k,z)$-clustering problem consists of finding a set of $k$ points called centers.
We show that any coreset for $(k,z)$ clustering must consist of at least $Omega(k varepsilon-2 log n)$ and $Omega(k varepsilon-2 D)$ points.
arXiv Detail & Related papers (2022-02-25T16:13:28Z) - FriendlyCore: Practical Differentially Private Aggregation [67.04951703461657]
We propose a simple and practical tool $mathsfFriendlyCore$ that takes a set of points $cal D$ from an unrestricted (pseudo) metric space as input.
When $cal D$ has effective diameter $r$, $mathsfFriendlyCore$ returns a "stable" subset $cal D_Gsubseteq cal D$ that includes all points.
$mathsfFriendlyCore$ can be used to preprocess the input before privately aggregating it, potentially simplifying the aggregation or boosting its accuracy
arXiv Detail & Related papers (2021-10-19T17:43:50Z) - Clustering Mixture Models in Almost-Linear Time via List-Decodable Mean
Estimation [58.24280149662003]
We study the problem of list-decodable mean estimation, where an adversary can corrupt a majority of the dataset.
We develop new algorithms for list-decodable mean estimation, achieving nearly-optimal statistical guarantees.
arXiv Detail & Related papers (2021-06-16T03:34:14Z) - Locally Private $k$-Means Clustering with Constant Multiplicative
Approximation and Near-Optimal Additive Error [10.632986841188]
We bridge the gap between the exponents of $n$ in the upper and lower bounds on the additive error with two new algorithms.
It is possible to solve the locally private $k$-means problem in a constant number of rounds with constant factor multiplicative approximation.
arXiv Detail & Related papers (2021-05-31T14:41:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.