Coreset-based Strategies for Robust Center-type Problems
- URL: http://arxiv.org/abs/2002.07463v1
- Date: Tue, 18 Feb 2020 10:04:08 GMT
- Title: Coreset-based Strategies for Robust Center-type Problems
- Authors: Andrea Pietracaprina, Geppino Pucci, Federico Sold\`a
- Abstract summary: We devise coreset-based strategies for the two problems which yield efficient sequential, MapReduce, and Streaming algorithms.
For wide ranges of the parameters $k,zepsilon, D$, we obtain a sequential algorithm with running time linear in $|V|$, and MapReduce/Streaming algorithms with few rounds/passes and substantially sublinear local/working memory.
- Score: 0.6875312133832077
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Given a dataset $V$ of points from some metric space, the popular $k$-center
problem requires to identify a subset of $k$ points (centers) in $V$ minimizing
the maximum distance of any point of $V$ from its closest center. The
\emph{robust} formulation of the problem features a further parameter $z$ and
allows up to $z$ points of $V$ (outliers) to be disregarded when computing the
maximum distance from the centers. In this paper, we focus on two important
constrained variants of the robust $k$-center problem, namely, the Robust
Matroid Center (RMC) problem, where the set of returned centers are constrained
to be an independent set of a matroid of rank $k$ built on $V$, and the Robust
Knapsack Center (RKC) problem, where each element $i\in V$ is given a positive
weight $w_i<1$ and the aggregate weight of the returned centers must be at most
1. We devise coreset-based strategies for the two problems which yield
efficient sequential, MapReduce, and Streaming algorithms. More specifically,
for any fixed $\epsilon>0$, the algorithms return solutions featuring a
$(3+\epsilon)$-approximation ratio, which is a mere additive term $\epsilon$
away from the 3-approximations achievable by the best known polynomial-time
sequential algorithms for the two problems. Moreover, the algorithms
obliviously adapt to the intrinsic complexity of the dataset, captured by its
doubling dimension $D$. For wide ranges of the parameters $k,z,\epsilon, D$, we
obtain a sequential algorithm with running time linear in $|V|$, and
MapReduce/Streaming algorithms with few rounds/passes and substantially
sublinear local/working memory.
Related papers
- Nearly Linear Sparsification of $\ell_p$ Subspace Approximation [47.790126028106734]
A popular approach to cope with the NP-hardness of the $ell_p$ subspace approximation problem is to compute a strong coreset.
We obtain the first algorithm for constructing a strong coreset for $ell_p$ subspace approximation with a nearly optimal dependence on the rank parameter $k$.
Our techniques also lead to the first nearly optimal online strong coresets for $ell_p$ subspace approximation with similar bounds as the offline setting.
arXiv Detail & Related papers (2024-07-03T16:49:28Z) - A Scalable Algorithm for Individually Fair K-means Clustering [77.93955971520549]
We present a scalable algorithm for the individually fair ($p$, $k$)-clustering problem introduced by Jung et al. and Mahabadi et al.
A clustering is then called individually fair if it has centers within distance $delta(x)$ of $x$ for each $xin P$.
We show empirically that not only is our algorithm much faster than prior work, but it also produces lower-cost solutions.
arXiv Detail & Related papers (2024-02-09T19:01:48Z) - Do you know what q-means? [50.045011844765185]
Clustering is one of the most important tools for analysis of large datasets.
We present an improved version of the "$q$-means" algorithm for clustering.
We also present a "dequantized" algorithm for $varepsilon which runs in $Obig(frack2varepsilon2(sqrtkd + log(Nd))big.
arXiv Detail & Related papers (2023-08-18T17:52:12Z) - Dynamic algorithms for k-center on graphs [3.568439282784197]
We give the first efficient algorithms for the $k$-center problem on dynamic graphs undergoing edge updates.
We show a reduction that leads to a fully dynamic $(2+epsilon)$-approximation algorithm for the $k$-center problem.
arXiv Detail & Related papers (2023-07-28T13:50:57Z) - Parameterized Approximation for Robust Clustering in Discrete Geometric Spaces [2.687607197645453]
We show that even the special case of $k$-Center in dimension $Theta(log n)$ is $(sqrt3/2- o(1))$hard to approximate for FPT algorithms.
We also show that even the special case of $k$-Center in dimension $Theta(log n)$ is $(sqrt3/2- o(1))$hard to approximate for FPT algorithms.
arXiv Detail & Related papers (2023-05-12T08:43:28Z) - Clustering Mixture Models in Almost-Linear Time via List-Decodable Mean
Estimation [58.24280149662003]
We study the problem of list-decodable mean estimation, where an adversary can corrupt a majority of the dataset.
We develop new algorithms for list-decodable mean estimation, achieving nearly-optimal statistical guarantees.
arXiv Detail & Related papers (2021-06-16T03:34:14Z) - Revisiting Priority $k$-Center: Fairness and Outliers [5.3898004059026325]
We develop a framework that yields constant factor approximation algorithms for Priority $k$-Center with outliers.
Our framework extends to generalizations of Priority $k$-Center to matroid and knapsack constraints.
arXiv Detail & Related papers (2021-03-04T21:15:37Z) - No-Substitution $k$-means Clustering with Low Center Complexity and
Memory [5.837881923712394]
We develop an algorithm with center complexity bounded by $Lower_36, k(X)$ that is a true $O(1)$-approximation with its approximation factor being independent of $k$ or $n$.
This algorithm is the first known algorithm with center complexity bounded by $Lower_36, k(X)$ that is a true $O(1)$-approximation with its approximation factor being independent of $k$ or $n$.
arXiv Detail & Related papers (2021-02-18T01:20:03Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, for dimensions of $frac1lambdaepsilon$, one can obtain streaming algorithms with spacely smaller than $frac1lambdaepsilon$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z) - Fixed-Support Wasserstein Barycenters: Computational Hardness and Fast
Algorithm [100.11971836788437]
We study the fixed-support Wasserstein barycenter problem (FS-WBP)
We develop a provably fast textitdeterministic variant of the celebrated iterative Bregman projection (IBP) algorithm, named textscFastIBP.
arXiv Detail & Related papers (2020-02-12T03:40:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.