Constant Approximation for Individual Preference Stable Clustering
- URL: http://arxiv.org/abs/2309.16840v1
- Date: Thu, 28 Sep 2023 20:42:46 GMT
- Title: Constant Approximation for Individual Preference Stable Clustering
- Authors: Anders Aamand, Justin Y. Chen, Allen Liu, Sandeep Silwal, Pattara
Sukprasert, Ali Vakilian, Fred Zhang
- Abstract summary: Individual preference (IP) stability is a natural clustering objective inspired by stability and fairness constraints.
It was unknown if an $o(n)$-IP stable clustering always \emph{exists}, as the prior state of the art only guaranteed an $O(n)$-IP stable clustering.
We show that an $O(1)$-IP stable clustering always exists for general metrics, and we give an efficient algorithm which outputs such a clustering.
- Score: 26.716316367717585
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Individual preference (IP) stability, introduced by Ahmadi et al. (ICML
2022), is a natural clustering objective inspired by stability and fairness
constraints. A clustering is $\alpha$-IP stable if the average distance of
every data point to its own cluster is at most $\alpha$ times the average
distance to any other cluster. Unfortunately, determining if a dataset admits a
$1$-IP stable clustering is NP-Hard. Moreover, before this work, it was unknown
if an $o(n)$-IP stable clustering always \emph{exists}, as the prior state of
the art only guaranteed an $O(n)$-IP stable clustering. We close this gap in
understanding and show that an $O(1)$-IP stable clustering always exists for
general metrics, and we give an efficient algorithm which outputs such a
clustering. We also introduce generalizations of IP stability beyond average
distance and give efficient, near-optimal algorithms in the cases where we
consider the maximum and minimum distances within and between clusters.
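To make the definition concrete, the following Python sketch computes the smallest $\alpha$ for which a given clustering is $\alpha$-IP stable under the Euclidean metric. It is an illustration of the definition only, not code from the paper; the function name and interface are our own.

```python
import numpy as np

def ip_stability_factor(X, labels):
    """Smallest alpha for which the clustering given by `labels` is
    alpha-IP stable: for every point, the average distance to its own
    cluster must be at most alpha times the average distance to every
    other cluster."""
    X, labels = np.asarray(X, dtype=float), np.asarray(labels)
    n = len(X)
    # Pairwise Euclidean distances; the definition works for any metric.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    alpha = 0.0
    for i in range(n):
        not_i = np.arange(n) != i
        same = dists[i][(labels == labels[i]) & not_i]
        if same.size == 0:
            continue  # singleton cluster: the point is trivially stable
        d_own = same.mean()
        for c in np.unique(labels):
            if c == labels[i]:
                continue
            d_other = dists[i][labels == c].mean()
            if d_other > 0:
                alpha = max(alpha, d_own / d_other)
    return alpha
```

On the clustering returned by the paper's algorithm, this quantity is guaranteed to be $O(1)$ for any metric.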
Related papers
- Scalable Algorithms for Individual Preference Stable Clustering [8.01184332330228]
In this paper, we study the natural local search algorithm for IP stable clustering.
Our analysis confirms an $O(\log n)$-IP stability guarantee for this algorithm, where $n$ denotes the number of points in the input (a hedged sketch of one such local-search step appears below).
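For intuition, here is a minimal sketch of one plausible local-search step: any point violating $\beta$-IP stability is moved to the cluster it is closest to on average. The threshold $\beta$, the move rule, and the stopping criterion are our own illustrative assumptions, not the algorithm analyzed in the paper.

```python
import numpy as np

def local_search_ip(X, labels, beta=2.0, max_iters=100):
    """Repeatedly move any point whose average in-cluster distance
    exceeds beta times its average distance to some other cluster into
    its most attractive cluster; stop when no point wants to move."""
    X, labels = np.asarray(X, dtype=float), np.asarray(labels).copy()
    n = len(X)
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    for _ in range(max_iters):
        moved = False
        for i in range(n):
            not_i = np.arange(n) != i
            # Average distance from point i to every nonempty cluster.
            avg = {c: dists[i][(labels == c) & not_i].mean()
                   for c in np.unique(labels)
                   if ((labels == c) & not_i).any()}
            if labels[i] not in avg or len(avg) < 2:
                continue
            best = min(avg, key=avg.get)
            if avg[labels[i]] > beta * avg[best]:
                labels[i] = best  # move the violating point
                moved = True
        if not moved:
            break  # every point is now beta-IP stable
    return labels
```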
arXiv Detail & Related papers (2024-03-15T14:58:27Z)
- Dependent Cluster Mapping (DCMAP): Optimal clustering of directed acyclic graphs for statistical inference [0.0]
A Directed Acyclic Graph (DAG) can be partitioned or mapped into clusters to support inference.
We propose a novel algorithm called DCMAP for optimal cluster mapping with dependent clusters.
For 25- and 50-node DBNs, the search spaces contained $9.91\times 10^{9}$ and $1.51\times 10^{21}$ possible cluster mappings, respectively, and the first optimal solution was found at iteration 934 (95% CI 926, 971).
arXiv Detail & Related papers (2023-08-08T01:01:37Z)
- Differentially Private Clustering in Data Streams [65.78882209673885]
We present a differentially private streaming clustering framework which only requires an offline DP coreset or clustering algorithm as a blackbox.
Our framework is also differentially private under the continual release setting, i.e., the union of outputs of our algorithms at every timestamp is always differentially private.
arXiv Detail & Related papers (2023-07-14T16:11:22Z)
- Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels.
We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
- Differentially-Private Hierarchical Clustering with Provable Approximation Guarantees [79.59010418610625]
We study differentially private approximation algorithms for hierarchical clustering.
We show strong lower bounds for the problem: any $\epsilon$-DP algorithm must exhibit $O(|V|^2/\epsilon)$-additive error for an input dataset $V$.
We propose a private $(1+o(1))$-approximation algorithm which also recovers the blocks exactly.
arXiv Detail & Related papers (2023-01-31T19:14:30Z)
- Individual Preference Stability for Clustering [24.301650646957246]
We propose a natural notion of individual preference (IP) stability for clustering.
Our notion asks that every data point is, on average, closer to the points in its own cluster than to the points in any other cluster.
arXiv Detail & Related papers (2022-07-07T22:01:01Z)
- Differentially-Private Clustering of Easy Instances [67.04951703461657]
In differentially private clustering, the goal is to identify $k$ cluster centers without disclosing information on individual data points.
We give implementable differentially private clustering algorithms that provide utility when the data is "easy".
We propose a framework that allows us to apply non-private clustering algorithms to the easy instances and privately combine the results.
arXiv Detail & Related papers (2021-12-29T08:13:56Z)
- Fuzzy Clustering with Similarity Queries [56.96625809888241]
The fuzzy or soft objective is a popular generalization of the well-known $k$-means problem.
We show that by making a few such queries, the problem becomes easier to solve.
arXiv Detail & Related papers (2021-06-04T02:32:26Z)
- K-expectiles clustering [0.0]
We propose a novel partitioning clustering algorithm based on expectiles.
We suggest two schemes: fixed $\tau$ clustering and adaptive $\tau$ clustering (a sketch of the underlying expectile computation appears below).
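For reference, the $\tau$-expectile of a sample minimizes an asymmetric squared loss and plays the role that the mean plays in $k$-means ($\tau = 0.5$ recovers the mean). The helper below, using a standard reweighting fixed-point iteration, is an illustrative sketch rather than code from the paper:

```python
import numpy as np

def expectile(x, tau, tol=1e-10, max_iters=200):
    """tau-expectile of a 1-D sample: the minimizer e of
    sum_i |tau - 1[x_i <= e]| * (x_i - e)**2, found by iteratively
    reweighted averaging (tau = 0.5 returns the ordinary mean)."""
    x = np.asarray(x, dtype=float)
    e = x.mean()
    for _ in range(max_iters):
        w = np.where(x > e, tau, 1.0 - tau)  # asymmetric weights
        e_new = np.sum(w * x) / np.sum(w)
        if abs(e_new - e) < tol:
            break
        e = e_new
    return e
```

A k-expectiles assignment step would then compare points against per-cluster expectiles instead of means; as the names suggest, the fixed-$\tau$ scheme presumably uses a preset $\tau$ throughout, while the adaptive scheme tunes it.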
arXiv Detail & Related papers (2021-03-16T21:14:56Z)
- Stable and consistent density-based clustering via multiparameter persistence [77.34726150561087]
We consider the degree-Rips construction from topological data analysis.
We analyze its stability to perturbations of the input data using the correspondence-interleaving distance.
We integrate these methods into a pipeline for density-based clustering, which we call Persistable.
arXiv Detail & Related papers (2020-05-18T19:45:04Z)
- Explainable $k$-Means and $k$-Medians Clustering [25.513261099927163]
We consider using a small decision tree to partition a data set into clusters, so that clusters can be characterized in a straightforward manner.
We show that popular top-down decision tree algorithms may lead to clusterings with arbitrarily large cost.
We design an efficient algorithm that produces explainable clusters using a tree with $k$ leaves (a bare-bones sketch of the threshold-tree idea appears below).
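As a rough illustration of how a tree with $k$ leaves can define clusters, the sketch below recursively splits a set of precomputed, distinct cluster centers with axis-aligned cuts until each leaf holds one center. This is our own simplified construction; the paper's algorithm chooses each cut more carefully so as to bound the resulting clustering cost.

```python
import numpy as np

def build_threshold_tree(centers):
    """Recursively split distinct cluster centers with axis-aligned cuts
    until every leaf holds exactly one center (k leaves for k centers)."""
    centers = np.asarray(centers, dtype=float)
    if len(centers) == 1:
        return {"center": centers[0]}
    # Cut along the coordinate where the centers are most spread out,
    # at the midpoint of the extreme center values on that coordinate.
    dim = int(np.argmax(centers.max(axis=0) - centers.min(axis=0)))
    theta = (centers[:, dim].min() + centers[:, dim].max()) / 2.0
    return {"dim": dim, "theta": theta,
            "left": build_threshold_tree(centers[centers[:, dim] <= theta]),
            "right": build_threshold_tree(centers[centers[:, dim] > theta])}

def assign(tree, x):
    """Route a point down the tree; the leaf's center is its cluster."""
    while "center" not in tree:
        tree = tree["left"] if x[tree["dim"]] <= tree["theta"] else tree["right"]
    return tree["center"]
```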
arXiv Detail & Related papers (2020-02-28T04:21:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.