Partial Optimality in Cubic Correlation Clustering for General Graphs
- URL: http://arxiv.org/abs/2510.20431v1
- Date: Thu, 23 Oct 2025 11:07:29 GMT
- Title: Partial Optimality in Cubic Correlation Clustering for General Graphs
- Authors: David Stein, Bjoern Andres, Silvia Di Gregorio,
- Abstract summary: The higher-order correlation clustering problem for a graph $G$ and costs associated with cliques of $G$ consists in finding a clustering of $G$ so as to minimize the sum of the costs of those cliques whose nodes all belong to the same cluster.<n>To tackle this NP-hard problem in practice, local searchs have been proposed and studied in the context of applications.<n>Here, we establish partial optimality conditions for cubic correlation clustering, i.e., for the special case of at most 3-cliques.
- Score: 10.539905433361811
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The higher-order correlation clustering problem for a graph $G$ and costs associated with cliques of $G$ consists in finding a clustering of $G$ so as to minimize the sum of the costs of those cliques whose nodes all belong to the same cluster. To tackle this NP-hard problem in practice, local search heuristics have been proposed and studied in the context of applications. Here, we establish partial optimality conditions for cubic correlation clustering, i.e., for the special case of at most 3-cliques. We define and implement algorithms for deciding these conditions and examine their effectiveness numerically, on two data sets.
Related papers
- Almost Asymptotically Optimal Active Clustering Through Pairwise Observations [59.20614082241528]
We propose a new analysis framework for clustering $M$ items into an unknown number of $K$ distinct groups using noisy and actively collected responses.<n>We establish a fundamental lower bound on the expected number of queries needed to achieve a desired confidence in the accuracy of the clustering.<n>We develop a computationally feasible variant of the Generalized Likelihood Ratio statistic and show that its performance gap to the lower bound can be accurately empirically estimated.
arXiv Detail & Related papers (2026-02-05T14:16:47Z) - Fair Clustering for Data Summarization: Improved Approximation Algorithms and Complexity Insights [16.120911591795295]
In some applications all data points can be chosen as centers, in the general setting, centers must be chosen from a subset of points, referred as facilities or suppliers.
In this work, we focus on fair data summarization modeled as the fair $k$-supplier problem, where data consists of several groups, and a minimum number of centers must be selected from each group.
arXiv Detail & Related papers (2024-10-16T18:00:19Z) - Evolution of $K$-means solution landscapes with the addition of dataset
outliers and a robust clustering comparison measure for their analysis [0.0]
We use the energy landscape approach to map the change in $K$-means solution space as a result of increasing dataset outliers.
Kinetic analysis reveals that in all cases the overall funnel is composed of shallow locally-funnelled regions.
We propose that the rates obtained from kinetic analysis provide a novel measure of clustering similarity.
arXiv Detail & Related papers (2023-06-25T21:22:21Z) - Revisiting Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [85.51611950757643]
We propose IAC (Instance-Adaptive Clustering), the first algorithm whose performance matches the instance-specific lower bounds both in expectation and with high probability.<n>IAC maintains an overall computational complexity of $ mathcalO(n, textpolylog(n) $, making it scalable and practical for large-scale problems.
arXiv Detail & Related papers (2023-06-18T08:46:06Z) - Partial Optimality in Cubic Correlation Clustering [6.372499127940625]
Certifying optimality is NP-hard and practically hampered already by the complexity of the problem statement.
Here, we focus on establishing partial optimality conditions for the special case of complete graphs and cubic objective functions.
In addition, we define and implement algorithms for testing these conditions and examine their effect numerically, on two datasets.
arXiv Detail & Related papers (2023-02-09T15:25:52Z) - Near-Optimal Correlation Clustering with Privacy [37.94795032297396]
Correlation clustering is a central problem in unsupervised learning.
In this paper, we introduce a simple and computationally efficient algorithm for the correlation clustering problem with provable privacy guarantees.
arXiv Detail & Related papers (2022-03-02T22:30:19Z) - Differentially-Private Clustering of Easy Instances [67.04951703461657]
In differentially private clustering, the goal is to identify $k$ cluster centers without disclosing information on individual data points.
We provide implementable differentially private clustering algorithms that provide utility when the data is "easy"
We propose a framework that allows us to apply non-private clustering algorithms to the easy instances and privately combine the results.
arXiv Detail & Related papers (2021-12-29T08:13:56Z) - Solving correlation clustering with QAOA and a Rydberg qudit system: a
full-stack approach [94.37521840642141]
We study the correlation clustering problem using the quantum approximate optimization algorithm (QAOA) and qudits.
Specifically, we consider a neutral atom quantum computer and propose a full stack approach for correlation clustering.
We show the qudit implementation is superior to the qubit encoding as quantified by the gate count.
arXiv Detail & Related papers (2021-06-22T11:07:38Z) - Fuzzy Clustering with Similarity Queries [56.96625809888241]
The fuzzy or soft objective is a popular generalization of the well-known $k$-means problem.
We show that by making few queries, the problem becomes easier to solve.
arXiv Detail & Related papers (2021-06-04T02:32:26Z) - Exact and Approximate Hierarchical Clustering Using A* [51.187990314731344]
We introduce a new approach based on A* search for clustering.
We overcome the prohibitively large search space by combining A* with a novel emphtrellis data structure.
We empirically demonstrate that our method achieves substantially higher quality results than baselines for a particle physics use case and other clustering benchmarks.
arXiv Detail & Related papers (2021-04-14T18:15:27Z) - Efficient Algorithms for Generating Provably Near-Optimal Cluster
Descriptors for Explainability [36.11663695534294]
We study the problem of making clusters more interpretable by extending a recent approach of constructing succinct representations for clusters.
We develop approximation algorithms with provable performance guarantees for the problem.
We also show applications to explain clusters from datasets, including clusters of genomic sequences that represent different threat levels.
arXiv Detail & Related papers (2020-02-06T19:49:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.