Related papers: Improved Algorithms for Overlapping and Robust Clustering of Edge-Colored Hypergraphs: An LP-Based Combinatorial Approach

URL: http://arxiv.org/abs/2505.18043v1
Date: Fri, 23 May 2025 15:46:16 GMT
Title: Improved Algorithms for Overlapping and Robust Clustering of Edge-Colored Hypergraphs: An LP-Based Combinatorial Approach
Authors: Changyeol Lee, Yongho Shin, Hyung-Chan An
Abstract summary: Edge-colored clustering (ECC) has emerged as a useful approach for handling categorical data. To address the limitations of traditional ECC, three versions of ECC have been studied. We present an algorithmic framework that combines the strengths of LP with the computational efficiency of combinatorial algorithms.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Clustering is a fundamental task in both machine learning and data mining. Among various methods, edge-colored clustering (ECC) has emerged as a useful approach for handling categorical data. Given a hypergraph with (hyper)edges labeled by colors, ECC aims to assign vertex colors to minimize the number of edges where the vertex color differs from the edge's color. However, traditional ECC has inherent limitations, as it enforces a nonoverlapping and exhaustive clustering. To tackle these limitations, three versions of ECC have been studied: Local ECC and Global ECC, which allow overlapping clusters, and Robust ECC, which accounts for vertex outliers. For these problems, both linear programming (LP) rounding algorithms and greedy combinatorial algorithms have been proposed. While these LP-rounding algorithms provide high-quality solutions, they demand substantial computation time; the greedy algorithms, on the other hand, run very fast but often compromise solution quality. In this paper, we present an algorithmic framework that combines the strengths of LP with the computational efficiency of combinatorial algorithms. Both experimental and theoretical analyses show that our algorithms efficiently produce high-quality solutions for all three problems: Local, Global, and Robust ECC. We complement our algorithmic contributions with complexity-theoretic inapproximability results and integrality gap bounds, which suggest that significant theoretical improvements are unlikely. Our results also answer two open questions previously raised in the literature.
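The ECC objective described in the abstract can be made concrete with a short sketch. The code below is illustrative only and is not the paper's LP-based framework: the function names `ecc_mistakes` and `majority_vote_coloring`, the toy hypergraph, and the majority-vote heuristic are all assumptions introduced here to show how the number of unsatisfied edges is counted for a given vertex coloring.

```python
from collections import Counter

def ecc_mistakes(edges, vertex_color):
    """Count edges containing at least one vertex whose color differs from the edge's color."""
    return sum(
        1
        for nodes, color in edges
        if any(vertex_color.get(v) != color for v in nodes)
    )

def majority_vote_coloring(edges):
    """Hypothetical baseline: give each vertex the most frequent color among its incident edges."""
    counts = {}
    for nodes, color in edges:
        for v in nodes:
            counts.setdefault(v, Counter())[color] += 1
    return {v: c.most_common(1)[0][0] for v, c in counts.items()}

# Toy edge-colored hypergraph (assumed data): each entry is (vertices, edge color).
edges = [
    (("a", "b", "c"), "red"),
    (("a", "b"), "red"),
    (("c", "d"), "blue"),
    (("d", "e"), "blue"),
    (("c", "d", "e"), "blue"),
]

coloring = majority_vote_coloring(edges)
print(coloring)
print(ecc_mistakes(edges, coloring))
```

In this toy instance vertex `c` lies in one red edge and two blue edges, so any single color for `c` leaves at least one edge unsatisfied. Nonoverlapping ECC forces this kind of trade-off, and the overlapping variants named in the abstract (Local and Global ECC) are meant to relax it.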

Related papers

Exact and Heuristic Algorithms for Constrained Biclustering [0.0]
Biclustering, also known as co-clustering or two-way clustering, simultaneously partitions the rows and columns of a data matrix to reveal submatrices with coherent patterns. We study constrained biclustering with pairwise constraints, namely must-link and cannot-link constraints, which specify whether objects should belong to the same or different biclusters.
arXiv Detail & Related papers (2025-08-07T15:29:22Z)
Accelerating Spectral Clustering under Fairness Constraints [56.865810822418744]
We present a new efficient method for fair spectral clustering (Fair SC) by casting the Fair SC problem within the difference of convex functions (DC) framework. We show that each associated subproblem can be solved efficiently, resulting in higher computational efficiency compared to prior work.
arXiv Detail & Related papers (2025-06-09T18:46:27Z)
An island-parallel ensemble metaheuristic algorithm for large graph coloring problems [0.4915744683251149]
We propose a new island-parallel ensemble metaheuristic algorithm (PEM-Color) to solve large GCP instances. To the best of our knowledge, this is the first study that combines metaheuristics and applies them to the GCP using an ensemble approach.
arXiv Detail & Related papers (2025-04-21T13:15:23Z)
Tree-Guided $L_1$-Convex Clustering [1.0589208420411012]
We develop a novel convex clustering algorithm called Tree-Guided $L_1$-Convex Clustering (TGCC). We develop an efficient cluster fusion algorithm that utilizes the tree of the weights to accelerate the optimization process. Remarkably, our TGCC algorithm can construct a complete dendrogram in $\mathbb{R}^2$ [...] seconds on a standard laptop.
arXiv Detail & Related papers (2025-03-31T12:39:48Z)
A Greedy Strategy for Graph Cut [95.2841574410968]
We propose a greedy strategy to solve the problem of Graph Cut, called GGC. It starts from the state where each data sample is regarded as a cluster and dynamically merges the two clusters. GGC has a nearly linear computational complexity with respect to the number of samples.
arXiv Detail & Related papers (2024-12-28T05:49:42Z)
Faster Optimal Coalition Structure Generation via Offline Coalition Selection and Graph-Based Search [61.08720171136229]
We present a novel algorithm, SMART, for the problem based on a hybridization of three innovative techniques. Two of these techniques are based on dynamic programming, where we show a powerful connection between the coalitions selected for evaluation and the performance of the algorithms. Our techniques bring a new way of approaching the problem and a new level of precision to the field.
arXiv Detail & Related papers (2024-07-22T23:24:03Z)
Accelerating Cutting-Plane Algorithms via Reinforcement Learning Surrogates [49.84541884653309]
A current standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms. Despite the existence of a number of general-purpose cut-generating algorithms, large-scale discrete optimization problems continue to suffer from intractability. We propose a method for accelerating cutting-plane algorithms via reinforcement learning.
arXiv Detail & Related papers (2023-07-17T20:11:56Z)
Efficient Convex Algorithms for Universal Kernel Learning [46.573275307034336]
An ideal set of kernels should: admit a linear parameterization (for tractability); be dense in the set of all kernels (for accuracy). Previous algorithms for optimization of kernels were limited to classification and relied on computationally complex Semidefinite Programming (SDP) algorithms. We propose an SVD-QCQP algorithm which dramatically reduces the computational complexity as compared with previous SDP-based approaches.
arXiv Detail & Related papers (2023-04-15T04:57:37Z)
Randomized Greedy Algorithms and Composable Coreset for k-Center Clustering with Outliers [11.546734084378683]
The presence of outliers can significantly increase the computational complexity. Our idea is inspired by the greedy method that was developed for solving the ordinary $k$-center clustering problem.
arXiv Detail & Related papers (2023-01-07T09:26:01Z)
An algorithm for clustering with confidence-based must-link and cannot-link constraints [0.0]
We introduce the PCCC (Pairwise-Confidence-Constraints-Clustering) algorithm. The PCCC algorithm iteratively assigns objects to clusters while accounting for the information provided on the pairs of objects. Unlike existing algorithms, our algorithm scales to large-scale instances with up to 60,000 objects, 100 clusters, and millions of cannot-link constraints.
arXiv Detail & Related papers (2022-12-29T19:21:33Z)
A Penalty-Based Method for Communication-Efficient Decentralized Bilevel Programming [14.35928967799696]
This paper introduces a penalty function-based decentralized algorithm for solving bilevel programming problems over a decentralized network. A key feature of the proposed algorithm is the estimation of the hyper-gradient of the penalty function. Our theoretical framework ensures non-asymptotic convergence to the optimal solution of the original problem under various convexity conditions.
arXiv Detail & Related papers (2022-11-08T08:39:30Z)
A Bi-Level Framework for Learning to Solve Combinatorial Optimization on Graphs [91.07247251502564]
We propose a hybrid approach to combine the best of the two worlds, in which a bi-level framework is developed with an upper-level learning method to optimize the graph. Such a bi-level approach simplifies the learning on the original hard CO and can effectively mitigate the demand for model capacity.
arXiv Detail & Related papers (2021-06-09T09:18:18Z)
Differentially Private Clustering: Tight Approximation Ratios [57.89473217052714]
We give efficient differentially private algorithms for basic clustering problems. Our results imply an improved algorithm for the Sample and Aggregate privacy framework. One of the tools used in our 1-Cluster algorithm can be employed to get a faster quantum algorithm for ClosestPair in a moderate number of dimensions.
arXiv Detail & Related papers (2020-08-18T16:22:06Z)

This list is automatically generated from the titles and abstracts of the papers on this site.