Semi-Supervised Clustering via Structural Entropy with Different
Constraints
- URL: http://arxiv.org/abs/2312.10917v1
- Date: Mon, 18 Dec 2023 04:00:40 GMT
- Title: Semi-Supervised Clustering via Structural Entropy with Different
Constraints
- Authors: Guangjie Zeng, Hao Peng, Angsheng Li, Zhiwei Liu, Runze Yang, Chunyang
Liu, Lifang He
- Abstract summary: We present Semi-supervised clustering via Structural Entropy (SSE), a novel method that can incorporate different types of constraints from diverse sources to perform both partitioning and hierarchical clustering.
We evaluate SSE on nine clustering datasets and compare it with eleven semi-supervised partitioning and hierarchical clustering methods.
- Score: 30.215985625884922
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Semi-supervised clustering techniques have emerged as valuable tools for
leveraging prior information in the form of constraints to improve the quality
of clustering outcomes. Despite the proliferation of such methods, the ability
to seamlessly integrate various types of constraints remains limited. While
structural entropy has proven to be a powerful clustering approach with
wide-ranging applications, it has lacked a variant capable of accommodating
these constraints. In this work, we present Semi-supervised clustering via
Structural Entropy (SSE), a novel method that can incorporate different types
of constraints from diverse sources to perform both partitioning and
hierarchical clustering. Specifically, we formulate a uniform view for the
commonly used pairwise and label constraints for both types of clustering.
Then, we design objectives that incorporate these constraints into structural
entropy and develop tailored algorithms for their optimization. We evaluate SSE
on nine clustering datasets and compare it with eleven semi-supervised
partitioning and hierarchical clustering methods. Experimental results
demonstrate the superiority of SSE on clustering accuracy with different types
of constraints. Additionally, the functionality of SSE for biological data
analysis is demonstrated by cell clustering experiments conducted on four
single-cell RNAseq datasets.
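As a rough illustration of the quantities the abstract refers to, the sketch below computes the standard two-dimensional structural entropy of a graph partition and expands label constraints into pairwise must-link and cannot-link constraints. It is a minimal sketch under stated assumptions, not the SSE objective or algorithm from the paper: the function names (`structural_entropy_2d`, `labels_to_pairwise`, `count_violations`) are hypothetical, edge weights are assumed positive, and expanding labels into pairs is only one common way to put the two constraint types on the same footing.

```python
import math
from itertools import combinations


def structural_entropy_2d(edges, partition):
    """Two-dimensional structural entropy of an undirected weighted graph.

    edges: iterable of (u, v, weight) with positive weights (assumption).
    partition: dict mapping each node to a cluster id.
    """
    edges = list(edges)
    degree = {}
    for u, v, w in edges:
        degree[u] = degree.get(u, 0.0) + w
        degree[v] = degree.get(v, 0.0) + w
    total_volume = sum(degree.values())  # vol(G): sum of all degrees

    volume = {}  # sum of node degrees inside each cluster
    for node, d in degree.items():
        c = partition[node]
        volume[c] = volume.get(c, 0.0) + d

    cut = {}  # total weight of edges leaving each cluster
    for u, v, w in edges:
        cu, cv = partition[u], partition[v]
        if cu != cv:
            cut[cu] = cut.get(cu, 0.0) + w
            cut[cv] = cut.get(cv, 0.0) + w

    entropy = 0.0
    # Node-level term: uncertainty of a node within its cluster.
    for node, d in degree.items():
        entropy -= d / total_volume * math.log2(d / volume[partition[node]])
    # Cluster-level term: uncertainty of entering each cluster.
    for c, vol_c in volume.items():
        entropy -= cut.get(c, 0.0) / total_volume * math.log2(vol_c / total_volume)
    return entropy


def labels_to_pairwise(labels):
    """Expand label constraints {node: label} into pairwise constraints."""
    must_link, cannot_link = [], []
    for a, b in combinations(sorted(labels), 2):
        if labels[a] == labels[b]:
            must_link.append((a, b))
        else:
            cannot_link.append((a, b))
    return must_link, cannot_link


def count_violations(partition, must_link, cannot_link):
    """Count pairwise constraints that a partition fails to satisfy."""
    broken_must = sum(partition[a] != partition[b] for a, b in must_link)
    broken_cannot = sum(partition[a] == partition[b] for a, b in cannot_link)
    return broken_must + broken_cannot
```

On a graph with clear community structure, a partition that follows the communities should yield lower structural entropy than one that cuts across them, and the violation count gives a simple way to check a candidate partition against the supplied constraints.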
Related papers
- Self-Supervised Graph Embedding Clustering [70.36328717683297]
The K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks.
We propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z)
- A3S: A General Active Clustering Method with Pairwise Constraints [66.74627463101837]
A3S features strategic active clustering adjustment on the initial cluster result, which is obtained by an adaptive clustering algorithm.
In extensive experiments across diverse real-world datasets, A3S achieves desired results with significantly fewer human queries.
arXiv Detail & Related papers (2024-07-14T13:37:03Z)
- Towards Explainable Clustering: A Constrained Declarative based Approach [0.294944680995069]
We aim at finding a clustering that has high quality in terms of classic clustering criteria and that is explainable.
A good global explanation of a clustering should give the characteristics of each cluster taking into account their abilities to describe its objects.
We propose a novel interpretable constrained method called ECS for declarative computation with Explainability-driven Selection.
arXiv Detail & Related papers (2024-03-26T21:00:06Z)
- Memetic Differential Evolution Methods for Semi-Supervised Clustering [0.8681835475119588]
We propose an extension of MDEClust for semi-supervised Minimum Sum-of-Squares Clustering (MSSC) problems.
Our new framework, called S-MDEClust, represents the first memetic methodology designed to generate an optimal feasible solution.
arXiv Detail & Related papers (2024-03-07T08:37:36Z)
- Single-cell Multi-view Clustering via Community Detection with Unknown Number of Clusters [64.31109141089598]
We introduce scUNC, an innovative multi-view clustering approach tailored for single-cell data.
scUNC seamlessly integrates information from different views without the need for a predefined number of clusters.
We conducted a comprehensive evaluation of scUNC using three distinct single-cell datasets.
arXiv Detail & Related papers (2023-11-28T08:34:58Z)
- Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels.
We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
- Multi-View Clustering via Semi-non-negative Tensor Factorization [120.87318230985653]
We develop a novel multi-view clustering method based on semi-non-negative tensor factorization (Semi-NTF).
Our model directly considers the between-view relationship and exploits the between-view complementary information.
In addition, we provide an optimization algorithm for the proposed method and prove mathematically that the algorithm always converges to the stationary KKT point.
arXiv Detail & Related papers (2023-03-29T14:54:19Z)
- Interpretable Clustering via Multi-Polytope Machines [12.69310440882225]
We propose a novel approach for interpretable clustering that both clusters data points and constructs polytopes around the discovered clusters to explain them.
We benchmark our approach on a suite of synthetic and real-world clustering problems, where our algorithm outperforms state-of-the-art interpretable and non-interpretable clustering algorithms.
arXiv Detail & Related papers (2021-12-10T16:36:32Z)
- Deep Conditional Gaussian Mixture Model for Constrained Clustering [7.070883800886882]
Constrained clustering can leverage prior information on a growing amount of only partially labeled data.
We propose a novel framework for constrained clustering that is intuitive, interpretable, and can be trained efficiently in the framework of gradient variational inference.
arXiv Detail & Related papers (2021-06-11T13:38:09Z)
- A Multi-disciplinary Ensemble Algorithm for Clustering Heterogeneous Datasets [0.76146285961466]
We propose a new evolutionary clustering algorithm (ECAStar) based on social class ranking and meta-heuristic algorithms.
ECAStar is integrated with recombinational evolutionary operators, Levy flight optimisation, and some statistical techniques.
Experiments are conducted to evaluate the ECAStar against five conventional approaches.
arXiv Detail & Related papers (2021-01-01T07:20:50Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)