LSEC: Large-scale spectral ensemble clustering
- URL: http://arxiv.org/abs/2106.09852v1
- Date: Fri, 18 Jun 2021 00:42:03 GMT
- Title: LSEC: Large-scale spectral ensemble clustering
- Authors: Hongmin Li, Xiucai Ye, Akira Imakura and Tetsuya Sakurai
- Abstract summary: We propose a large-scale spectral ensemble clustering (LSEC) method to strike a good balance between efficiency and effectiveness.
The LSEC method achieves a lower computational complexity than most existing ensemble clustering methods.
- Score: 8.545202841051582
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ensemble clustering is a fundamental problem in the machine learning field,
combining multiple base clusterings into a better clustering result. However,
most of the existing methods are unsuitable for large-scale ensemble clustering
tasks due to the efficiency bottleneck. In this paper, we propose a large-scale
spectral ensemble clustering (LSEC) method to strike a good balance between
efficiency and effectiveness. In LSEC, an efficient ensemble-generation
framework based on large-scale spectral clustering is designed to generate
diverse base clusterings at low computational cost. All base clusterings are
then combined, through a bipartite-graph-partition-based consensus function,
into a better consensus clustering result. The LSEC method achieves lower
computational complexity than most existing ensemble clustering methods.
Experiments conducted on ten large-scale datasets show the efficiency and
effectiveness of the LSEC method. The MATLAB code of the proposed method and
experimental datasets are available at https://github.com/Li-Hongmin/MyPaperWithCode.
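The abstract describes a two-stage pipeline: generate many base clusterings cheaply, then fuse them by partitioning a bipartite graph between samples and base clusters. Below is a minimal Python sketch of that overall shape, not the authors' MATLAB implementation: cheap k-means runs stand in for the paper's spectral-clustering-based ensemble generator, and the function names and hyperparameters (n_base, k_range) are illustrative assumptions.

```python
# Minimal sketch of the LSEC pipeline shape (illustrative, not the
# authors' MATLAB code): cheap k-means runs stand in for the paper's
# spectral-clustering-based ensemble generator.
import numpy as np
from scipy.sparse import csr_matrix, diags
from scipy.sparse.linalg import svds
from sklearn.cluster import KMeans

def base_clusterings(X, n_base=10, k_range=(5, 20), seed=0):
    """Stage 1: generate diverse base clusterings (hyperparameters are
    illustrative assumptions, not the paper's settings)."""
    rng = np.random.default_rng(seed)
    out = []
    for _ in range(n_base):
        k = int(rng.integers(k_range[0], k_range[1] + 1))
        km = KMeans(n_clusters=k, n_init=1,
                    random_state=int(rng.integers(2**31 - 1)))
        out.append(km.fit_predict(X))
    return out

def bipartite_consensus(labels_list, n_clusters):
    """Stage 2: fuse base clusterings by spectrally partitioning a
    bipartite graph between the n samples and all base clusters."""
    n = len(labels_list[0])
    cols, offset = [], 0
    for labels in labels_list:
        cols.append(np.asarray(labels) + offset)   # shift cluster ids
        offset += int(np.max(labels)) + 1
    rows = np.tile(np.arange(n), len(labels_list))
    B = csr_matrix((np.ones(rows.size), (rows, np.concatenate(cols))),
                   shape=(n, offset))              # sample-to-cluster edges
    # Symmetrically normalize the bipartite adjacency; a truncated SVD of
    # the thin sparse matrix gives the spectral embedding cheaply.
    d1 = np.asarray(B.sum(axis=1)).ravel()
    d2 = np.asarray(B.sum(axis=0)).ravel()
    Bn = diags(1.0 / np.sqrt(d1)) @ B @ diags(1.0 / np.sqrt(d2 + 1e-12))
    U, _, _ = svds(Bn, k=n_clusters)
    U /= np.linalg.norm(U, axis=1, keepdims=True) + 1e-12
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(U)

# Example: consensus labels for an (n, d) data matrix X.
# X = np.random.default_rng(0).normal(size=(5000, 8))
# labels = bipartite_consensus(base_clusterings(X), n_clusters=8)
```

The key cost saving in the consensus step is that the bipartite graph is an n-by-(total base clusters) sparse matrix, so its truncated SVD avoids ever forming an n-by-n co-association matrix.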
Related papers
- Scalable Co-Clustering for Large-Scale Data through Dynamic Partitioning and Hierarchical Merging [7.106620444966807]
Co-clustering simultaneously clusters rows and columns, revealing more fine-grained groups.
Existing co-clustering methods suffer from poor scalability and cannot handle large-scale data.
This paper presents a novel and scalable co-clustering method designed to uncover intricate patterns in high-dimensional, large-scale datasets.
arXiv Detail & Related papers (2024-10-09T04:47:22Z)
- A3S: A General Active Clustering Method with Pairwise Constraints [66.74627463101837]
A3S performs strategic active adjustment of an initial clustering result obtained by an adaptive clustering algorithm.
In extensive experiments across diverse real-world datasets, A3S achieves desired results with significantly fewer human queries.
arXiv Detail & Related papers (2024-07-14T13:37:03Z)
- One-Step Late Fusion Multi-view Clustering with Compressed Subspace [29.02032034647922]
We propose an integrated framework named One-Step Late Fusion Multi-view Clustering with Compressed Subspace (OS-LFMVC-CS).
We use the consensus subspace to align the partition matrix while optimizing the partition fusion, and utilize the fused partition matrix to guide the learning of discrete labels.
arXiv Detail & Related papers (2024-01-03T06:18:30Z)
- Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels.
We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches the instance-specific lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
- DivClust: Controlling Diversity in Deep Clustering [47.85350249697335]
DivClust produces consensus clustering solutions that consistently outperform single-clustering baselines.
Our method effectively controls diversity across frameworks and datasets with very small additional computational cost.
arXiv Detail & Related papers (2023-04-03T14:45:43Z)
- Convex Clustering through MM: An Efficient Algorithm to Perform Hierarchical Clustering [1.0589208420411012]
We propose convex clustering through majorization-minimization (CCMM), an iterative algorithm that uses cluster fusions and a highly efficient updating scheme.
With a current desktop computer, CCMM efficiently solves convex clustering problems featuring over one million objects in seven-dimensional space.
arXiv Detail & Related papers (2022-11-03T15:07:51Z)
- Late Fusion Multi-view Clustering via Global and Local Alignment Maximization [61.89218392703043]
Multi-view clustering (MVC) optimally integrates complementary information from different views to improve clustering performance.
Most existing approaches directly fuse multiple pre-specified similarities to learn an optimal similarity matrix for clustering.
We propose late fusion MVC via alignment to address these issues.
arXiv Detail & Related papers (2022-08-02T01:49:31Z)
- DeepCluE: Enhanced Image Clustering via Multi-layer Ensembles in Deep Neural Networks [53.88811980967342]
This paper presents a Deep Clustering via Ensembles (DeepCluE) approach.
It bridges the gap between deep clustering and ensemble clustering by harnessing the power of multiple layers in deep neural networks.
Experimental results on six image datasets confirm the advantages of DeepCluE over the state-of-the-art deep clustering approaches.
arXiv Detail & Related papers (2022-06-01T09:51:38Z)
- Very Compact Clusters with Structural Regularization via Similarity and Connectivity [3.779514860341336]
We propose Very Compact Clusters (VCC), an end-to-end deep clustering algorithm for general datasets.
Our proposed approach achieves better clustering performance than most state-of-the-art clustering methods.
arXiv Detail & Related papers (2021-06-09T23:22:03Z)
- Divide-and-conquer based Large-Scale Spectral Clustering [8.545202841051582]
We propose a divide-and-conquer based large-scale spectral clustering method to strike a good balance between efficiency and effectiveness.
The proposed method achieves lower computational complexity than most existing large-scale spectral clustering methods (see the landmark-style sketch after this list).
arXiv Detail & Related papers (2021-04-30T15:09:45Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
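Several entries above, notably the divide-and-conquer spectral clustering paper and LSEC's own generation stage, make spectral clustering scale by comparing points to a small set of sampled landmarks rather than to every other point. The sketch below is a generic landmark-based approximation of that idea, not any single paper's algorithm; n_landmarks and gamma are illustrative assumptions.

```python
# Generic landmark-based approximation of large-scale spectral clustering
# (an illustration of the family, not any single paper's algorithm;
# n_landmarks and gamma are illustrative assumptions).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import rbf_kernel

def landmark_spectral(X, n_clusters, n_landmarks=200, gamma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    # Compare every point to m << n sampled landmarks instead of to all
    # other points, so the affinity matrix is n x m rather than n x n.
    idx = rng.choice(len(X), size=min(n_landmarks, len(X)), replace=False)
    A = rbf_kernel(X, X[idx], gamma=gamma)
    A /= A.sum(axis=1, keepdims=True)          # row-stochastic affinities
    # The thin SVD of the tall n x m matrix plays the role of the
    # eigendecomposition in standard spectral clustering.
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    emb = U[:, :n_clusters]
    emb /= np.linalg.norm(emb, axis=1, keepdims=True) + 1e-12
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(emb)
```

Replacing the full n-by-n affinity matrix with an n-by-m landmark matrix is, broadly, where the complexity reductions claimed by these papers come from.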