Fast Kernel k-means Clustering Using Incomplete Cholesky Factorization
- URL: http://arxiv.org/abs/2002.02846v1
- Date: Fri, 7 Feb 2020 15:32:14 GMT
- Title: Fast Kernel k-means Clustering Using Incomplete Cholesky Factorization
- Authors: Li Chen, Shuisheng Zhou, Jiajun Ma
- Abstract summary: Kernel-based clustering algorithm can identify and capture the non-linear structure in datasets.
It can achieve better performance than linear clustering.
computing and storing the entire kernel matrix occupy so large memory that it is difficult for kernel-based clustering to deal with large-scale datasets.
- Score: 11.631064399465089
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Kernel-based clustering algorithm can identify and capture the non-linear
structure in datasets, and thereby it can achieve better performance than
linear clustering. However, computing and storing the entire kernel matrix
occupy so large memory that it is difficult for kernel-based clustering to deal
with large-scale datasets. In this paper, we employ incomplete Cholesky
factorization to accelerate kernel clustering and save memory space. The key
idea of the proposed kernel $k$-means clustering using incomplete Cholesky
factorization is that we approximate the entire kernel matrix by the product of
a low-rank matrix and its transposition. Then linear $k$-means clustering is
applied to columns of the transpose of the low-rank matrix. We show both
analytically and empirically that the performance of the proposed algorithm is
similar to that of the kernel $k$-means clustering algorithm, but our method
can deal with large-scale datasets.
Related papers
- Multiple kernel concept factorization algorithm based on global fusion [9.931283387968856]
In unsupervised environment, to design or select proper kernel function for specific dataset, a new algorithm called Globalized Multiple Kernel(GMKCF)was proposed.
The proposed algorithm outperforms comparison algorithms in data clustering, such as Kernel K-Means(KKM), Spectral Clustering(SC), CF Kernel(KCF), Co-regularized multi-view spectral clustering(Coreg), and Robust Multiple KKM(RMKKM)
arXiv Detail & Related papers (2024-10-27T09:13:57Z) - Self-Supervised Graph Embedding Clustering [70.36328717683297]
K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks.
We propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z) - Rethinking k-means from manifold learning perspective [122.38667613245151]
We present a new clustering algorithm which directly detects clusters of data without mean estimation.
Specifically, we construct distance matrix between data points by Butterworth filter.
To well exploit the complementary information embedded in different views, we leverage the tensor Schatten p-norm regularization.
arXiv Detail & Related papers (2023-05-12T03:01:41Z) - Local Sample-weighted Multiple Kernel Clustering with Consensus
Discriminative Graph [73.68184322526338]
Multiple kernel clustering (MKC) is committed to achieving optimal information fusion from a set of base kernels.
This paper proposes a novel local sample-weighted multiple kernel clustering model.
Experimental results demonstrate that our LSWMKC possesses better local manifold representation and outperforms existing kernel or graph-based clustering algo-rithms.
arXiv Detail & Related papers (2022-07-05T05:00:38Z) - k-Factorization Subspace Clustering [12.18340575383456]
Subspace clustering aims to cluster data lying in a union of low-dimensional subspaces.
This paper presents a method called k-Factorization Subspace Clustering (k-FSC) for large-scale subspace clustering.
arXiv Detail & Related papers (2020-12-08T10:34:21Z) - Adaptive Explicit Kernel Minkowski Weighted K-means [1.3535770763481905]
The kernel K-means, which extends K-means into the kernel space, is able to capture nonlinear structures and identify arbitrarily shaped clusters.
This paper proposes a method to combine the advantages of the linear and nonlinear approaches by using driven corresponding approximate finite-dimensional feature maps.
arXiv Detail & Related papers (2020-12-04T19:14:09Z) - Kernel k-Means, By All Means: Algorithms and Strong Consistency [21.013169939337583]
Kernel $k$ clustering is a powerful tool for unsupervised learning of non-linear data.
In this paper, we generalize results leveraging a general family of means to combat sub-optimal local solutions.
Our algorithm makes use of majorization-minimization (MM) to better solve this non-linear separation problem.
arXiv Detail & Related papers (2020-11-12T16:07:18Z) - Kernel learning approaches for summarising and combining posterior
similarity matrices [68.8204255655161]
We build upon the notion of the posterior similarity matrix (PSM) in order to suggest new approaches for summarising the output of MCMC algorithms for Bayesian clustering models.
A key contribution of our work is the observation that PSMs are positive semi-definite, and hence can be used to define probabilistically-motivated kernel matrices.
arXiv Detail & Related papers (2020-09-27T14:16:14Z) - Multi-View Spectral Clustering with High-Order Optimal Neighborhood
Laplacian Matrix [57.11971786407279]
Multi-view spectral clustering can effectively reveal the intrinsic cluster structure among data.
This paper proposes a multi-view spectral clustering algorithm that learns a high-order optimal neighborhood Laplacian matrix.
Our proposed algorithm generates the optimal Laplacian matrix by searching the neighborhood of the linear combination of both the first-order and high-order base.
arXiv Detail & Related papers (2020-08-31T12:28:40Z) - SimpleMKKM: Simple Multiple Kernel K-means [49.500663154085586]
We propose a simple yet effective multiple kernel clustering algorithm, termed simple multiple kernel k-means (SimpleMKKM)
Our criterion is given by an intractable minimization-maximization problem in the kernel coefficient and clustering partition matrix.
We theoretically analyze the performance of SimpleMKKM in terms of its clustering generalization error.
arXiv Detail & Related papers (2020-05-11T10:06:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.