Provable Data Clustering via Innovation Search
- URL: http://arxiv.org/abs/2108.06888v1
- Date: Mon, 16 Aug 2021 04:16:38 GMT
- Title: Provable Data Clustering via Innovation Search
- Authors: Weiwei Li, Mostafa Rahmani, Ping Li
- Abstract summary: This paper studies the subspace clustering problem in which data points collected from high-dimensional ambient space lie in a union of linear subspaces.
A recently proposed clustering method termed Innovation Pursuit, computed a set of optimal directions to build the adjacency matrix.
- Score: 32.530406356676274
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper studies the subspace clustering problem in which data points
collected from high-dimensional ambient space lie in a union of linear
subspaces. Subspace clustering becomes challenging when the dimension of
intersection between subspaces is large and most of the self-representation
based methods are sensitive to the intersection between the span of clusters.
In sharp contrast to the self-representation based methods, a recently proposed
clustering method termed Innovation Pursuit, computed a set of optimal
directions (directions of innovation) to build the adjacency matrix. This paper
focuses on the Innovation Pursuit Algorithm to shed light on its impressive
performance when the subspaces are heavily intersected. It is shown that in
contrast to most of the existing methods which require the subspaces to be
sufficiently incoherent with each other, Innovation Pursuit only requires the
innovative components of the subspaces to be sufficiently incoherent with each
other. These new sufficient conditions allow the clusters to be strongly close
to each other. Motivated by the presented theoretical analysis, a simple yet
effective projection based technique is proposed which we show with both
numerical and theoretical results that it can boost the performance of
Innovation Pursuit.
Related papers
- Consistent Spectral Clustering in Hyperbolic Spaces [16.75089998678061]
We propose a spectral clustering algorithm on Hyperbolic Spaces to represent complex data structures.
We show that our algorithm converges at least as fast as Spectral Clustering on Euclidean Spaces.
This work opens up avenues for utilizing non-Euclidean Spaces in clustering algorithms.
arXiv Detail & Related papers (2024-09-14T04:54:31Z) - GCC: Generative Calibration Clustering [55.44944397168619]
We propose a novel Generative Clustering (GCC) method to incorporate feature learning and augmentation into clustering procedure.
First, we develop a discrimirative feature alignment mechanism to discover intrinsic relationship across real and generated samples.
Second, we design a self-supervised metric learning to generate more reliable cluster assignment.
arXiv Detail & Related papers (2024-04-14T01:51:11Z) - Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein [56.62376364594194]
Unsupervised learning aims to capture the underlying structure of potentially large and high-dimensional datasets.
In this work, we revisit these approaches under the lens of optimal transport and exhibit relationships with the Gromov-Wasserstein problem.
This unveils a new general framework, called distributional reduction, that recovers DR and clustering as special cases and allows addressing them jointly within a single optimization problem.
arXiv Detail & Related papers (2024-02-03T19:00:19Z) - Enriched Robust Multi-View Kernel Subspace Clustering [5.770309971945476]
Subspace clustering is to find underlying low-dimensional subspaces and cluster the data points correctly.
Most existing methods suffer from two critical issues.
We propose a novel multi-view subspace clustering method.
arXiv Detail & Related papers (2022-05-21T03:06:24Z) - Semi-Supervised Subspace Clustering via Tensor Low-Rank Representation [64.49871502193477]
We propose a novel semi-supervised subspace clustering method, which is able to simultaneously augment the initial supervisory information and construct a discriminative affinity matrix.
Comprehensive experimental results on six commonly-used benchmark datasets demonstrate the superiority of our method over state-of-the-art methods.
arXiv Detail & Related papers (2022-05-21T01:47:17Z) - Weighted Sparse Subspace Representation: A Unified Framework for
Subspace Clustering, Constrained Clustering, and Active Learning [0.3553493344868413]
We first propose a novel spectral-based subspace clustering algorithm that seeks to represent each point as a sparse convex combination of a few nearby points.
We then extend the algorithm to constrained clustering and active learning settings.
Our motivation for developing such a framework stems from the fact that typically either a small amount of labelled data is available in advance; or it is possible to label some points at a cost.
arXiv Detail & Related papers (2021-06-08T13:39:43Z) - Beyond Linear Subspace Clustering: A Comparative Study of Nonlinear
Manifold Clustering Algorithms [22.564682739914424]
Subspace clustering is an important unsupervised clustering approach.
We introduce a new taxonomy to classify the state-of-the-art approaches into three categories, namely locality preserving, kernel based, and neural network based.
The detailed analysis of these approaches unfolds potential research directions and unsolved challenges in this field.
arXiv Detail & Related papers (2021-03-19T06:34:34Z) - Clustering Ensemble Meets Low-rank Tensor Approximation [50.21581880045667]
This paper explores the problem of clustering ensemble, which aims to combine multiple base clusterings to produce better performance than that of the individual one.
We propose a novel low-rank tensor approximation-based method to solve the problem from a global perspective.
Experimental results over 7 benchmark data sets show that the proposed model achieves a breakthrough in clustering performance, compared with 12 state-of-the-art methods.
arXiv Detail & Related papers (2020-12-16T13:01:37Z) - A Critique of Self-Expressive Deep Subspace Clustering [23.971512395191308]
Subspace clustering is an unsupervised clustering technique designed to cluster data that is supported on a union of linear subspaces.
We show that there are a number of potential flaws with this approach which have not been adequately addressed in prior work.
arXiv Detail & Related papers (2020-10-08T00:14:59Z) - Joint and Progressive Subspace Analysis (JPSA) with Spatial-Spectral
Manifold Alignment for Semi-Supervised Hyperspectral Dimensionality Reduction [48.73525876467408]
We propose a novel technique for hyperspectral subspace analysis.
The technique is called joint and progressive subspace analysis (JPSA)
Experiments are conducted to demonstrate the superiority and effectiveness of the proposed JPSA on two widely-used hyperspectral datasets.
arXiv Detail & Related papers (2020-09-21T16:29:59Z) - Unsupervised Multi-view Clustering by Squeezing Hybrid Knowledge from
Cross View and Each View [68.88732535086338]
This paper proposes a new multi-view clustering method, low-rank subspace multi-view clustering based on adaptive graph regularization.
Experimental results for five widely used multi-view benchmarks show that our proposed algorithm surpasses other state-of-the-art methods by a clear margin.
arXiv Detail & Related papers (2020-08-23T08:25:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.