A Critique of Self-Expressive Deep Subspace Clustering
- URL: http://arxiv.org/abs/2010.03697v2
- Date: Fri, 19 Mar 2021 20:33:37 GMT
- Title: A Critique of Self-Expressive Deep Subspace Clustering
- Authors: Benjamin D. Haeffele, Chong You, René Vidal
- Abstract summary: Subspace clustering is an unsupervised clustering technique designed to cluster data that is supported on a union of linear subspaces.
We show that there are a number of potential flaws with this approach which have not been adequately addressed in prior work.
- Score: 23.971512395191308
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Subspace clustering is an unsupervised clustering technique designed to
cluster data that is supported on a union of linear subspaces, with each
subspace defining a cluster of dimension lower than the ambient space. Many
existing formulations of this problem exploit the self-expressive property of
linear subspaces: any point within a subspace can be represented as a linear
combination of other points within the subspace. To extend this approach to
data supported on a union of non-linear manifolds, numerous studies have
proposed learning an embedding of the original data with a neural network that
is regularized by a self-expressive loss on the embedded data, so as to
encourage a union-of-linear-subspaces prior in the embedded space. Here we
show that there are a number of potential flaws with this approach which have
not been adequately addressed in prior work. In particular, we show that the
model formulation is often ill-posed, in that it can lead to a degenerate
embedding of the data, which need not correspond to a union of subspaces at
all and is poorly suited for clustering. We validate our theoretical results
experimentally and also repeat prior experiments reported in the literature,
where we conclude that a significant portion of the previously claimed
performance benefits can be attributed to an ad-hoc post-processing step
rather than the deep subspace clustering model.
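To make the self-expressive formulation concrete, here is a minimal sketch of
a classical (non-deep) self-expressive clustering pipeline. It is a sketch
under assumptions, not the paper's code: it uses a ridge (Frobenius-norm)
penalty with a closed-form solution in place of the sparsity-promoting
penalties used by methods such as SSC, and all function names are illustrative.

```python
# Minimal sketch of classical self-expressive subspace clustering.
# Assumption: ridge regularization stands in for the sparse/low-rank penalties
# used in practice; function names are illustrative, not from the paper.
import numpy as np
from sklearn.cluster import SpectralClustering

def self_expressive_coefficients(X, reg=1e-2):
    """Solve min_C ||X - X C||_F^2 + reg * ||C||_F^2  s.t. diag(C) = 0.

    X has shape (ambient_dim, n_points), one data point per column. The
    zero-diagonal constraint rules out the trivial solution C = I, and the
    constrained ridge problem has the closed form used below.
    """
    n = X.shape[1]
    D = np.linalg.inv(X.T @ X + reg * np.eye(n))
    C = -D / np.diag(D)[None, :]      # C_ij = -D_ij / D_jj for i != j
    np.fill_diagonal(C, 0.0)
    return C

def self_expressive_clustering(X, n_clusters, reg=1e-2):
    """Spectral clustering on the symmetric affinity |C| + |C|^T."""
    C = self_expressive_coefficients(X, reg)
    W = np.abs(C) + np.abs(C).T       # points in one subspace express each other
    return SpectralClustering(
        n_clusters=n_clusters, affinity="precomputed", random_state=0
    ).fit_predict(W)

if __name__ == "__main__":
    # Synthetic check: two random 2-dimensional subspaces in 10-dim space.
    rng = np.random.default_rng(0)
    bases = [np.linalg.qr(rng.standard_normal((10, 2)))[0] for _ in range(2)]
    X = np.hstack([B @ rng.standard_normal((2, 50)) for B in bases])
    print(self_expressive_clustering(X, n_clusters=2))
```

The deep variants critiqued in the paper apply the same kind of loss to a
learned embedding f(X) instead of to X directly, which is where the degenerate
(e.g., collapsed or near-arbitrary) embeddings identified above can arise.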
Related papers
- Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein [56.62376364594194]
Unsupervised learning aims to capture the underlying structure of potentially large and high-dimensional datasets.
In this work, we revisit dimensionality reduction (DR) and clustering under the lens of optimal transport and exhibit relationships with the Gromov-Wasserstein problem.
This unveils a new general framework, called distributional reduction, that recovers DR and clustering as special cases and allows addressing them jointly within a single optimization problem.
arXiv Detail & Related papers (2024-02-03T19:00:19Z)
- Linking data separation, visual separation, and classifier performance using pseudo-labeling by contrastive learning [125.99533416395765]
We argue that the performance of the final classifier depends on the data separation present in the latent space and visual separation present in the projection.
We demonstrate our results on the classification of five challenging real-world image datasets of human intestinal parasites using only 1% supervised samples.
arXiv Detail & Related papers (2023-02-06T10:01:38Z)
- Unsupervised Manifold Linearizing and Clustering [19.879641608165887]
We propose to optimize the Maximal Coding Rate Reduction metric with respect to both the data representation and a novel doubly stochastic cluster membership.
Experiments on CIFAR-10, -20, -100, and TinyImageNet-200 datasets show that the proposed method is much more accurate and scalable than state-of-the-art deep clustering methods.
arXiv Detail & Related papers (2023-01-04T20:08:23Z)
- Intrinsic dimension estimation for discrete metrics [65.5438227932088]
In this letter we introduce an algorithm to infer the intrinsic dimension (ID) of datasets embedded in discrete spaces.
We demonstrate its accuracy on benchmark datasets, and we apply it to analyze a metagenomic dataset for species fingerprinting.
This suggests that evolutionary pressure acts on a low-dimensional manifold despite the high dimensionality of the sequence space.
arXiv Detail & Related papers (2022-07-20T06:38:36Z)
- Semi-Supervised Subspace Clustering via Tensor Low-Rank Representation [64.49871502193477]
We propose a novel semi-supervised subspace clustering method, which is able to simultaneously augment the initial supervisory information and construct a discriminative affinity matrix.
Comprehensive experimental results on six commonly-used benchmark datasets demonstrate the superiority of our method over state-of-the-art methods.
arXiv Detail & Related papers (2022-05-21T01:47:17Z)
- Beyond Linear Subspace Clustering: A Comparative Study of Nonlinear Manifold Clustering Algorithms [22.564682739914424]
Subspace clustering is an important unsupervised clustering approach.
We introduce a new taxonomy to classify the state-of-the-art approaches into three categories, namely locality preserving, kernel based, and neural network based.
A detailed analysis of these approaches reveals potential research directions and unsolved challenges in this field.
arXiv Detail & Related papers (2021-03-19T06:34:34Z)
- Joint and Progressive Subspace Analysis (JPSA) with Spatial-Spectral Manifold Alignment for Semi-Supervised Hyperspectral Dimensionality Reduction [48.73525876467408]
We propose a novel technique for hyperspectral subspace analysis, called joint and progressive subspace analysis (JPSA).
Experiments are conducted to demonstrate the superiority and effectiveness of the proposed JPSA on two widely-used hyperspectral datasets.
arXiv Detail & Related papers (2020-09-21T16:29:59Z)
- Is an Affine Constraint Needed for Affine Subspace Clustering? [27.00532615975731]
In computer vision applications, the subspaces are linear and subspace clustering methods can be applied directly.
In motion segmentation, the subspaces are affine and an additional affine constraint on the coefficients is often enforced.
This paper shows, both theoretically and empirically, that when the dimension of the ambient space is high relative to the sum of the dimensions of the affine subspaces, the affine constraint has a negligible effect on clustering performance.
arXiv Detail & Related papers (2020-05-08T07:52:17Z)
- Stochastic Sparse Subspace Clustering [20.30051592270384]
State-of-the-art subspace clustering methods are based on the self-expressive model, which represents each data point as a linear combination of other data points.
We introduce a dropout technique, based on randomly dropping out data points, to address the issue of over-segmentation.
This leads to a scalable and flexible sparse subspace clustering approach, termed Stochastic Sparse Subspace Clustering (a loose sketch of the dropout idea follows this entry).
arXiv Detail & Related papers (2020-05-04T13:09:17Z)
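As a loose illustration of the dropout idea in the Stochastic Sparse Subspace
Clustering entry above: the sketch below reuses the zero-diagonal ridge
formulation from the earlier code block rather than the authors' l1-penalized
objective, and the function name and defaults are hypothetical.

```python
# Loose illustration: self-expression with random dropout of data points,
# averaged over several draws. Not the authors' implementation.
import numpy as np

def dropout_self_expression(X, keep_prob=0.5, n_draws=20, reg=1e-2, seed=0):
    """Average zero-diagonal ridge self-expressive coefficients over random
    subsets of the columns of X (shape: ambient_dim x n_points)."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    C_avg = np.zeros((n, n))
    for _ in range(n_draws):
        idx = np.flatnonzero(rng.random(n) < keep_prob)  # surviving points
        if len(idx) < 2:
            continue  # a draw needs at least two points to self-express
        Xs = X[:, idx]
        D = np.linalg.inv(Xs.T @ Xs + reg * np.eye(len(idx)))
        Cs = -D / np.diag(D)[None, :]                    # zero-diagonal ridge
        np.fill_diagonal(Cs, 0.0)
        C_avg[np.ix_(idx, idx)] += Cs / n_draws          # scatter block back
    return C_avg
```

Averaging coefficients across draws tends to produce denser within-cluster
affinities than a single sparse solve, which is the over-segmentation remedy
the summary alludes to.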
- Learnable Subspace Clustering [76.2352740039615]
We develop a learnable subspace clustering paradigm to efficiently solve the large-scale subspace clustering problem.
The key idea is to learn a parametric function to partition the high-dimensional space into its underlying low-dimensional subspaces.
To the best of our knowledge, this is the first subspace clustering method to efficiently cluster millions of data points.
arXiv Detail & Related papers (2020-04-09T12:53:28Z)
- Robust Self-Supervised Convolutional Neural Network for Subspace Clustering and Classification [0.10152838128195464]
This paper proposes a robust formulation of the self-supervised convolutional subspace clustering network ($S^2$ConvSCN).
In a truly unsupervised training environment, Robust $S^2$ConvSCN outperforms its baseline version by a significant margin on both seen and unseen data across four well-known datasets.
arXiv Detail & Related papers (2020-04-03T16:07:58Z)