Learning a Self-Expressive Network for Subspace Clustering
- URL: http://arxiv.org/abs/2110.04318v1
- Date: Fri, 8 Oct 2021 18:06:06 GMT
- Title: Learning a Self-Expressive Network for Subspace Clustering
- Authors: Shangzhi Zhang, Chong You, René Vidal and Chun-Guang Li
- Abstract summary: We propose a novel framework for subspace clustering, termed Self-Expressive Network (SENet), which employs a properly designed neural network to learn a self-expressive representation of the data.
Our SENet can not only learn the self-expressive coefficients with desired properties on the training data, but also handle out-of-sample data.
- In particular, SENet yields highly competitive performance on MNIST, Fashion MNIST, and Extended MNIST, and state-of-the-art performance on CIFAR-10.
- Score: 15.096251922264281
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: State-of-the-art subspace clustering methods are based on the self-expressive model, which represents each data point as a linear combination of other data points. However, such methods are designed for a finite sample of data and lack the ability to generalize to out-of-sample data. Moreover, since the number of self-expressive coefficients grows quadratically with the number of data points, their ability to handle large-scale datasets is often limited. In this paper, we propose a novel framework for subspace clustering, termed Self-Expressive Network (SENet), which employs a properly designed neural network to learn a self-expressive representation of the data. We show that SENet not only learns self-expressive coefficients with the desired properties on the training data, but also handles out-of-sample data. In addition, we show that SENet can be leveraged to perform subspace clustering on large-scale datasets. Extensive experiments conducted on synthetic data and real-world benchmark data validate the effectiveness of the proposed method. In particular, SENet yields highly competitive performance on MNIST, Fashion MNIST, and Extended MNIST, and state-of-the-art performance on CIFAR-10. The code is available at https://github.com/zhangsz1998/Self-Expressive-Network.
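To make the self-expressive idea concrete, below is a minimal, hypothetical PyTorch sketch (not the authors' implementation; see the linked repository for that). Classical self-expressive methods solve for an n x n coefficient matrix C with zero diagonal by minimizing a reconstruction term plus a regularizer on C; a SENet-style model instead predicts each coefficient c_ij from the pair (x_i, x_j) with a fixed-size network, so the parameter count does not grow with n and out-of-sample points can be scored.
```python
import torch
import torch.nn as nn

class SelfExpressiveNet(nn.Module):
    """Sketch of a self-expressive network (illustration only).

    Instead of storing an n x n coefficient matrix C, the coefficient
    c_ij is predicted from the pair (x_i, x_j) by two small MLP branches,
    so the parameter count is independent of the dataset size and new
    points can be handled at test time. The real SENet architecture and
    training objective differ; see the paper and repository.
    """

    def __init__(self, dim_in, dim_hidden=64):
        super().__init__()
        self.query = nn.Sequential(nn.Linear(dim_in, dim_hidden), nn.ReLU(),
                                   nn.Linear(dim_hidden, dim_hidden))
        self.key = nn.Sequential(nn.Linear(dim_in, dim_hidden), nn.ReLU(),
                                 nn.Linear(dim_hidden, dim_hidden))

    def forward(self, x):                        # x: (n, d), rows are points
        q, k = self.query(x), self.key(x)
        c = q @ k.t()                            # predicted coefficients (n, n)
        return c - torch.diag(torch.diag(c))     # enforce diag(C) = 0

def self_expressive_loss(x, c, lam=1.0):
    """||X - C X||_F^2 + lam * ||C||_F^2 (a simple surrogate objective)."""
    return ((x - c @ x) ** 2).sum() + lam * (c ** 2).sum()

# toy usage: 200 points in R^10 (for real data, x would hold features)
x = torch.randn(200, 10)
net = SelfExpressiveNet(dim_in=10)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(100):
    opt.zero_grad()
    loss = self_expressive_loss(x, net(x))
    loss.backward()
    opt.step()
# |C| + |C|^T would then serve as an affinity matrix for spectral clustering.
```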
Related papers
- Generative Expansion of Small Datasets: An Expansive Graph Approach [13.053285552524052]
We introduce an Expansive Synthesis model generating large-scale, information-rich datasets from minimal samples.
An autoencoder with self-attention layers and optimal transport refines distributional consistency.
Results show comparable performance, demonstrating the model's potential to augment training data effectively.
arXiv Detail & Related papers (2024-06-25T02:59:02Z)
- infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information [68.76707843019886]
infoVerse is a universal framework for dataset characterization.
infoVerse captures multidimensional characteristics of datasets by incorporating various model-driven meta-information.
In three real-world applications (data pruning, active learning, and data annotation), the samples chosen on infoVerse space consistently outperform strong baselines.
arXiv Detail & Related papers (2023-05-30T18:12:48Z)
- What Can be Seen is What You Get: Structure Aware Point Cloud Augmentation [0.966840768820136]
We present novel point cloud augmentation methods to artificially diversify a dataset.
Our sensor-centric methods keep the data structure consistent with the lidar sensor capabilities.
We show that our methods enable the use of very small datasets, saving annotation time, training time and the associated costs.
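One generic example of what a sensor-consistent augmentation can look like (an illustration only, not one of the specific methods proposed in the paper) is a rigid rotation of the cloud about the lidar's vertical axis, which leaves every point's range to the sensor unchanged:
```python
import numpy as np

def rotate_about_sensor_z(points, angle_rad):
    """Rotate an (N, 3) lidar point cloud about the sensor's vertical axis.

    Rotating around the sensor origin keeps per-point ranges (and hence the
    sensor's distance-dependent sampling pattern) unchanged, one simple way
    to keep an augmentation consistent with lidar geometry. Generic example,
    not the paper's specific methods.
    """
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    rot = np.array([[c, -s, 0.0],
                    [s,  c, 0.0],
                    [0.0, 0.0, 1.0]])
    return points @ rot.T

# toy usage
cloud = np.random.uniform(-20.0, 20.0, size=(1000, 3))
augmented = rotate_about_sensor_z(cloud, np.deg2rad(15.0))
assert np.allclose(np.linalg.norm(cloud, axis=1),
                   np.linalg.norm(augmented, axis=1))  # ranges preserved
```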
arXiv Detail & Related papers (2022-06-20T09:10:59Z)
- Infinite Recommendation Networks: A Data-Centric Approach [8.044430277912936]
We leverage the Neural Tangent Kernel to train infinitely-wide neural networks to devise $\infty$-AE: an autoencoder with infinitely-wide bottleneck layers.
We also develop Distill-CF for synthesizing tiny, high-fidelity data summaries.
We observe 96-105% of $\infty$-AE's performance on the full dataset with as little as 0.1% of the original dataset size.
arXiv Detail & Related papers (2022-06-03T00:34:13Z)
- CHALLENGER: Training with Attribution Maps [63.736435657236505]
We show that utilizing attribution maps for training neural networks can improve regularization of models and thus increase performance.
In particular, we show that our generic domain-independent approach yields state-of-the-art results in vision, natural language processing and on time series tasks.
arXiv Detail & Related papers (2022-05-30T13:34:46Z)
- CvS: Classification via Segmentation For Small Datasets [52.821178654631254]
This paper presents CvS, a cost-effective classifier for small datasets that derives the classification labels from predicting the segmentation maps.
We evaluate the effectiveness of our framework on diverse problems, showing that CvS achieves much higher classification accuracy than previous methods when given only a handful of examples.
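A minimal sketch of how an image-level label can be derived from a predicted segmentation map (a generic decision rule for illustration, not necessarily the one used by CvS):
```python
import torch

def label_from_segmentation(seg_logits, background_class=0):
    """Derive an image-level label from a predicted segmentation map.

    seg_logits: (C, H, W) per-pixel class scores from any segmentation model.
    A simple rule (one of several possible): take the per-pixel argmax and
    return the non-background class that covers the most pixels.
    """
    per_pixel = seg_logits.argmax(dim=0)                    # (H, W) class ids
    counts = torch.bincount(per_pixel.flatten(),
                            minlength=seg_logits.shape[0])  # pixels per class
    counts[background_class] = 0                            # ignore background
    return int(counts.argmax())

# toy usage with random scores for 4 classes on a 32x32 map
logits = torch.randn(4, 32, 32)
print(label_from_segmentation(logits))
```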
arXiv Detail & Related papers (2021-10-29T18:41:15Z)
- Rank-R FNN: A Tensor-Based Learning Model for High-Order Data Classification [69.26747803963907]
Rank-R Feedforward Neural Network (FNN) is a tensor-based nonlinear learning model that imposes Canonical/Polyadic decomposition on its parameters.
First, it handles inputs as multilinear arrays, bypassing the need for vectorization, and can thus fully exploit the structural information along every data dimension.
We establish the universal approximation and learnability properties of Rank-R FNN, and we validate its performance on real-world hyperspectral datasets.
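A hypothetical sketch of a single hidden unit whose weight tensor is constrained to CP rank R, illustrating how the inner product with a multilinear input can be computed without vectorization (the paper's exact model may differ):
```python
import torch
import torch.nn as nn

class RankRUnit(nn.Module):
    """One hidden unit with a CP rank-R constrained weight tensor.

    For an order-2 input X (e.g. pixels x spectral bands), the full weight
    would be an I1 x I2 matrix; here it is parameterized as
    W = sum_r a_r b_r^T, so <W, X> = sum_r a_r^T X b_r needs only
    R * (I1 + I2) parameters and never vectorizes the input.
    A sketch of the idea, not the paper's exact architecture.
    """

    def __init__(self, i1, i2, rank):
        super().__init__()
        self.a = nn.Parameter(torch.randn(rank, i1) * 0.01)
        self.b = nn.Parameter(torch.randn(rank, i2) * 0.01)
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x):                      # x: (batch, I1, I2)
        # <W, X> = sum_r a_r^T X b_r, computed without forming W explicitly
        score = torch.einsum('ri,bij,rj->b', self.a, x, self.b)
        return torch.sigmoid(score + self.bias)

# toy usage: a batch of 8 "hyperspectral patches" of size 5 x 20
x = torch.randn(8, 5, 20)
unit = RankRUnit(i1=5, i2=20, rank=3)
print(unit(x).shape)   # torch.Size([8])
```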
arXiv Detail & Related papers (2021-04-11T16:37:32Z)
- Learning Self-Expression Metrics for Scalable and Inductive Subspace Clustering [5.587290026368626]
Subspace clustering has established itself as a state-of-the-art approach to clustering high-dimensional data.
We propose a novel metric learning approach to learn instead a subspace affinity function using a siamese neural network architecture.
Our model benefits from a constant number of parameters and a constant-size memory footprint, allowing it to scale to considerably larger datasets.
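A generic sketch of the siamese setup (illustrative only; the paper's architecture, affinity form, and training loss differ): a shared embedding network scores any pair of points, so memory stays constant in the dataset size and unseen points can be scored inductively.
```python
import torch
import torch.nn as nn

class SiameseAffinity(nn.Module):
    """Generic siamese affinity sketch: one shared embedding network scores
    pairs of points, so the parameter count and memory footprint are constant
    in the dataset size. Illustration only, not the paper's exact model.
    """

    def __init__(self, dim_in, dim_emb=32):
        super().__init__()
        self.embed = nn.Sequential(nn.Linear(dim_in, 64), nn.ReLU(),
                                   nn.Linear(64, dim_emb))

    def forward(self, x_i, x_j):
        z_i, z_j = self.embed(x_i), self.embed(x_j)          # shared weights
        return torch.exp(-((z_i - z_j) ** 2).sum(dim=-1))    # affinity in (0, 1]

# toy usage: score 16 random pairs of 10-dimensional points
net = SiameseAffinity(dim_in=10)
print(net(torch.randn(16, 10), torch.randn(16, 10)).shape)   # torch.Size([16])
```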
arXiv Detail & Related papers (2020-09-27T15:40:12Z)
- Learning to Count in the Crowd from Limited Labeled Data [109.2954525909007]
We focus on reducing the annotation effort by learning to count in the crowd from a limited number of labeled samples.
Specifically, we propose a Gaussian Process-based iterative learning mechanism that involves estimation of pseudo-ground truth for the unlabeled data.
arXiv Detail & Related papers (2020-07-07T04:17:01Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
To reduce the cost of training on the enlarged dataset, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
- Federated Visual Classification with Real-World Data Distribution [9.564468846277366]
We characterize the effect real-world data distributions have on distributed learning, using as a benchmark the standard Federated Averaging (FedAvg) algorithm.
We introduce two new large-scale datasets for species and landmark classification, with realistic per-user data splits.
We also develop two new algorithms (FedVC, FedIR) that intelligently resample and reweight over the client pool, bringing large improvements in accuracy and stability in training.
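For context, a minimal sketch of the standard FedAvg aggregation step that these methods benchmark against and build on (a size-weighted average of client weights; the resampling and reweighting of FedVC/FedIR is not shown here):
```python
import torch

def fedavg_aggregate(client_states, client_sizes):
    """Minimal FedAvg aggregation: average client model weights, weighted by
    each client's number of training examples. Baseline step only; FedVC and
    FedIR additionally resample/reweight over the client pool.
    """
    total = float(sum(client_sizes))
    return {k: sum(s[k] * (n / total)
                   for s, n in zip(client_states, client_sizes))
            for k in client_states[0].keys()}

# toy usage: three "clients" holding the same tiny model architecture
clients = [{"w": torch.randn(4, 4), "b": torch.randn(4)} for _ in range(3)]
sizes = [100, 30, 70]
global_state = fedavg_aggregate(clients, sizes)
print(global_state["w"].shape)  # torch.Size([4, 4])
```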
arXiv Detail & Related papers (2020-03-18T07:55:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.