Meta-learning representations for clustering with infinite Gaussian
mixture models
- URL: http://arxiv.org/abs/2103.00694v1
- Date: Mon, 1 Mar 2021 02:05:31 GMT
- Title: Meta-learning representations for clustering with infinite Gaussian
mixture models
- Authors: Tomoharu Iwata
- Abstract summary: We propose a meta-learning method that trains neural networks to obtain representations such that clustering performance improves.
The proposed method can cluster unseen unlabeled data using knowledge meta-learned with labeled data that are different from the unlabeled data.
- Score: 39.56814839510978
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For better clustering performance, appropriate representations are critical.
Although many neural network-based metric learning methods have been proposed,
they do not directly train neural networks to improve clustering performance.
We propose a meta-learning method that trains neural networks to obtain
representations such that clustering performance improves when the
representations are clustered by the variational Bayesian (VB) inference with
an infinite Gaussian mixture model. The proposed method can cluster unseen
unlabeled data using knowledge meta-learned with labeled data that are
different from the unlabeled data. For the objective function, we propose a
continuous approximation of the adjusted Rand index (ARI), by which we can
evaluate the clustering performance from soft clustering assignments. Since the
approximated ARI and the VB inference procedure are differentiable, we can
backpropagate the objective function through the VB inference procedure to
train the neural networks. With experiments using text and image data sets, we
demonstrate that our proposed method has a higher adjusted Rand index than
existing methods do.
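Below is a minimal, illustrative sketch of the training idea described in the abstract, written in PyTorch. Soft cluster responsibilities are produced by a few differentiable EM-style assignment steps (a simplified stand-in for the VB inference with an infinite Gaussian mixture model), scored with a soft-count relaxation of the adjusted Rand index, and the resulting loss is backpropagated into the representation network. The encoder architecture, the responsibility update, the `soft_ari` relaxation, and all names and hyperparameters here are assumptions for illustration, not the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def comb2(x):
    # "x choose 2" extended to real-valued (soft) counts
    return x * (x - 1.0) / 2.0

def soft_ari(resp, labels_onehot, eps=1e-8):
    """Soft-count relaxation of the adjusted Rand index.

    resp:          (N, K) soft cluster responsibilities (rows sum to 1)
    labels_onehot: (N, C) one-hot ground-truth class labels
    Hard counts in the usual ARI contingency table are replaced by soft
    counts, which keeps the score differentiable in `resp`.
    (A simple relaxation for illustration; not necessarily the paper's
    exact approximation.)
    """
    n = resp.shape[0]
    contingency = labels_onehot.t() @ resp      # (C, K) soft co-occurrence counts
    a = contingency.sum(dim=1)                  # per-class totals
    b = contingency.sum(dim=0)                  # per-cluster soft totals
    sum_comb = comb2(contingency).sum()
    sum_a = comb2(a).sum()
    sum_b = comb2(b).sum()
    expected = sum_a * sum_b / (n * (n - 1) / 2.0)
    max_index = 0.5 * (sum_a + sum_b)
    return (sum_comb - expected) / (max_index - expected + eps)

def soft_assignments(z, centers, n_steps=5, temp=1.0):
    """Stand-in for the inference step: a few differentiable EM-style
    responsibility/center updates on the embeddings `z`.
    (The paper backpropagates through VB inference for an infinite GMM;
    this simplified loop only illustrates the idea.)"""
    for _ in range(n_steps):
        d = torch.cdist(z, centers) ** 2                  # (N, K) squared distances
        resp = F.softmax(-d / temp, dim=1)                # soft responsibilities
        centers = (resp.t() @ z) / (resp.sum(0, keepdim=True).t() + 1e-8)
    return resp

# Hypothetical meta-training step: labels of the training tasks are used only
# inside the loss; at meta-test time the encoder clusters unseen, unlabeled data.
encoder = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 32))
optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-3)

x = torch.randn(128, 784)                                 # a batch from one labeled task
y = torch.randint(0, 10, (128,))
z = encoder(x)
centers = z[torch.randperm(z.shape[0])[:10]].detach()     # crude center initialization
resp = soft_assignments(z, centers)
loss = -soft_ari(resp, F.one_hot(y, 10).float())          # maximize the relaxed ARI
optimizer.zero_grad()
loss.backward()                                           # gradients flow through the assignment loop
optimizer.step()
```

At meta-test time, the trained encoder's representations would simply be clustered without labels; the labeled tasks are needed only to drive the meta-training objective above.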
Related papers
- Ensemble Quadratic Assignment Network for Graph Matching [52.20001802006391]
Graph matching is a commonly used technique in computer vision and pattern recognition.
Recent data-driven approaches have improved the graph matching accuracy remarkably.
We propose a graph neural network (GNN) based approach to combine the advantages of data-driven and traditional methods.
arXiv Detail & Related papers (2024-03-11T06:34:05Z)
- Nonlinear subspace clustering by functional link neural networks [20.972039615938193]
Subspace clustering based on a feed-forward neural network has been demonstrated to provide better clustering accuracy than some advanced subspace clustering algorithms.
We employ a functional link neural network to transform data samples into a nonlinear domain.
We introduce a convex combination subspace clustering scheme, which combines a linear subspace clustering method with the functional link neural network subspace clustering approach.
arXiv Detail & Related papers (2024-02-03T06:01:21Z)
- Adaptive aggregation of Monte Carlo augmented decomposed filters for efficient group-equivariant convolutional neural network [0.36122488107441414]
Group-equivariant convolutional neural networks (G-CNN) heavily rely on parameter sharing to increase CNN's data efficiency and performance.
We propose a non-parameter-sharing approach for group-equivariant neural networks.
The proposed methods adaptively aggregate a diverse range of filters by a weighted sum of Monte Carlo augmented decomposed filters.
arXiv Detail & Related papers (2023-05-17T10:18:02Z)
- A One-shot Framework for Distributed Clustered Learning in Heterogeneous Environments [54.172993875654015]
The paper proposes a family of communication-efficient methods for distributed learning in heterogeneous environments.
A one-shot approach, based on local computations at the users and a clustering-based aggregation step at the server, is shown to provide strong learning guarantees.
For strongly convex problems it is shown that, as long as the number of data points per user is above a threshold, the proposed approach achieves order-optimal mean-squared error rates in terms of the sample size.
arXiv Detail & Related papers (2022-09-22T09:04:10Z)
- A Framework and Benchmark for Deep Batch Active Learning for Regression [2.093287944284448]
We study active learning methods that adaptively select batches of unlabeled data for labeling.
We present a framework for constructing such methods out of (network-dependent) base kernels, kernel transformations, and selection methods.
Our proposed method outperforms the state-of-the-art on our benchmark, scales to large data sets, and works out-of-the-box without adjusting the network architecture or training code.
arXiv Detail & Related papers (2022-03-17T16:11:36Z)
- Invariance Learning in Deep Neural Networks with Differentiable Laplace Approximations [76.82124752950148]
We develop a convenient gradient-based method for selecting the data augmentation.
We use a differentiable Kronecker-factored Laplace approximation to the marginal likelihood as our objective.
arXiv Detail & Related papers (2022-02-22T02:51:11Z)
- Variational Auto Encoder Gradient Clustering [0.0]
Clustering using deep neural network models has been extensively studied in recent years.
This article investigates how probability function gradient ascent can be used to process data in order to achieve better clustering.
We propose a simple yet effective method for investigating a suitable number of clusters for the data, based on the DBSCAN clustering algorithm.
arXiv Detail & Related papers (2021-05-11T08:00:36Z)
- Local Critic Training for Model-Parallel Learning of Deep Neural Networks [94.69202357137452]
We propose a novel model-parallel learning method, called local critic training.
We show that the proposed approach successfully decouples the update process of the layer groups for both convolutional neural networks (CNNs) and recurrent neural networks (RNNs).
We also show that trained networks by the proposed method can be used for structural optimization.
arXiv Detail & Related papers (2021-02-03T09:30:45Z)
- Improving k-Means Clustering Performance with Disentangled Internal Representations [0.0]
We propose a simpler approach of optimizing the entanglement of the learned latent code representation of an autoencoder.
Using our proposed approach, the test clustering accuracy was 96.2% on the MNIST dataset, 85.6% on the Fashion-MNIST dataset, and 79.2% on the EMNIST Balanced dataset, outperforming our baseline models.
arXiv Detail & Related papers (2020-06-05T11:32:34Z)
- MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks.
The use of gradient-based training combined with nonconvexity renders learning susceptible to initialization problems.
We propose fusing neighboring layers of deeper networks that are trained with random variables.
arXiv Detail & Related papers (2020-01-28T18:25:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.