Learnable Similarity and Dissimilarity Guided Symmetric Non-Negative Matrix Factorization
- URL: http://arxiv.org/abs/2412.04082v1
- Date: Thu, 05 Dec 2024 11:32:53 GMT
- Title: Learnable Similarity and Dissimilarity Guided Symmetric Non-Negative Matrix Factorization
- Authors: Wenlong Lyu, Yuheng Jia
- Abstract summary: We construct a weighted $k$-NN graph with learnable weight that reflects the reliability of each $k$-th NN.
To obtain a discriminative similarity matrix, we introduce a dissimilarity matrix with a dual structure of the similarity matrix.
An efficient alternative optimization algorithm is designed to solve the proposed model.
- Score: 18.53944578996308
- Abstract: Symmetric nonnegative matrix factorization (SymNMF) is a powerful tool for clustering, which typically uses the $k$-nearest neighbor ($k$-NN) method to construct the similarity matrix. However, $k$-NN may mislead clustering since the neighbors may belong to different clusters, and its reliability generally decreases as $k$ grows. In this paper, we construct the similarity matrix as a weighted $k$-NN graph with learnable weights that reflect the reliability of each $k$-th NN. This approach reduces the search space of similarity matrix learning to $n - 1$ dimensions, as opposed to the $\mathcal{O}(n^2)$ dimensions of existing methods, where $n$ represents the number of samples. Moreover, to obtain a discriminative similarity matrix, we introduce a dissimilarity matrix with a dual structure of the similarity matrix, and propose a new form of orthogonality regularization with discussions on its geometric interpretation and numerical stability. An efficient alternative optimization algorithm is designed to solve the proposed model, with a theoretical guarantee that the variables converge to a stationary point that satisfies the KKT conditions. The advantage of the proposed model is demonstrated by comparison with nine state-of-the-art clustering methods on eight datasets. The code is available at https://github.com/lwl-learning/LSDGSymNMF.
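The weighted $k$-NN graph described in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see their repository for that); it only shows one plausible reading, in which the similarity matrix is a weighted sum $S = \sum_{k=1}^{n-1} w_k A_k$ of the $n - 1$ "$k$-th nearest neighbor" indicator graphs, so that only the $n - 1$ weights $w_k$ need to be learned. The function names, the Euclidean distance, the symmetrization step, and the fixed example weights are all assumptions made for illustration.

```python
import numpy as np


def kth_neighbor_graphs(X):
    """Return A with shape (n-1, n, n), where A[k-1, i, j] = 1 iff
    sample j is the k-th nearest neighbor of sample i (Euclidean distance).
    Hypothetical helper, not taken from the paper's code."""
    n = X.shape[0]
    sq = np.sum(X**2, axis=1)
    # Pairwise squared Euclidean distances.
    D = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    # Force every sample to rank itself first, so column 0 can be skipped.
    np.fill_diagonal(D, -np.inf)
    order = np.argsort(D, axis=1, kind="stable")
    A = np.zeros((n - 1, n, n))
    rows = np.arange(n)
    for k in range(1, n):
        A[k - 1, rows, order[:, k]] = 1.0
    return A


def weighted_knn_similarity(A, w):
    """Combine the k-th NN graphs with weights w (length n-1).
    The symmetrization is an assumption of this sketch."""
    S = np.tensordot(w, A, axes=1)  # sum_k w[k] * A[k], shape (n, n)
    return 0.5 * (S + S.T)


# Usage with hypothetical, hand-picked weights that decay with k.
# In the paper these weights are learned; here they are fixed.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))
A = kth_neighbor_graphs(X)
w = np.array([0.7, 0.3, 0.0, 0.0, 0.0])
S = weighted_knn_similarity(A, w)
```

Setting $w_k = 1$ for $k \le K$ and $w_k = 0$ otherwise recovers an ordinary unweighted $K$-NN graph, which is why the weights can be read as a per-rank reliability.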
Related papers
- A Fresh Look at Generalized Category Discovery through Non-negative Matrix Factorization [83.12938977698988]
Generalized Category Discovery (GCD) aims to classify both base and novel images using labeled base data.
Current approaches inadequately address the intrinsic optimization of the co-occurrence matrix $\bar{A}$ based on cosine similarity.
We propose a Non-Negative Generalized Category Discovery (NN-GCD) framework to address these deficiencies.
arXiv Detail & Related papers (2024-10-29T07:24:11Z) - Semi-supervised Symmetric Non-negative Matrix Factorization with Low-Rank Tensor Representation [27.14442336413482]
Semi-supervised symmetric non-negative matrix factorization (SNMF)
We propose a novel SNMF model by seeking low-rank representation for the tensor synthesized by the pairwise constraint matrix.
We then propose an enhanced SNMF model, making the embedding matrix tailored to the above tensor low-rank representation.
arXiv Detail & Related papers (2024-05-04T14:58:47Z) - Replicable Clustering [57.19013971737493]
We propose algorithms for the statistical $k$-medians, statistical $k$-means, and statistical $k$-centers problems by utilizing approximation routines for their counterparts in a black-box manner.
We also provide experiments on synthetic distributions in 2D using the $k$-means++ implementation from sklearn as a black-box that validate our theoretical results.
arXiv Detail & Related papers (2023-02-20T23:29:43Z) - Semi-Supervised Subspace Clustering via Tensor Low-Rank Representation [64.49871502193477]
We propose a novel semi-supervised subspace clustering method, which is able to simultaneously augment the initial supervisory information and construct a discriminative affinity matrix.
Comprehensive experimental results on six commonly-used benchmark datasets demonstrate the superiority of our method over state-of-the-art methods.
arXiv Detail & Related papers (2022-05-21T01:47:17Z) - Optimal Variable Clustering for High-Dimensional Matrix Valued Data [3.1138411427556445]
We propose a new latent variable model for the features arranged in matrix form.
Under mild conditions, our algorithm attains clustering consistency in the high-dimensional setting.
We identify the optimal weight in the sense that using this weight guarantees our algorithm to be minimax rate-optimal.
arXiv Detail & Related papers (2021-12-24T02:13:04Z) - Sublinear Time Approximation of Text Similarity Matrices [50.73398637380375]
We introduce a generalization of the popular Nyström method to the indefinite setting.
Our algorithm can be applied to any similarity matrix and runs in sublinear time in the size of the matrix.
We show that our method, along with a simple variant of CUR decomposition, performs very well in approximating a variety of similarity matrices.
arXiv Detail & Related papers (2021-12-17T17:04:34Z) - Optimal N-ary ECOC Matrices for Ensemble Classification [1.3561997774592662]
A new construction of $N$-ary error-correcting output code (ECOC) matrices for ensemble classification methods is presented.
Given any prime integer $N$, this deterministic construction generates base-$N$ symmetric square matrices $M$ of prime-power dimension having optimal minimum Hamming distance between any two of its rows and columns.
arXiv Detail & Related papers (2021-10-05T16:50:15Z) - Self-supervised Symmetric Nonnegative Matrix Factorization [82.59905231819685]
Symmetric nonnegative matrix factorization (SNMF) has been demonstrated to be a powerful method for data clustering.
Inspired by ensemble clustering that aims to seek better clustering results, we propose self-supervised SNMF (S$^3$NMF).
We take advantage of the sensitivity to initialization that is characteristic of SNMF, without relying on any additional information.
arXiv Detail & Related papers (2021-03-02T12:47:40Z) - Stochastic Flows and Geometric Optimization on the Orthogonal Group [52.50121190744979]
We present a new class of geometrically-driven optimization algorithms on the orthogonal group $O(d)$.
We show that our methods can be applied in various fields of machine learning including deep, convolutional and recurrent neural networks, reinforcement learning, flows and metric learning.
arXiv Detail & Related papers (2020-03-30T15:37:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.