Unsupervised Feature Selection via Robust Autoencoder and Adaptive Graph Learning
- URL: http://arxiv.org/abs/2512.18720v1
- Date: Sun, 21 Dec 2025 12:42:37 GMT
- Title: Unsupervised Feature Selection via Robust Autoencoder and Adaptive Graph Learning
- Authors: Feng Yu, MD Saifur Rahman Mazumder, Ying Su, Oscar Contreras Velasco,
- Abstract summary: Unsupervised feature selection (UFS) aims to simultaneously cluster data and identify the most discriminative features. We propose the Robust Autoencoder-based Unsupervised Feature Selection (RAEUFS) model, which leverages a deep autoencoder to learn nonlinear feature representations. Our method outperforms state-of-the-art UFS approaches in both clean and outlier-contaminated data settings.
- Score: 4.371255245299254
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Effective feature selection is essential for high-dimensional data analysis and machine learning. Unsupervised feature selection (UFS) aims to simultaneously cluster data and identify the most discriminative features. Most existing UFS methods linearly project features into a pseudo-label space for clustering, but they suffer from two critical limitations: (1) an oversimplified linear mapping that fails to capture complex feature relationships, and (2) an assumption of uniform cluster distributions, ignoring outliers prevalent in real-world data. To address these issues, we propose the Robust Autoencoder-based Unsupervised Feature Selection (RAEUFS) model, which leverages a deep autoencoder to learn nonlinear feature representations while inherently improving robustness to outliers. We further develop an efficient optimization algorithm for RAEUFS. Extensive experiments demonstrate that our method outperforms state-of-the-art UFS approaches in both clean and outlier-contaminated data settings.
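The abstract notes that most existing UFS methods linearly project features into a pseudo-label space. A common way for such methods to turn a learned projection matrix into a feature ranking is to score each feature by the l2 norm of its row in that matrix, so that rows shrunk toward zero by a row-sparsity penalty (such as the l2,1 norm, the sum of row norms) mark features to drop. The sketch below illustrates only this generic scoring step under stated assumptions. It is not the RAEUFS algorithm, whose objective and optimizer are not described in this summary. The function names and the matrix `W_example` are hypothetical, and the values are made up for illustration.

```python
import math
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def row_l2_norms(W: Matrix) -> List[float]:
    """l2 norm of each row of W (one row per input feature)."""
    return [math.sqrt(sum(w * w for w in row)) for row in W]


def l21_norm(W: Matrix) -> float:
    """l2,1 norm: the sum of the row-wise l2 norms."""
    return sum(row_l2_norms(W))


def select_top_k_features(W: Matrix, k: int) -> List[int]:
    """Indices of the k features with the largest row norms, best first."""
    norms = row_l2_norms(W)
    ranked = sorted(range(len(norms)), key=lambda i: norms[i], reverse=True)
    return ranked[:k]


# Hypothetical projection matrix: 4 features (rows) x 2 pseudo-label dimensions.
W_example = [
    [3.0, 4.0],  # feature 0, row norm 5
    [0.0, 0.0],  # feature 1, row norm 0 (a zeroed-out, unselected feature)
    [1.0, 0.0],  # feature 2, row norm 1
    [0.0, 2.0],  # feature 3, row norm 2
]

selected = select_top_k_features(W_example, k=2)
```

Here `selected` holds the indices of the two highest-scoring features. In a full method the matrix would come from optimizing a regularized objective rather than being written by hand.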
Related papers
- Bi-Level Unsupervised Feature Selection [11.383408944117804]
We propose a novel bi-level unsupervised feature selection (BLUFS) method, including a clustering level and a feature level. At the clustering level, spectral clustering is used to generate pseudo-labels for representing the data structure, while a continuous linear regression model is developed to learn the projection matrix. At the feature level, the $\ell_{2,0}$-norm constraint is imposed on the projection matrix for more effectively selecting features.
arXiv Detail & Related papers (2025-05-26T22:52:31Z) - K-means Derived Unsupervised Feature Selection using Improved ADMM [25.145984747164256]
This paper presents a novel method called K-means Derived Unsupervised Feature Selection (K-means UFS).
Unlike most existing spectral analysis based unsupervised feature selection methods, we select features using the objective of K-means.
Experiments on real datasets show that our K-means UFS is more effective than the baselines in selecting features for clustering.
arXiv Detail & Related papers (2024-11-19T18:05:02Z) - End-to-end Learnable Clustering for Intent Learning in Recommendation [54.157784572994316]
We propose a novel intent learning method termed ELCRec.
It unifies behavior representation learning into an End-to-end Learnable Clustering framework.
We deploy this method on the industrial recommendation system with 130 million page views and achieve promising results.
arXiv Detail & Related papers (2024-01-11T15:22:55Z) - Cluster-level pseudo-labelling for source-free cross-domain facial
expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER).
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z) - Rethinking Data Heterogeneity in Federated Learning: Introducing a New
Notion and Standard Benchmarks [65.34113135080105]
We show that data heterogeneity in current setups is not necessarily a problem and can in fact be beneficial for the FL participants.
Our observations are intuitive.
Our code is available at https://github.com/MMorafah/FL-SC-NIID.
arXiv Detail & Related papers (2022-09-30T17:15:19Z) - A-SFS: Semi-supervised Feature Selection based on Multi-task
Self-supervision [1.3190581566723918]
We introduce a deep learning-based self-supervised mechanism into feature selection problems.
A batch-attention mechanism is designed to generate feature weights according to batch-based feature selection patterns.
Experimental results show that A-SFS achieves the highest accuracy in most datasets.
arXiv Detail & Related papers (2022-07-19T04:22:27Z) - Semi-supervised Domain Adaptive Structure Learning [72.01544419893628]
Semi-supervised domain adaptation (SSDA) is a challenging problem requiring methods to overcome both 1) overfitting towards poorly annotated data and 2) distribution shift across domains.
We introduce an adaptive structure learning method to regularize the cooperation of SSL and DA.
arXiv Detail & Related papers (2021-12-12T06:11:16Z) - Deep Attention-guided Graph Clustering with Dual Self-supervision [49.040136530379094]
We propose a novel method, namely deep attention-guided graph clustering with dual self-supervision (DAGC).
We develop a dual self-supervision solution consisting of a soft self-supervision strategy with a triplet Kullback-Leibler divergence loss and a hard self-supervision strategy with a pseudo supervision loss.
Our method consistently outperforms state-of-the-art methods on six benchmark datasets.
arXiv Detail & Related papers (2021-11-10T06:53:03Z) - Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature
Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z) - Adaptive Graph-based Generalized Regression Model for Unsupervised
Feature Selection [11.214334712819396]
How to select the uncorrelated and discriminative features is the key problem of unsupervised feature selection.
We present a novel generalized regression model imposed by an uncorrelated constraint and the $\ell_{2,1}$-norm regularization.
It can simultaneously select the uncorrelated and discriminative features as well as reduce the variance of these data points belonging to the same neighborhood.
arXiv Detail & Related papers (2020-12-27T09:07:26Z) - Dual Adversarial Auto-Encoders for Clustering [152.84443014554745]
We propose Dual Adversarial Auto-encoder (Dual-AAE) for unsupervised clustering.
By performing variational inference on the objective function of Dual-AAE, we derive a new reconstruction loss which can be optimized by training a pair of Auto-encoders.
Experiments on four benchmarks show that Dual-AAE achieves superior performance over state-of-the-art clustering methods.
arXiv Detail & Related papers (2020-08-23T13:16:34Z) - Robust Self-Supervised Convolutional Neural Network for Subspace
Clustering and Classification [0.10152838128195464]
This paper proposes the robust formulation of the self-supervised convolutional subspace clustering network ($S^2$ConvSCN).
In a truly unsupervised training environment, Robust $S^2$ConvSCN outperforms its baseline version by a significant amount for both seen and unseen data on four well-known datasets.
arXiv Detail & Related papers (2020-04-03T16:07:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.