Bi-Sparse Unsupervised Feature Selection
- URL: http://arxiv.org/abs/2412.16819v1
- Date: Sun, 22 Dec 2024 01:43:58 GMT
- Title: Bi-Sparse Unsupervised Feature Selection
- Authors: Xianchao Xiu, Chenyi Huang, Pan Shang, Wanquan Liu
- Abstract summary: Unsupervised feature selection (UFS) has become an increasingly popular technique for image processing.
In this paper, we introduce a novel bi-sparse UFS method, called BSUFS, to simultaneously characterize both global and local structures.
Extensive numerical experiments on synthetic and real-world datasets demonstrate the effectiveness of our proposed BSUFS.
- Score: 9.541908550559361
- Abstract: To efficiently deal with high-dimensional datasets in many areas, unsupervised feature selection (UFS) has become a rising technique for dimension reduction. Even though there are many UFS methods, most of them only consider the global structure of datasets by embedding a single sparse regularization or constraint. In this paper, we introduce a novel bi-sparse UFS method, called BSUFS, to simultaneously characterize both global and local structures. The core idea of BSUFS is to incorporate $\ell_{2,p}$-norm and $\ell_q$-norm into the classical principal component analysis (PCA), which enables our proposed method to select relevant features and filter out irrelevant noise accurately. Here, the parameters $p$ and $q$ are within the range of [0,1). Therefore, BSUFS not only constructs a unified framework for bi-sparse optimization, but also includes some existing works as special cases. To solve the resulting non-convex model, we propose an efficient proximal alternating minimization (PAM) algorithm using Riemannian manifold optimization and sparse optimization techniques. Theoretically, PAM is proven to have global convergence, i.e., for any random initial point, the generated sequence converges to a critical point that satisfies the first-order optimality condition. Extensive numerical experiments on synthetic and real-world datasets demonstrate the effectiveness of our proposed BSUFS. Specifically, the average accuracy (ACC) is improved by at least 4.71% and the normalized mutual information (NMI) is improved by at least 3.14% on average compared to the existing UFS competitors. The results validate the advantages of bi-sparse optimization in feature selection and show its potential for other fields in image processing. Our code will be available at https://github.com/xianchaoxiu.
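The abstract describes scoring features via the row norms of a sparsity-regularized PCA loading matrix. The following is a minimal sketch of that idea only, not the BSUFS algorithm itself: it uses plain PCA (no $\ell_{2,p}$/$\ell_q$ penalties or PAM iterations) and ranks features by the $\ell_2$ norms of the loading rows; the function names and the stand-alone $\ell_{2,p}$ regularizer value are illustrative assumptions.

```python
import numpy as np

def l2p_row_norms(W, p=0.5):
    # l_{2,p} regularizer value: sum over rows of the l2 row norm raised to p.
    # Small row norms are driven toward zero, which deselects those features.
    row_norms = np.linalg.norm(W, axis=1)
    return np.sum(row_norms ** p)

def select_features(X, k, n_components=2):
    # Plain PCA as a stand-in for the bi-sparse model: rows of the
    # d x n_components loading matrix with the largest l2 norms are
    # kept as the selected features.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:n_components].T
    scores = np.linalg.norm(W, axis=1)
    return np.argsort(scores)[::-1][:k]
```

In the actual method, the penalties with $p, q \in [0,1)$ would shrink entire loading rows to zero during optimization, so selection emerges from the solution rather than from post-hoc ranking as above.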
Related papers
- Enhancing Unsupervised Feature Selection via Double Sparsity Constrained Optimization [6.342485512772862]
Unsupervised feature selection (UFS) is widely applied in machine learning and pattern recognition.
Most of the existing methods only consider sparsity, which makes it difficult to select a feature subset that is both sparse and discriminative from the original features.
In this paper, we propose a new method called DSCOFS to select such a subset from the original features.
arXiv Detail & Related papers (2025-01-01T05:05:46Z)
- Optimizing Posterior Samples for Bayesian Optimization via Rootfinding [2.94944680995069]
We introduce an efficient global optimization strategy for posterior samples based on global rootfinding.
Remarkably, even with just one point from each set, the global optimum is discovered most of the time.
Our approach also improves the performance of other posterior sample-based acquisition functions, such as variants of entropy search.
arXiv Detail & Related papers (2024-10-29T17:57:16Z)
- DFedADMM: Dual Constraints Controlled Model Inconsistency for Decentralized Federated Learning [52.83811558753284]
Decentralized learning (DFL) discards the central server and establishes a decentralized communication network.
Existing DFL methods still suffer from two major challenges: local inconsistency and local overfitting.
arXiv Detail & Related papers (2023-08-16T11:22:36Z)
- Learning Regions of Interest for Bayesian Optimization with Adaptive Level-Set Estimation [84.0621253654014]
We propose a framework, called BALLET, which adaptively filters for a high-confidence region of interest.
We show theoretically that BALLET can efficiently shrink the search space, and can exhibit a tighter regret bound than standard BO.
arXiv Detail & Related papers (2023-07-25T09:45:47Z)
- Large-scale Optimization of Partial AUC in a Range of False Positive Rates [51.12047280149546]
The area under the ROC curve (AUC) is one of the most widely used performance measures for classification models in machine learning.
We develop an efficient approximated gradient descent method based on recent practical envelope smoothing technique.
Our proposed algorithm can also be used to minimize the sum of some ranked range loss, which also lacks efficient solvers.
arXiv Detail & Related papers (2022-03-03T03:46:18Z)
- Bayesian Algorithm Execution: Estimating Computable Properties of Black-box Functions Using Mutual Information [78.78486761923855]
In many real world problems, we want to infer some property of an expensive black-box function f, given a budget of T function evaluations.
We present a procedure, InfoBAX, that sequentially chooses queries that maximize mutual information with respect to the algorithm's output.
On these problems, InfoBAX uses up to 500 times fewer queries to f than required by the original algorithm.
arXiv Detail & Related papers (2021-04-19T17:22:11Z)
- Stochastic Optimization of Areas Under Precision-Recall Curves with Provable Convergence [66.83161885378192]
Area under ROC (AUROC) and precision-recall curves (AUPRC) are common metrics for evaluating classification performance for imbalanced problems.
We propose a principled method to optimize AUPRC for deep learning.
arXiv Detail & Related papers (2021-04-18T06:22:21Z)
- FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity to Non-IID Data [59.50904660420082]
Federated Learning (FL) has become a popular paradigm for learning from distributed data.
To effectively utilize data at different devices without moving them to the cloud, algorithms such as the Federated Averaging (FedAvg) have adopted a "computation then aggregation" (CTA) model.
arXiv Detail & Related papers (2020-05-22T23:07:42Z)
- Practical Bayesian Optimization of Objectives with Conditioning Variables [1.0497128347190048]
We consider the more general case where a user is faced with multiple problems that each need to be optimized conditional on a state variable.
Similarity across objectives boosts optimization of each objective in two ways.
We propose a framework for conditional optimization: ConBO.
arXiv Detail & Related papers (2020-02-23T22:06:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.