A Contrast Based Feature Selection Algorithm for High-dimensional Data set in Machine Learning
- URL: http://arxiv.org/abs/2401.07482v1
- Date: Mon, 15 Jan 2024 05:32:35 GMT
- Title: A Contrast Based Feature Selection Algorithm for High-dimensional Data set in Machine Learning
- Authors: Chunxu Cao, Qiang Zhang
- Abstract summary: We propose a novel filter feature selection method, ContrastFS, which selects discriminative features based on the discrepancies that features exhibit between different classes.
We validate the effectiveness and efficiency of our approach on several widely studied benchmark datasets; the results show that the new method performs favorably with negligible computation.
- Score: 9.596923373834093
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Feature selection is an important process in machine learning and knowledge
discovery. By selecting the most informative features and eliminating
irrelevant ones, the performance of learning algorithms can be improved and the
extraction of meaningful patterns and insights from data can be facilitated.
However, most existing feature selection methods, when applied to large
datasets, run into the bottleneck of high computational cost. To address this
problem, we propose a novel filter feature selection method, ContrastFS, which
selects discriminative features based on the discrepancies that features
exhibit between different classes. We introduce a dimensionless quantity as a
surrogate representation that summarizes the distributional individuality of
certain classes; based on this quantity, we evaluate features and study the
correlation among them. We validate the effectiveness and efficiency of our
approach on several widely studied benchmark datasets; the results show that
the new method performs favorably, with negligible computation, in comparison
with other state-of-the-art feature selection methods.
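The abstract does not define the dimensionless quantity, so the following is only a rough sketch of the general idea of a contrast-style filter, not the ContrastFS algorithm itself. It assumes, purely for illustration, that each feature is scored by the largest gap between per-class means after scaling by the feature's overall standard deviation, which makes the score unit-free. The function names `contrast_scores` and `select_top_k` are hypothetical.

```python
import numpy as np


def contrast_scores(X, y, eps=1e-12):
    """Hypothetical per-feature contrast score (NOT the paper's definition).

    Assumption: score = (max class mean - min class mean) / overall std,
    a unit-free measure of how far apart the classes sit on each feature.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)

    # Overall spread per feature; (near-)constant features get scale 1
    # so that they receive a score close to zero instead of dividing by ~0.
    std = X.std(axis=0)
    std = np.where(std > eps, std, 1.0)

    # One row of standardized means per class: shape (n_classes, n_features).
    class_means = np.stack([X[y == c].mean(axis=0) for c in classes])
    z = class_means / std

    # Largest pairwise gap between classes, per feature.
    return z.max(axis=0) - z.min(axis=0)


def select_top_k(X, y, k):
    """Return the indices of the k highest-scoring features, best first."""
    scores = contrast_scores(X, y)
    return np.argsort(scores)[::-1][:k]


if __name__ == "__main__":
    # Synthetic data: only feature 2 differs between the two classes.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = np.repeat([0, 1], 100)
    X[y == 1, 2] += 3.0
    print(select_top_k(X, y, k=1))
```

A score of this kind needs only one pass over the data to compute class means and standard deviations, which is consistent with the low cost the abstract attributes to filter methods. It ignores correlation among features, which the abstract says the paper also studies.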
Related papers
- Feature Selection as Deep Sequential Generative Learning [50.00973409680637]
We develop a deep variational transformer model over a joint of sequential reconstruction, variational, and performance evaluator losses.
Our model can distill feature selection knowledge and learn a continuous embedding space to map feature selection decision sequences into embedding vectors associated with utility scores.
arXiv Detail & Related papers (2024-03-06T16:31:56Z)
- Causal Feature Selection via Transfer Entropy [59.999594949050596]
Causal discovery aims to identify causal relationships between features with observational data.
We introduce a new causal feature selection approach that relies on the forward and backward feature selection procedures.
We provide theoretical guarantees on the regression and classification errors for both the exact and the finite-sample cases.
arXiv Detail & Related papers (2023-10-17T08:04:45Z)
- Convolutional autoencoder-based multimodal one-class classification [80.52334952912808]
One-class classification refers to approaches of learning using data from a single class only.
We propose a deep learning one-class classification method suitable for multimodal data.
arXiv Detail & Related papers (2023-09-25T12:31:18Z)
- Feature Selection: A perspective on inter-attribute cooperation [0.0]
High-dimensional datasets pose a challenge for learning tasks in data mining and machine learning.
Feature selection is an effective technique in dealing with dimensionality reduction.
This paper presents a survey of the state-of-the-art work on filter feature selection methods assisted by feature intercooperation.
arXiv Detail & Related papers (2023-06-28T21:00:52Z)
- Deep Feature Selection Using a Novel Complementary Feature Mask [5.904240881373805]
We deal with feature selection by exploiting the features with lower importance scores.
We propose a feature selection framework based on a novel complementary feature mask.
Our method is generic and can be easily integrated into existing deep-learning-based feature selection approaches.
arXiv Detail & Related papers (2022-09-25T18:03:30Z)
- Exploiting Diversity of Unlabeled Data for Label-Efficient Semi-Supervised Active Learning [57.436224561482966]
Active learning is a research area that addresses the issues of expensive labeling by selecting the most important samples for labeling.
We introduce a new diversity-based initial dataset selection algorithm to select the most informative set of samples for initial labeling in the active learning setting.
Also, we propose a novel active learning query strategy, which uses diversity-based sampling on consistency-based embeddings.
arXiv Detail & Related papers (2022-07-25T16:11:55Z)
- Compactness Score: A Fast Filter Method for Unsupervised Feature Selection [66.84571085643928]
We propose a fast unsupervised feature selection method, named Compactness Score (CSUFS), to select desired features.
Our proposed algorithm seems to be more accurate and efficient compared with existing algorithms.
arXiv Detail & Related papers (2022-01-31T13:01:37Z)
- Auto-weighted Multi-view Feature Selection with Graph Optimization [90.26124046530319]
We propose a novel unsupervised multi-view feature selection model based on graph learning.
The contributions are threefold: (1) during the feature selection procedure, the consensus similarity graph shared by different views is learned.
Experiments on various datasets demonstrate the superiority of the proposed method compared with the state-of-the-art methods.
arXiv Detail & Related papers (2021-04-11T03:25:25Z)
- Feature Selection Using Reinforcement Learning [0.0]
The space of variables or features that can be used to characterize a particular predictor of interest continues to grow exponentially.
Identifying the most characterizing features that minimize the variance without jeopardizing the bias of our models is critical to successfully training a machine learning model.
arXiv Detail & Related papers (2021-01-23T09:24:37Z)
- Joint Adaptive Graph and Structured Sparsity Regularization for Unsupervised Feature Selection [6.41804410246642]
We propose a joint adaptive graph and structured sparsity regularization unsupervised feature selection (JASFS) method.
A subset of optimal features will be selected in group, and the number of selected features will be determined automatically.
Experimental results on eight benchmarks demonstrate the effectiveness and efficiency of the proposed method.
arXiv Detail & Related papers (2020-10-09T08:17:04Z)