A-SFS: Semi-supervised Feature Selection based on Multi-task
Self-supervision
- URL: http://arxiv.org/abs/2207.09061v1
- Date: Tue, 19 Jul 2022 04:22:27 GMT
- Authors: Zhifeng Qiu, Wanxin Zeng, Dahua Liao, Ning Gui
- Abstract summary: We introduce a deep learning-based self-supervised mechanism into feature selection problems.
A batch-attention mechanism is designed to generate feature weights according to batch-based feature selection patterns.
Experimental results show that A-SFS achieves the highest accuracy in most datasets.
- Score: 1.3190581566723918
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Feature selection is an important process in machine learning. It builds an
interpretable and robust model by selecting the features that contribute the
most to the prediction target. However, most mature feature selection
algorithms, including supervised and semi-supervised, fail to fully exploit the
complex potential structure between features. We believe that these structures
are very important for the feature selection process, especially when labels
are lacking and data is noisy.
To this end, we innovatively introduce a deep learning-based self-supervised
mechanism into feature selection problems, namely batch-Attention-based
Self-supervision Feature Selection (A-SFS). First, a multi-task
self-supervised autoencoder is designed to uncover the hidden structure among
features with the support of two pretext tasks. Guided by the integrated
information from the multi-self-supervised learning model, a batch-attention
mechanism is designed to generate feature weights according to batch-based
feature selection patterns to alleviate the impacts introduced by a handful of
noisy data. This method is compared to 14 major strong benchmarks, including
LightGBM and XGBoost. Experimental results show that A-SFS achieves the highest
accuracy in most datasets. Furthermore, this design significantly reduces the
reliance on labels, with only 1/10 of the labeled data needed to achieve the same
performance as state-of-the-art baselines. Results show that A-SFS is also
the most robust to noisy and missing data.
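The batch-level weighting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the network or the pretext tasks, so the per-sample scores below are synthetic stand-ins for whatever an attention module would output, and the names batch_attention_weights and top_k_features are hypothetical. The sketch only shows why averaging scores over a batch before normalizing limits the influence of a single noisy sample.

```python
import numpy as np


def batch_attention_weights(scores: np.ndarray) -> np.ndarray:
    """Turn per-sample feature scores (batch x features) into one weight vector.

    Assumption: scores are averaged over the batch, then passed through a
    softmax. The paper's actual mechanism may differ.
    """
    batch_mean = scores.mean(axis=0)          # shape: (n_features,)
    shifted = batch_mean - batch_mean.max()   # subtract max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


def top_k_features(weights: np.ndarray, k: int) -> np.ndarray:
    """Return the indices of the k largest weights, largest first."""
    return np.argsort(weights)[::-1][:k]


rng = np.random.default_rng(0)

# Synthetic scores for 32 samples and 5 features (illustrative data only).
# Features 1 and 3 receive consistently higher scores.
scores = rng.normal(0.0, 0.1, size=(32, 5))
scores[:, [1, 3]] += 2.0

# One noisy sample that strongly prefers the wrong features.
scores[0] = [5.0, -5.0, 5.0, -5.0, 5.0]

weights = batch_attention_weights(scores)
selected = top_k_features(weights, k=2)
print("feature weights:", np.round(weights, 3))
print("selected features:", sorted(int(i) for i in selected))
```

Because the weights come from the batch mean rather than from any single row, the one outlier sample shifts the averages only slightly and features 1 and 3 remain the top two.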
Related papers
- Adapting Segment Anything Model for Unseen Object Instance Segmentation [70.60171342436092]
Unseen Object Instance Segmentation (UOIS) is crucial for autonomous robots operating in unstructured environments.
We propose UOIS-SAM, a data-efficient solution for the UOIS task.
UOIS-SAM integrates two key components: (i) a Heatmap-based Prompt Generator (HPG) to generate class-agnostic point prompts with precise foreground prediction, and (ii) a Hierarchical Discrimination Network (HDNet) that adapts SAM's mask decoder.
arXiv Detail & Related papers (2024-09-23T19:05:50Z)
- LLM-Select: Feature Selection with Large Language Models [64.5099482021597]
Large language models (LLMs) are capable of selecting the most predictive features, with performance rivaling the standard tools of data science.
Our findings suggest that LLMs may be useful not only for selecting the best features for training but also for deciding which features to collect in the first place.
arXiv Detail & Related papers (2024-07-02T22:23:40Z)
- Exploiting Modality-Specific Features For Multi-Modal Manipulation Detection And Grounding [54.49214267905562]
We construct a transformer-based framework for multi-modal manipulation detection and grounding tasks.
Our framework simultaneously explores modality-specific features while preserving the capability for multi-modal alignment.
We propose an implicit manipulation query (IMQ) that adaptively aggregates global contextual cues within each modality.
arXiv Detail & Related papers (2023-09-22T06:55:41Z)
- MvFS: Multi-view Feature Selection for Recommender System [7.0190343591422115]
We propose Multi-view Feature Selection (MvFS), which selects informative features for each instance more effectively.
MvFS employs a multi-view network consisting of multiple sub-networks, each of which learns to measure the feature importance of a part of data.
MvFS adopts an effective importance score modeling strategy which is applied independently to each field.
arXiv Detail & Related papers (2023-09-05T09:06:34Z)
- FedSDG-FS: Efficient and Secure Feature Selection for Vertical Federated Learning [21.79965380400454]
Vertical Federated Learning (VFL) enables multiple data owners, each holding a different subset of features about largely overlapping sets of data samples, to jointly train a useful global model.
Feature selection (FS) is important to VFL. It is still an open research problem, as existing FS works designed for VFL assume prior knowledge of either the number of noisy features or the post-training threshold of useful features.
We propose the Federated Dual-Gate based Feature Selection (FedSDG-FS) approach. It consists of a Gaussian dual-gate to efficiently approximate the probability of a feature being selected, with privacy
arXiv Detail & Related papers (2023-02-21T03:09:45Z)
- Compactness Score: A Fast Filter Method for Unsupervised Feature Selection [66.84571085643928]
We propose a fast unsupervised feature selection method, named Compactness Score (CSUFS), to select desired features.
Our proposed algorithm seems to be more accurate and efficient compared with existing algorithms.
arXiv Detail & Related papers (2022-01-31T13:01:37Z)
- A User-Guided Bayesian Framework for Ensemble Feature Selection in Life Science Applications (UBayFS) [0.0]
We propose UBayFS, an ensemble feature selection technique, embedded in a Bayesian statistical framework.
Our approach enhances the feature selection process by considering two sources of information: data and domain knowledge.
A comparison with standard feature selectors underlines that UBayFS achieves competitive performance, while providing additional flexibility to incorporate domain knowledge.
arXiv Detail & Related papers (2021-04-30T06:51:33Z)
- Feature Selection for Huge Data via Minipatch Learning [0.0]
We propose Stable Minipatch Selection (STAMPS) and Adaptive STAMPS.
These are meta-algorithms that build ensembles of selection events of base feature selectors trained on tiny, (adaptively-chosen) random subsets of both the observations and features of the data.
Our approaches are general and can be employed with a variety of existing feature selection strategies and machine learning techniques.
arXiv Detail & Related papers (2020-10-16T17:41:08Z)
- A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference.
Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
arXiv Detail & Related papers (2020-06-22T08:35:58Z)
- Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)
- Feature Selection Library (MATLAB Toolbox) [1.2058143465239939]
The Feature Selection Library (FSLib) introduces a comprehensive suite of feature selection (FS) algorithms.
FSLib addresses the curse of dimensionality, reduces computational load, and enhances model generalizability.
FSLib contributes to data interpretability by revealing important features, aiding in pattern recognition and understanding.
arXiv Detail & Related papers (2016-07-05T16:50:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.