Few-shot Learning for Unsupervised Feature Selection
- URL: http://arxiv.org/abs/2107.00816v1
- Date: Fri, 2 Jul 2021 03:52:51 GMT
- Title: Few-shot Learning for Unsupervised Feature Selection
- Authors: Atsutoshi Kumagai and Tomoharu Iwata and Yasuhiro Fujiwara
- Abstract summary: We propose a few-shot learning method for unsupervised feature selection.
The proposed method can select a subset of relevant features in a target task given a few unlabeled target instances.
We experimentally demonstrate that the proposed method outperforms existing feature selection methods.
- Score: 59.75321498170363
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a few-shot learning method for unsupervised feature selection,
which is a task to select a subset of relevant features in unlabeled data.
Existing methods usually require many instances for feature selection. However,
sufficient instances are often unavailable in practice. The proposed method can
select a subset of relevant features in a target task given a few unlabeled
target instances by training with unlabeled instances in multiple source tasks.
Our model consists of a feature selector and a decoder. The feature selector
takes a few unlabeled instances as input and outputs a subset of relevant
features such that the decoder can reconstruct the original features of unseen
instances from the selected ones. The feature selector uses Concrete random
variables to select features via gradient descent. To encode task-specific
properties from a few unlabeled instances into the model, the Concrete random
variables and the decoder are modeled with permutation-invariant neural
networks that take a few unlabeled instances as input. Our model is trained by
minimizing the expected test reconstruction error given a few unlabeled
instances, which is estimated using datasets in the source tasks. We
experimentally demonstrate that the proposed method outperforms existing
feature selection methods.
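To make the mechanics concrete, below is a minimal sketch of the core idea: a permutation-invariant (DeepSets-style) encoder summarizes a few unlabeled support instances, Concrete (Gumbel-softmax) variables softly select k features, and a decoder conditioned on the same set representation is trained to reconstruct held-out query instances from the selected features. This assumes PyTorch; all class names, sizes, and architecture details are illustrative, not the authors' implementation.

```python
# Minimal sketch, assuming PyTorch. Names and sizes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FewShotFeatureSelector(nn.Module):
    def __init__(self, d_in, k, d_hidden=64, temperature=0.5):
        super().__init__()
        self.k, self.temperature = k, temperature
        # Permutation-invariant encoder (DeepSets style): embed each
        # support instance, then average over the set dimension.
        self.phi = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
        # Maps the set representation to k rows of selection logits.
        self.rho = nn.Linear(d_hidden, k * d_in)
        # Decoder reconstructs all d_in features from the k selected
        # ones, conditioned on the same set representation.
        self.decoder = nn.Sequential(
            nn.Linear(k + d_hidden, d_hidden), nn.ReLU(),
            nn.Linear(d_hidden, d_in))

    def forward(self, support, query):
        # support: (n_support, d_in) few unlabeled instances of the task
        # query:   (n_query, d_in) unseen instances to reconstruct
        set_repr = self.phi(support).mean(dim=0)          # (d_hidden,)
        logits = self.rho(set_repr).view(self.k, -1)      # (k, d_in)
        # Concrete (Gumbel-softmax) relaxation: each row is a soft,
        # near-one-hot choice of one feature, differentiable in logits.
        gumbel = -torch.log(-torch.log(torch.rand_like(logits) + 1e-9) + 1e-9)
        weights = F.softmax((logits + gumbel) / self.temperature, dim=-1)
        selected = query @ weights.t()                    # (n_query, k)
        cond = set_repr.expand(query.size(0), -1)
        return self.decoder(torch.cat([selected, cond], dim=-1))

# Meta-training over source tasks: minimize the expected test
# reconstruction error given a few unlabeled support instances.
model = FewShotFeatureSelector(d_in=20, k=5)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(1000):
    task = torch.randn(32, 20)           # stand-in for a sampled source task
    support, query = task[:5], task[5:]  # few-shot support / held-out query
    loss = F.mse_loss(model(support, query), query)
    opt.zero_grad(); loss.backward(); opt.step()
```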
Related papers
- Multi-Label Adaptive Batch Selection by Highlighting Hard and Imbalanced Samples [9.360376286221943]
We introduce an adaptive batch selection algorithm tailored to multi-label deep learning models.
Our method converges faster and performs better than random batch selection.
arXiv Detail & Related papers (2024-03-27T02:00:18Z)
- A Performance-Driven Benchmark for Feature Selection in Tabular Deep Learning [131.2910403490434]
Data scientists typically collect as many features as possible into their datasets, and even engineer new features from existing ones.
Existing benchmarks for tabular feature selection consider classical downstream models, toy synthetic datasets, or do not evaluate feature selectors on the basis of downstream performance.
We construct a challenging feature selection benchmark evaluated on downstream neural networks including transformers.
We also propose an input-gradient-based analogue of Lasso for neural networks that outperforms classical feature selection methods on challenging problems (a hedged sketch of the idea follows this entry).
arXiv Detail & Related papers (2023-11-10T05:26:10Z)
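As a rough illustration of what an input-gradient analogue of Lasso could look like, the sketch below trains a small network with a group-sparsity penalty on its input gradients, then ranks features by average gradient magnitude. This is one plausible reading of the idea in PyTorch, not the benchmark's reference implementation; the penalty weight, architecture, and synthetic data are assumptions.

```python
# Hedged sketch of an input-gradient analogue of Lasso, in PyTorch.
import torch
import torch.nn as nn
import torch.nn.functional as F

net = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
lam = 1e-2  # assumed strength of the gradient-sparsity penalty

for step in range(500):
    x = torch.randn(64, 10, requires_grad=True)     # stand-in batch
    y = x[:, :3].sum(dim=1, keepdim=True).detach()  # synthetic target
    pred = net(x)
    fit = F.mse_loss(pred, y)
    # Gradient of the predictions with respect to each input feature.
    grads, = torch.autograd.grad(pred.sum(), x, create_graph=True)
    # Lasso-like group penalty: push whole feature columns of the
    # input gradient toward zero, so irrelevant features get no gradient.
    penalty = grads.pow(2).mean(dim=0).sqrt().sum()
    loss = fit + lam * penalty
    opt.zero_grad(); loss.backward(); opt.step()

# Rank features by average input-gradient magnitude after training.
x = torch.randn(1024, 10, requires_grad=True)
grads, = torch.autograd.grad(net(x).sum(), x)
print(grads.abs().mean(dim=0).argsort(descending=True))
```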
- An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
- Learning to Imagine: Diversify Memory for Incremental Learning using Unlabeled Data [69.30452751012568]
We develop a learnable feature generator to diversify exemplars by adaptively generating diverse counterparts of exemplars.
We introduce semantic contrastive learning to enforce that the generated samples are semantically consistent with the exemplars.
Our method does not bring any extra inference cost and outperforms state-of-the-art methods on two benchmarks.
arXiv Detail & Related papers (2022-04-19T15:15:18Z)
- Dash: Semi-Supervised Learning with Dynamic Thresholding [72.74339790209531]
We propose a semi-supervised learning (SSL) approach that selectively uses unlabeled examples to train models.
Our proposed approach, Dash, adaptively selects which unlabeled examples to use as training proceeds (a schematic of the selection rule follows this entry).
arXiv Detail & Related papers (2021-09-01T23:52:29Z)
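A schematic of a dynamic-threshold selection rule, assuming PyTorch: unlabeled examples are kept only when their pseudo-label loss falls below a threshold that shrinks over training. The function name, initial threshold, and exponential decay schedule here are assumptions for illustration; the paper derives its own schedule and guarantees.

```python
# Schematic of Dash-style dynamic thresholding, assuming PyTorch.
import torch
import torch.nn.functional as F

def select_unlabeled(model, x_unlabeled, threshold):
    """Keep unlabeled examples whose pseudo-label loss falls below the
    current threshold."""
    logits = model(x_unlabeled)
    pseudo = logits.argmax(dim=1)
    losses = F.cross_entropy(logits, pseudo, reduction="none")
    return x_unlabeled[losses < threshold]

model = torch.nn.Linear(10, 3)       # stand-in classifier
threshold, decay = 2.0, 0.95         # assumed initial value and decay
for epoch in range(20):
    x_u = torch.randn(128, 10)       # stand-in unlabeled batch
    kept = select_unlabeled(model, x_u, threshold)
    # ... train on labeled data plus `kept` with their pseudo-labels ...
    threshold *= decay               # threshold shrinks over training
```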
- Dynamic Instance-Wise Classification in Correlated Feature Spaces [15.351282873821935]
In a typical machine learning setting, the predictions on all test instances are based on a common subset of features discovered during model training.
A new method is proposed that, for each test instance individually, sequentially selects the best feature to evaluate next and stops the selection process to make a prediction once it determines that no further improvement in classification accuracy can be achieved (a simplified sketch follows this entry).
The effectiveness, generalizability, and scalability of the proposed method are illustrated on a variety of real-world datasets from diverse application domains.
arXiv Detail & Related papers (2021-06-08T20:20:36Z)
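The simplified sketch below, using scikit-learn, conveys the greedy flavor of instance-wise sequential selection: unobserved features are imputed with training means, the feature that most raises the classifier's confidence is revealed next, and selection stops when no candidate helps. Mean imputation and the confidence-based stopping rule are my simplifications, not the paper's exact criterion.

```python
# Simplified sketch of greedy instance-wise sequential selection.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000).fit(X, y)
means = X.mean(axis=0)  # impute unobserved features with training means

def classify_sequentially(x):
    observed, x_partial = set(), means.copy()
    best_conf = clf.predict_proba(x_partial[None])[0].max()
    while len(observed) < len(x):
        # Try revealing each remaining feature; keep the best candidate.
        gains = []
        for j in set(range(len(x))) - observed:
            trial = x_partial.copy()
            trial[j] = x[j]
            gains.append((clf.predict_proba(trial[None])[0].max(), j))
        conf, j = max(gains)
        if conf <= best_conf:  # stop: no further improvement expected
            break
        observed.add(j)
        x_partial[j] = x[j]
        best_conf = conf
    return clf.predict(x_partial[None])[0], sorted(observed)

print(classify_sequentially(X[0]))  # (predicted class, features used)
```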
- Feature Selection Using Reinforcement Learning [0.0]
The space of variables or features that can be used to characterize a particular predictor of interest continues to grow exponentially.
Identifying the most characterizing features, those that minimize variance without jeopardizing the bias of our models, is critical to successfully training a machine learning model.
arXiv Detail & Related papers (2021-01-23T09:24:37Z)
- Minimax Active Learning [61.729667575374606]
Active learning aims to develop label-efficient algorithms by querying the most representative samples to be labeled by a human annotator.
Current active learning techniques either rely on model uncertainty to select the most uncertain samples or use clustering or reconstruction to choose the most diverse set of unlabeled examples.
We develop a semi-supervised, minimax entropy-based active learning algorithm that leverages both uncertainty and diversity in an adversarial manner (a simplified illustration of these two ingredients follows this entry).
arXiv Detail & Related papers (2020-12-18T19:03:40Z)
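To illustrate the two ingredients, uncertainty and diversity, the example below scores unlabeled points by predictive entropy and picks the most uncertain point from each of several clusters. The paper's actual method combines these objectives adversarially via minimax entropy training; this greedy version, with all names and data invented, only conveys the intuition.

```python
# Simplified illustration of combining uncertainty (predictive entropy)
# with diversity (cluster coverage); not the paper's adversarial method.
import numpy as np
from sklearn.cluster import KMeans

def select_queries(probs, feats, budget):
    # probs: (n, c) predicted class probabilities for the unlabeled pool
    # feats: (n, d) feature embeddings of the same pool
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)  # uncertainty
    clusters = KMeans(n_clusters=budget, n_init=10).fit_predict(feats)
    picks = []
    for c in range(budget):  # most uncertain point per cluster
        idx = np.where(clusters == c)[0]
        picks.append(int(idx[entropy[idx].argmax()]))
    return picks

rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(3), size=200)   # stand-in predictions
feats = rng.normal(size=(200, 8))             # stand-in embeddings
print(select_queries(probs, feats, budget=5))
```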
- Feature Selection from High-Dimensional Data with Very Low Sample Size: A Cautionary Tale [1.491109220586182]
In classification problems, the purpose of feature selection is to identify a small subset of the original feature set.
This study is a cautionary tale demonstrating why feature selection in such cases may lead to undesirable results.
arXiv Detail & Related papers (2020-08-27T10:00:58Z)
- Probabilistic Value Selection for Space Efficient Model [10.109875612945658]
Two probabilistic methods based on information-theoretic metrics are proposed: PVS and P + VS.
Experimental results show that value selection can balance accuracy against model-size reduction.
arXiv Detail & Related papers (2020-07-09T08:45:13Z)