Fast Classification with Sequential Feature Selection in Test Phase
- URL: http://arxiv.org/abs/2306.14347v1
- Date: Sun, 25 Jun 2023 21:31:46 GMT
- Title: Fast Classification with Sequential Feature Selection in Test Phase
- Authors: Ali Mirzaei, Vahid Pourahmadi, Hamid Sheikhzadeh, Alireza
Abdollahpourrostam
- Abstract summary: This paper introduces a novel approach to active feature acquisition for classification.
It is the task of sequentially selecting the most informative subset of features to achieve optimal prediction performance.
The proposed approach involves a new lazy model that is significantly faster and more efficient compared to existing methods.
- Score: 1.1470070927586016
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces a novel approach to active feature acquisition for
classification, which is the task of sequentially selecting the most
informative subset of features to achieve optimal prediction performance during
testing while minimizing cost. The proposed approach involves a new lazy model
that is significantly faster and more efficient compared to existing methods,
while still producing comparable accuracy results. During the test phase, the
proposed approach utilizes Fisher scores for feature ranking to identify the
most important feature at each step. The training dataset is then filtered
based on the observed value of the selected feature, and this process is
repeated until an acceptable accuracy is reached or the feature-acquisition
budget is exhausted. The performance of the proposed approach was evaluated
on synthetic and real datasets, including our new synthetic CUBE dataset and
the real-world Forest dataset. The experimental results demonstrate that our
approach achieves competitive accuracy compared to existing methods, while
significantly outperforming them in terms of speed. The source code of the
algorithm is released on GitHub:
https://github.com/alimirzaei/FCwSFS.
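The test-phase loop described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's exact implementation: the neighbourhood filtering rule (keeping the 30% of training samples closest to the observed value on the acquired feature) and the purity-based stopping criterion are illustrative assumptions.

```python
import numpy as np

def fisher_scores(X, y):
    """Fisher score per feature: between-class variance over within-class variance."""
    classes = np.unique(y)
    overall_mean = X.mean(axis=0)
    num = np.zeros(X.shape[1])
    den = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        n_c = len(Xc)
        num += n_c * (Xc.mean(axis=0) - overall_mean) ** 2
        den += n_c * Xc.var(axis=0)
    return num / (den + 1e-12)  # small epsilon avoids division by zero

def classify_sequentially(X_train, y_train, x_test, budget, tol=0.95):
    """Sequentially acquire features for one test sample.

    At each step: rank the remaining features by Fisher score on the
    current (filtered) training subset, acquire the top-ranked feature,
    keep only the training samples whose value on that feature is close
    to the observed value, and stop once one class dominates the subset
    or the acquisition budget is spent.
    """
    remaining = list(range(X_train.shape[1]))
    Xf, yf = X_train, y_train
    for _ in range(budget):
        scores = fisher_scores(Xf, yf)
        j = remaining[int(np.argmax(scores[remaining]))]
        remaining.remove(j)
        # Lazy filtering: keep neighbours of the observed value on feature j.
        dist = np.abs(Xf[:, j] - x_test[j])
        keep = dist <= np.quantile(dist, 0.3)  # hypothetical neighbourhood rule
        Xf, yf = Xf[keep], yf[keep]
        # Stop early once the filtered subset is nearly pure.
        _, counts = np.unique(yf, return_counts=True)
        if counts.max() / counts.sum() >= tol or not remaining:
            break
    vals, counts = np.unique(yf, return_counts=True)
    return vals[np.argmax(counts)]  # majority class of the filtered subset
```

Because the model is lazy, no training happens up front: all work (ranking and filtering) is deferred to the test phase and operates on a shrinking subset of the training data, which is where the claimed speed advantage comes from.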
Related papers
- Towards Free Data Selection with General-Purpose Models [71.92151210413374]
A desirable data selection algorithm can efficiently choose the most informative samples to maximize the utility of limited annotation budgets.
Current approaches, represented by active learning methods, typically follow a cumbersome pipeline that repeatedly alternates between time-consuming model training and batch data selection.
FreeSel bypasses the heavy batch selection process, achieving a significant improvement in efficiency and being 530x faster than existing active learning methods.
arXiv Detail & Related papers (2023-09-29T15:50:14Z) - Improved Distribution Matching for Dataset Condensation [91.55972945798531]
We propose a novel dataset condensation method based on distribution matching.
Our simple yet effective method outperforms most previous optimization-oriented methods with much fewer computational resources.
arXiv Detail & Related papers (2023-07-19T04:07:33Z) - A Tent Lévy Flying Sparrow Search Algorithm for Feature Selection: A
COVID-19 Case Study [1.6436293069942312]
The "Curse of Dimensionality" induced by the rapid development of information science might have a negative impact when dealing with big datasets.
We propose a variant of the sparrow search algorithm (SSA), called the Tent Lévy flying sparrow search algorithm (TFSSA).
TFSSA is used to select the best subset of features in the packing pattern for classification purposes.
arXiv Detail & Related papers (2022-09-20T15:12:10Z) - Towards Automated Imbalanced Learning with Deep Hierarchical
Reinforcement Learning [57.163525407022966]
Imbalanced learning is a fundamental challenge in data mining, where there is a disproportionate ratio of training samples in each class.
Over-sampling is an effective technique to tackle imbalanced learning through generating synthetic samples for the minority class.
We propose AutoSMOTE, an automated over-sampling algorithm that can jointly optimize different levels of decisions.
arXiv Detail & Related papers (2022-08-26T04:28:01Z) - Compactness Score: A Fast Filter Method for Unsupervised Feature
Selection [66.84571085643928]
We propose a fast unsupervised feature selection method, named Compactness Score (CSUFS), to select desired features.
Our proposed algorithm is shown to be more accurate and efficient than existing algorithms.
arXiv Detail & Related papers (2022-01-31T13:01:37Z) - A concise method for feature selection via normalized frequencies [0.0]
In this paper, a concise method is proposed for universal feature selection.
The proposed method uses a fusion of the filter method and the wrapper method, rather than a combination of them.
The evaluation results show that the proposed method outperformed several state-of-the-art related works in terms of accuracy, precision, recall, F-score and AUC.
arXiv Detail & Related papers (2021-06-10T15:29:54Z) - RFCBF: enhance the performance and stability of Fast Correlation-Based
Filter [6.781877756322586]
We propose a novel extension of FCBF, called RFCBF, which combines resampling technique to improve classification accuracy.
The experimental results show that the RFCBF algorithm yields significantly better results than previous state-of-the-art methods in terms of classification accuracy and runtime.
arXiv Detail & Related papers (2021-05-30T12:36:32Z) - AutoSimulate: (Quickly) Learning Synthetic Data Generation [70.82315853981838]
We propose an efficient alternative for optimal synthetic data generation based on a novel differentiable approximation of the objective.
We demonstrate that the proposed method finds the optimal data distribution faster (up to 50×), with significantly reduced training data generation (up to 30×) and better accuracy (+8.7%) on real-world test datasets than previous methods.
arXiv Detail & Related papers (2020-08-16T11:36:11Z) - Fast Template Matching and Update for Video Object Tracking and
Segmentation [56.465510428878]
The main task we aim to tackle is the multi-instance semi-supervised video object segmentation across a sequence of frames.
The challenges lie in the selection of the matching method to predict the result as well as to decide whether to update the target template.
We propose a novel approach which utilizes reinforcement learning to make these two decisions at the same time.
arXiv Detail & Related papers (2020-04-16T08:58:45Z) - IVFS: Simple and Efficient Feature Selection for High Dimensional
Topology Preservation [33.424663018395684]
We propose a simple and effective feature selection algorithm to enhance sample similarity preservation.
The proposed algorithm is able to well preserve the pairwise distances, as well as topological patterns, of the full data.
arXiv Detail & Related papers (2020-04-02T23:05:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.