A Scalable Feature Selection and Opinion Miner Using Whale Optimization
Algorithm
- URL: http://arxiv.org/abs/2004.13121v1
- Date: Tue, 21 Apr 2020 01:08:45 GMT
- Title: A Scalable Feature Selection and Opinion Miner Using Whale Optimization
Algorithm
- Authors: Amir Javadpour, Samira Rezaei, Kuan-Ching Li and Guojun Wang
- Abstract summary: Feature selection techniques not only help us understand the data better but also lead to higher speed and accuracy.
In this article, the Whale Optimization algorithm is considered and applied to the search for the optimum subset of features.
- Score: 6.248184589339059
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the fast-growing volume of text documents and reviews in recent years, current analysis techniques are not competent enough to meet users' needs. Feature selection techniques not only help us understand the data better but also lead to higher speed and accuracy. In this article, the Whale Optimization Algorithm is applied to search for the optimal subset of features. As is well known, the F-measure is a metric based on precision and recall that is widely used to compare classifiers. To evaluate and compare the experimental results, the PART, random tree, random forest, and RBF network classification algorithms were applied to different numbers of features. Experimental results show that random forest achieves the best accuracy with 500 features.
Keywords: Feature selection, Whale Optimization algorithm, Optimal subset selection, Classification algorithm
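As a concrete illustration of the pipeline the abstract describes (wrapper feature selection driven by the Whale Optimization Algorithm and scored by F-measure), here is a minimal sketch. It is not the authors' implementation: the sigmoid binarization, the random-forest fitness classifier, the train/test split, and all parameter values are assumptions.

```python
# Minimal sketch of binary Whale Optimization for feature selection.
# Assumptions (not from the paper): sigmoid binarization of whale positions,
# a random-forest classifier inside the fitness, and all hyperparameters.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

def fitness(mask, X, y):
    """F-measure (F1 = 2*P*R / (P + R)) of a classifier trained only on the
    features selected by the boolean mask. X is assumed to be a numpy array."""
    if not mask.any():
        return 0.0
    Xtr, Xte, ytr, yte = train_test_split(X[:, mask], y, test_size=0.3,
                                          random_state=0)
    clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(Xtr, ytr)
    return f1_score(yte, clf.predict(Xte), average="macro")

def binary_woa(X, y, n_whales=10, n_iter=30, seed=0):
    rng = np.random.default_rng(seed)
    dim = X.shape[1]
    pos = rng.uniform(-1.0, 1.0, (n_whales, dim))  # continuous whale positions

    def to_mask(p):                                # sigmoid transfer function
        return 1.0 / (1.0 + np.exp(-p)) > 0.5

    best = pos[0].copy()
    best_mask = to_mask(best)
    best_score = fitness(best_mask, X, y)
    for t in range(n_iter):
        a = 2.0 - 2.0 * t / n_iter                 # decreases linearly 2 -> 0
        for i in range(n_whales):
            r = rng.random(dim)
            A, C = 2.0 * a * r - a, 2.0 * rng.random(dim)
            if rng.random() < 0.5:
                # shrinking encirclement of a leader: the best whale when |A|
                # is small (exploitation), a random whale otherwise (exploration)
                leader = best if np.abs(A).mean() < 1 else pos[rng.integers(n_whales)]
                pos[i] = leader - A * np.abs(C * leader - pos[i])
            else:
                # logarithmic spiral toward the best whale
                l = rng.uniform(-1.0, 1.0)
                pos[i] = np.abs(best - pos[i]) * np.exp(l) * np.cos(2 * np.pi * l) + best
            pos[i] = np.clip(pos[i], -6.0, 6.0)    # keep the sigmoid well-behaved
            mask = to_mask(pos[i])
            score = fitness(mask, X, y)
            if score > best_score:
                best, best_mask, best_score = pos[i].copy(), mask, score
    return best_mask, best_score
```

Retraining a classifier for every whale at every iteration dominates the cost, which is why the evaluation compares classifiers across different feature-count budgets (e.g., the 500-feature setting where random forest performed best).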
Related papers
- An efficient hybrid classification approach for COVID-19 based on Harris
Hawks Optimization and Salp Swarm Optimization [0.0]
This study presents a hybrid binary version of the Harris Hawks Optimization algorithm (HHO) and Salp Swarm Optimization (SSA) for COVID-19 classification.
The proposed algorithm (HHOSSA) achieved 96% accuracy with the SVM classifier and 98% accuracy with each of two other classifiers.
arXiv Detail & Related papers (2022-12-25T19:52:18Z)
- Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural network (NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, defined by minimizing the population loss, that are more suitable for active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z)
- Efficient Non-Parametric Optimizer Search for Diverse Tasks [93.64739408827604]
We present the first efficient, scalable, and general framework that can directly search on the tasks of interest.
Inspired by the innate tree structure of the underlying math expressions, we re-arrange the spaces into a super-tree.
We adopt an adaptation of the Monte Carlo method to tree search, equipped with rejection sampling and equivalent-form detection.
arXiv Detail & Related papers (2022-09-27T17:51:31Z)
- A Tent Lévy Flying Sparrow Search Algorithm for Feature Selection: A COVID-19 Case Study [1.6436293069942312]
The "Curse of Dimensionality" induced by the rapid development of information science might have a negative impact when dealing with big datasets.
We propose a variant of the sparrow search algorithm (SSA), called the Tent Lévy flying sparrow search algorithm (TFSSA).
TFSSA is used to select the best subset of features in the packing pattern for classification purposes.
arXiv Detail & Related papers (2022-09-20T15:12:10Z)
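The two ingredients named in the TFSSA entry above are standard and easy to show in isolation. A minimal sketch, assuming the classic tent map and Mantegna's algorithm for Lévy steps; the paper's exact variants (e.g., its tent-map parameterization) may differ:

```python
# The tent chaotic map (population initialization) and Levy-flight steps
# (Mantegna's algorithm) used by TFSSA-style optimizers. beta=1.5 and the
# classic tent-map form are assumptions; the paper's variants may differ.
import numpy as np
from math import gamma, pi, sin

def tent_map_sequence(x0, n):
    """Classic tent map x_{k+1} = 2x if x < 0.5 else 2(1 - x): a chaotic
    sequence in (0, 1) that spreads initial positions more evenly than
    plain uniform sampling."""
    xs, x = np.empty(n), x0
    for k in range(n):
        x = 2.0 * x if x < 0.5 else 2.0 * (1.0 - x)
        xs[k] = x
    return xs

def levy_step(dim, beta=1.5, rng=None):
    """Heavy-tailed step via Mantegna's algorithm: mostly small moves with
    occasional long jumps, which helps escape local optima."""
    rng = np.random.default_rng() if rng is None else rng
    sigma = (gamma(1 + beta) * sin(pi * beta / 2)
             / (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma, dim)
    v = rng.normal(0.0, 1.0, dim)
    return u / np.abs(v) ** (1.0 / beta)

# Usage: chaotic initial positions for 4 agents in 5 dimensions, then a
# Levy perturbation of the first agent.
positions = tent_map_sequence(0.37, 20).reshape(4, 5)
positions[0] += 0.01 * levy_step(5)
```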
- Compactness Score: A Fast Filter Method for Unsupervised Feature Selection [66.84571085643928]
We propose a fast unsupervised feature selection method, named Compactness Score (CSUFS), to select desired features.
Experiments show the proposed algorithm to be more accurate and efficient than existing algorithms.
arXiv Detail & Related papers (2022-01-31T13:01:37Z)
- RSO: A Novel Reinforced Swarm Optimization Algorithm for Feature Selection [0.0]
In this paper, we propose a novel feature selection algorithm named Reinforced Swarm Optimization (RSO).
This algorithm combines the widely used Bee Swarm Optimization (BSO) algorithm with Reinforcement Learning (RL) to reward superior search agents and punish inferior ones.
The proposed method is evaluated on 25 widely known UCI datasets containing a mix of balanced and imbalanced data.
arXiv Detail & Related papers (2021-07-29T17:38:04Z)
- Estimating leverage scores via rank revealing methods and randomization [50.591267188664666]
We study algorithms for estimating the statistical leverage scores of rectangular dense or sparse matrices of arbitrary rank.
Our approach is based on combining rank revealing methods with compositions of dense and sparse randomized dimensionality reduction transforms.
arXiv Detail & Related papers (2021-05-23T19:21:55Z)
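For the leverage-scores entry above: the i-th leverage score of a matrix A is the squared Euclidean norm of the i-th row of any orthonormal basis for range(A). Below is a minimal sketch of the exact computation plus a crude Gaussian-sketch approximation in the spirit of, but not identical to, the paper's rank-revealing randomized approach; it assumes A has full column rank, which the paper does not.

```python
# Exact leverage scores via thin QR, plus a simple sketched approximation:
# take R from a QR of (S @ A) for a Gaussian sketch S, then use row norms
# of A @ inv(R). Assumes full column rank; the paper instead targets
# arbitrary-rank matrices via rank-revealing factorizations.
import numpy as np

def leverage_scores_exact(A):
    Q, _ = np.linalg.qr(A)               # columns of Q span range(A)
    return np.einsum("ij,ij->i", Q, Q)   # squared row norms of Q

def leverage_scores_sketched(A, sketch_rows=None, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    m, n = A.shape
    s = 4 * n if sketch_rows is None else sketch_rows
    S = rng.normal(size=(s, m)) / np.sqrt(s)   # dense Gaussian sketch
    _, R = np.linalg.qr(S @ A)                 # R approximates A's R factor
    B = np.linalg.solve(R.T, A.T).T            # B = A @ inv(R)
    return np.einsum("ij,ij->i", B, B)

# Sanity check: exact leverage scores sum to rank(A).
A = np.random.default_rng(1).normal(size=(1000, 20))
print(leverage_scores_exact(A).sum())      # 20.0
print(leverage_scores_sketched(A).sum())   # close to 20.0
```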
- Bayesian Algorithm Execution: Estimating Computable Properties of Black-box Functions Using Mutual Information [78.78486761923855]
In many real world problems, we want to infer some property of an expensive black-box function f, given a budget of T function evaluations.
We present a procedure, InfoBAX, that sequentially chooses queries that maximize mutual information with respect to the algorithm's output.
On these problems, InfoBAX uses up to 500 times fewer queries to f than required by the original algorithm.
arXiv Detail & Related papers (2021-04-19T17:22:11Z)
- Towards Feature-Based Performance Regression Using Trajectory Data [0.9281671380673306]
Black-box optimization is a very active area of research, with many new algorithms being developed every year.
The variety of algorithms poses a meta-problem: which algorithm to choose for a given problem at hand?
Past research has shown that per-instance algorithm selection based on exploratory landscape analysis can be an efficient means of tackling this meta-problem.
arXiv Detail & Related papers (2021-02-10T10:19:13Z)
- Stochastic Optimization Forests [60.523606291705214]
We show how to train forest decision policies by growing trees that choose splits to directly optimize the downstream decision quality, rather than splitting to improve prediction accuracy as in the standard random forest algorithm.
We show that our approximate splitting criteria can reduce running time a hundredfold, while achieving performance close to forest algorithms that exactly re-optimize for every candidate split.
arXiv Detail & Related papers (2020-08-17T16:56:06Z)
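To make the contrast in the stochastic-optimization-forests entry concrete, here is a toy decision-aware split criterion: a candidate split is scored by the downstream cost achieved when each child commits to its own best decision, rather than by prediction variance. This is a minimal sketch of the exact "re-optimize for every candidate split" baseline that the paper's approximations speed up; the cost-matrix interface is an illustrative assumption, not the paper's API.

```python
# Toy decision-aware splitting: each candidate threshold is scored by the
# total downstream cost when each child takes its own optimal decision.
# The (n_samples, n_decisions) cost-matrix interface is an assumption.
import numpy as np

def decision_cost(costs):
    """Best total cost when one decision is shared by all rows of this node."""
    return costs.sum(axis=0).min()

def best_split(x, costs):
    """Scan thresholds on a single feature x; return the threshold whose two
    children minimize the combined downstream decision cost."""
    order = np.argsort(x)
    best_thr, best_val = None, decision_cost(costs)   # baseline: no split
    for i in range(1, len(x)):
        if x[order[i - 1]] == x[order[i]]:
            continue                                   # no threshold between ties
        left, right = order[:i], order[i:]
        val = decision_cost(costs[left]) + decision_cost(costs[right])
        if val < best_val:
            best_thr = (x[order[i - 1]] + x[order[i]]) / 2.0
            best_val = val
    return best_thr, best_val
```

Re-running decision_cost for every threshold is what makes this exact criterion slow; the paper's approximate criteria avoid that inner re-optimization while staying close in quality.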
This list is automatically generated from the titles and abstracts of the papers on this site.