A Scalable Feature Selection and Opinion Miner Using Whale Optimization
Algorithm
- URL: http://arxiv.org/abs/2004.13121v1
- Date: Tue, 21 Apr 2020 01:08:45 GMT
- Title: A Scalable Feature Selection and Opinion Miner Using Whale Optimization
Algorithm
- Authors: Amir Javadpour, Samira Rezaei, Kuan-Ching Li and Guojun Wang
- Abstract summary: Feature selection techniques not only help us understand the data better but also lead to higher speed and accuracy.
In this article, the Whale Optimization algorithm is considered and applied to the search for the optimum subset of features.
- Score: 6.248184589339059
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the fast-growing volume of text documents and reviews in recent years, current analysis techniques are not competent enough to meet users' needs. Feature selection techniques not only help us understand the data better but also lead to higher speed and accuracy. In this article, the Whale Optimization Algorithm is applied to search for the optimal subset of features. As is well known, the F-measure is a metric based on precision and recall that is widely used to compare classifiers. To evaluate and compare the experimental results, the PART, random tree, random forest, and RBF network classification algorithms were applied to different numbers of features. Experimental results show that random forest achieves the best accuracy with 500 features.
Keywords: Feature selection, Whale Optimization algorithm, Optimal subset selection, Classification algorithm
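As a concrete illustration of the pipeline the abstract describes (wrapper feature selection driven by the Whale Optimization Algorithm and scored by F-measure), here is a minimal sketch. It is not the authors' implementation: the sigmoid binarization, the random-forest fitness classifier, the train/test split, and all parameter values are assumptions.

```python
# Minimal sketch of binary Whale Optimization for feature selection.
# Assumptions (not from the paper): sigmoid binarization of whale positions,
# a random-forest classifier inside the fitness, and all hyperparameters.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

def fitness(mask, X, y):
    """F-measure (F1 = 2*P*R / (P + R)) of a classifier trained only on the
    features selected by the boolean mask. X is assumed to be a numpy array."""
    if not mask.any():
        return 0.0
    Xtr, Xte, ytr, yte = train_test_split(X[:, mask], y, test_size=0.3,
                                          random_state=0)
    clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(Xtr, ytr)
    return f1_score(yte, clf.predict(Xte), average="macro")

def binary_woa(X, y, n_whales=10, n_iter=30, seed=0):
    rng = np.random.default_rng(seed)
    dim = X.shape[1]
    pos = rng.uniform(-1.0, 1.0, (n_whales, dim))  # continuous whale positions

    def to_mask(p):                                # sigmoid transfer function
        return 1.0 / (1.0 + np.exp(-p)) > 0.5

    best = pos[0].copy()
    best_mask = to_mask(best)
    best_score = fitness(best_mask, X, y)
    for t in range(n_iter):
        a = 2.0 - 2.0 * t / n_iter                 # decreases linearly 2 -> 0
        for i in range(n_whales):
            r = rng.random(dim)
            A, C = 2.0 * a * r - a, 2.0 * rng.random(dim)
            if rng.random() < 0.5:
                # shrinking encirclement of a leader: the best whale when |A|
                # is small (exploitation), a random whale otherwise (exploration)
                leader = best if np.abs(A).mean() < 1 else pos[rng.integers(n_whales)]
                pos[i] = leader - A * np.abs(C * leader - pos[i])
            else:
                # logarithmic spiral toward the best whale
                l = rng.uniform(-1.0, 1.0)
                pos[i] = np.abs(best - pos[i]) * np.exp(l) * np.cos(2 * np.pi * l) + best
            pos[i] = np.clip(pos[i], -6.0, 6.0)    # keep the sigmoid well-behaved
            mask = to_mask(pos[i])
            score = fitness(mask, X, y)
            if score > best_score:
                best, best_mask, best_score = pos[i].copy(), mask, score
    return best_mask, best_score
```

Retraining a classifier for every whale at every iteration dominates the cost, which is why the evaluation compares classifiers across different feature-count budgets (e.g., the 500-feature setting where random forest performed best).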
Related papers
- An efficient hybrid classification approach for COVID-19 based on Harris
Hawks Optimization and Salp Swarm Optimization [0.0]
This study presents a hybrid binary version of the Harris Hawks Optimization algorithm (HHO) and Salp Swarm Optimization (SSA) for COVID-19 classification.
The proposed algorithm (HHOSSA) achieved 96% accuracy with the SVM classifier and 98% accuracy with each of two other classifiers.
arXiv Detail & Related papers (2022-12-25T19:52:18Z)
- Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural network (NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, defined by minimizing the population loss, that are more suitable for active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z)
- Efficient Non-Parametric Optimizer Search for Diverse Tasks [93.64739408827604]
We present the first efficient, scalable, and general framework that can directly search on the tasks of interest.
Inspired by the innate tree structure of the underlying math expressions, we re-arrange the spaces into a super-tree.
We adopt an adaptation of the Monte Carlo method to tree search, equipped with rejection sampling and equivalent-form detection.
arXiv Detail & Related papers (2022-09-27T17:51:31Z)
- A Tent Lévy Flying Sparrow Search Algorithm for Feature Selection: A COVID-19 Case Study [1.6436293069942312]
The "Curse of Dimensionality" induced by the rapid development of information science might have a negative impact when dealing with big datasets.
We propose a variant of the sparrow search algorithm (SSA), called the Tent Lévy flying sparrow search algorithm (TFSSA).
TFSSA is used to select the best subset of features in the packing pattern for classification purposes.
arXiv Detail & Related papers (2022-09-20T15:12:10Z)
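The two ingredients named in the TFSSA entry above are standard and easy to show in isolation. A minimal sketch, assuming the classic tent map and Mantegna's algorithm for Lévy steps; the paper's exact variants (e.g., its tent-map parameterization) may differ:

```python
# The tent chaotic map (population initialization) and Levy-flight steps
# (Mantegna's algorithm) used by TFSSA-style optimizers. beta=1.5 and the
# classic tent-map form are assumptions; the paper's variants may differ.
import numpy as np
from math import gamma, pi, sin

def tent_map_sequence(x0, n):
    """Classic tent map x_{k+1} = 2x if x < 0.5 else 2(1 - x): a chaotic
    sequence in (0, 1) that spreads initial positions more evenly than
    plain uniform sampling."""
    xs, x = np.empty(n), x0
    for k in range(n):
        x = 2.0 * x if x < 0.5 else 2.0 * (1.0 - x)
        xs[k] = x
    return xs

def levy_step(dim, beta=1.5, rng=None):
    """Heavy-tailed step via Mantegna's algorithm: mostly small moves with
    occasional long jumps, which helps escape local optima."""
    rng = np.random.default_rng() if rng is None else rng
    sigma = (gamma(1 + beta) * sin(pi * beta / 2)
             / (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma, dim)
    v = rng.normal(0.0, 1.0, dim)
    return u / np.abs(v) ** (1.0 / beta)

# Usage: chaotic initial positions for 4 agents in 5 dimensions, then a
# Levy perturbation of the first agent.
positions = tent_map_sequence(0.37, 20).reshape(4, 5)
positions[0] += 0.01 * levy_step(5)
```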
- Compactness Score: A Fast Filter Method for Unsupervised Feature Selection [66.84571085643928]
We propose a fast unsupervised feature selection method, named Compactness Score (CSUFS), to select desired features.
Experiments show the proposed algorithm to be more accurate and efficient than existing algorithms.
arXiv Detail & Related papers (2022-01-31T13:01:37Z)
- RSO: A Novel Reinforced Swarm Optimization Algorithm for Feature Selection [0.0]
In this paper, we propose a novel feature selection algorithm named Reinforced Swarm Optimization (RSO).
This algorithm combines the widely used Bee Swarm Optimization (BSO) algorithm with Reinforcement Learning (RL) to reward superior search agents and punish inferior ones.
The proposed method is evaluated on 25 widely known UCI datasets containing a mix of balanced and imbalanced data.
arXiv Detail & Related papers (2021-07-29T17:38:04Z)
- Estimating leverage scores via rank revealing methods and randomization [50.591267188664666]
We study algorithms for estimating the statistical leverage scores of rectangular dense or sparse matrices of arbitrary rank.
Our approach is based on combining rank revealing methods with compositions of dense and sparse randomized dimensionality reduction transforms.
arXiv Detail & Related papers (2021-05-23T19:21:55Z)
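For the leverage-scores entry above: the i-th leverage score of a matrix A is the squared Euclidean norm of the i-th row of any orthonormal basis for range(A). Below is a minimal sketch of the exact computation plus a crude Gaussian-sketch approximation in the spirit of, but not identical to, the paper's rank-revealing randomized approach; it assumes A has full column rank, which the paper does not.

```python
# Exact leverage scores via thin QR, plus a simple sketched approximation:
# take R from a QR of (S @ A) for a Gaussian sketch S, then use row norms
# of A @ inv(R). Assumes full column rank; the paper instead targets
# arbitrary-rank matrices via rank-revealing factorizations.
import numpy as np

def leverage_scores_exact(A):
    Q, _ = np.linalg.qr(A)               # columns of Q span range(A)
    return np.einsum("ij,ij->i", Q, Q)   # squared row norms of Q

def leverage_scores_sketched(A, sketch_rows=None, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    m, n = A.shape
    s = 4 * n if sketch_rows is None else sketch_rows
    S = rng.normal(size=(s, m)) / np.sqrt(s)   # dense Gaussian sketch
    _, R = np.linalg.qr(S @ A)                 # R approximates A's R factor
    B = np.linalg.solve(R.T, A.T).T            # B = A @ inv(R)
    return np.einsum("ij,ij->i", B, B)

# Sanity check: exact leverage scores sum to rank(A).
A = np.random.default_rng(1).normal(size=(1000, 20))
print(leverage_scores_exact(A).sum())      # 20.0
print(leverage_scores_sketched(A).sum())   # close to 20.0
```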
- Bayesian Algorithm Execution: Estimating Computable Properties of Black-box Functions Using Mutual Information [78.78486761923855]
In many real world problems, we want to infer some property of an expensive black-box function f, given a budget of T function evaluations.
We present a procedure, InfoBAX, that sequentially chooses queries that maximize mutual information with respect to the algorithm's output.
On these problems, InfoBAX uses up to 500 times fewer queries to f than required by the original algorithm.
arXiv Detail & Related papers (2021-04-19T17:22:11Z)
- Towards Feature-Based Performance Regression Using Trajectory Data [0.9281671380673306]
Black-box optimization is a very active area of research, with many new algorithms being developed every year.
The variety of algorithms poses a meta-problem: which algorithm to choose for a given problem at hand?
Past research has shown that per-instance algorithm selection based on exploratory landscape analysis can be an efficient means of tackling this meta-problem.
arXiv Detail & Related papers (2021-02-10T10:19:13Z)
- Stochastic Optimization Forests [60.523606291705214]
We show how to train forest decision policies by growing trees that choose splits to directly optimize the downstream decision quality, rather than splitting to improve prediction accuracy as in the standard random forest algorithm.
We show that our approximate splitting criteria can reduce running time a hundredfold, while achieving performance close to forest algorithms that exactly re-optimize for every candidate split.
arXiv Detail & Related papers (2020-08-17T16:56:06Z)
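To make the contrast in the stochastic-optimization-forests entry concrete, here is a toy decision-aware split criterion: a candidate split is scored by the downstream cost achieved when each child commits to its own best decision, rather than by prediction variance. This is a minimal sketch of the exact "re-optimize for every candidate split" baseline that the paper's approximations speed up; the cost-matrix interface is an illustrative assumption, not the paper's API.

```python
# Toy decision-aware splitting: each candidate threshold is scored by the
# total downstream cost when each child takes its own optimal decision.
# The (n_samples, n_decisions) cost-matrix interface is an assumption.
import numpy as np

def decision_cost(costs):
    """Best total cost when one decision is shared by all rows of this node."""
    return costs.sum(axis=0).min()

def best_split(x, costs):
    """Scan thresholds on a single feature x; return the threshold whose two
    children minimize the combined downstream decision cost."""
    order = np.argsort(x)
    best_thr, best_val = None, decision_cost(costs)   # baseline: no split
    for i in range(1, len(x)):
        if x[order[i - 1]] == x[order[i]]:
            continue                                   # no threshold between ties
        left, right = order[:i], order[i:]
        val = decision_cost(costs[left]) + decision_cost(costs[right])
        if val < best_val:
            best_thr = (x[order[i - 1]] + x[order[i]]) / 2.0
            best_val = val
    return best_thr, best_val
```

Re-running decision_cost for every threshold is what makes this exact criterion slow; the paper's approximate criteria avoid that inner re-optimization while staying close in quality.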
This list is automatically generated from the titles and abstracts of the papers on this site.