Classic algorithms are fair learners: Classification Analysis of natural
weather and wildfire occurrences
- URL: http://arxiv.org/abs/2309.01381v1
- Date: Mon, 4 Sep 2023 06:11:55 GMT
- Title: Classic algorithms are fair learners: Classification Analysis of natural
weather and wildfire occurrences
- Authors: Senthilkumar Gopal
- Abstract summary: This paper reviews the empirical functioning of widely used classical supervised learning algorithms such as Decision Trees, Boosting, Support Vector Machines, k-nearest Neighbors and a shallow Artificial Neural Network.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Classic machine learning algorithms have been reviewed and studied
mathematically in detail with respect to their performance and properties. This
paper reviews the empirical behavior of widely used classical supervised
learning algorithms such as Decision Trees, Boosting, Support Vector Machines,
k-nearest Neighbors and a shallow Artificial Neural Network. The paper
evaluates these algorithms on sparse tabular data for a classification task and
observes the effect of specific hyperparameters when the data is synthetically
modified to introduce higher noise. These perturbations were introduced to
examine how efficiently the algorithms generalize on sparse data and how their
different parameters can be tuned to improve classification accuracy. The paper
shows that, owing to their inherent properties, these classic algorithms remain
fair learners even on such limited, noisy, and sparse datasets.
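As a concrete illustration of the kind of evaluation the abstract describes, the sketch below fits the five classifier families on a synthetic sparse tabular classification task and repeats the fit after flipping a fraction of the training labels. The dataset, noise level, and hyperparameters are placeholders chosen for illustration and are not the paper's wildfire data or settings.

```python
# Minimal sketch (not the paper's exact setup): compare classic classifiers
# on sparse tabular data, before and after injecting synthetic label noise.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Small tabular problem standing in for the weather/wildfire data.
X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           n_redundant=2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "DecisionTree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "Boosting": AdaBoostClassifier(n_estimators=100, random_state=0),
    "SVM": SVC(kernel="rbf", C=1.0),
    "kNN": KNeighborsClassifier(n_neighbors=5),
    "ShallowANN": MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                                random_state=0),
}

for noise_rate in (0.0, 0.2):            # fraction of training labels flipped
    y_noisy = y_train.copy()
    flip = rng.random(len(y_noisy)) < noise_rate
    y_noisy[flip] = 1 - y_noisy[flip]
    for name, model in models.items():
        acc = model.fit(X_train, y_noisy).score(X_test, y_test)
        print(f"noise={noise_rate:.1f}  {name:12s} accuracy={acc:.3f}")
```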
Related papers
- Can Tree Based Approaches Surpass Deep Learning in Anomaly Detection? A
Benchmarking Study [0.6291443816903801]
This paper evaluates a diverse array of machine learning-based anomaly detection algorithms.
The paper contributes significantly by conducting an unbiased comparison of various anomaly detection algorithms.
arXiv Detail & Related papers (2024-02-11T19:12:51Z)
- A Gold Standard Dataset for the Reviewer Assignment Problem [117.59690218507565]
"Similarity score" is a numerical estimate of the expertise of a reviewer in reviewing a paper.
Our dataset consists of 477 self-reported expertise scores provided by 58 researchers.
For the task of ordering two papers in terms of their relevance for a reviewer, the error rates range from 12%-30% in easy cases to 36%-43% in hard cases.
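The pairwise ordering task above can be scored with a simple error rate: for each pair of papers rated by a reviewer, check whether the automated similarity scores order them the same way as the self-reported expertise. The sketch below uses invented numbers purely to show the computation; it is not the dataset's evaluation code.

```python
# Toy computation of a pairwise ordering error rate: how often an automated
# "similarity score" orders two papers differently from a reviewer's
# self-reported expertise. All numbers below are invented for illustration.
from itertools import combinations

self_reported = {"paperA": 5, "paperB": 3, "paperC": 1}           # reviewer ratings
similarity    = {"paperA": 0.72, "paperB": 0.80, "paperC": 0.10}  # model scores

errors, total = 0, 0
for p, q in combinations(self_reported, 2):
    truth = self_reported[p] - self_reported[q]
    guess = similarity[p] - similarity[q]
    if truth == 0:                 # skip ties in the ground truth
        continue
    total += 1
    if truth * guess <= 0:         # the two scores disagree on the ordering
        errors += 1

print(f"pairwise error rate: {errors}/{total} = {errors / total:.0%}")
```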
arXiv Detail & Related papers (2023-03-23T16:15:03Z)
- Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network(NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics based on the population loss that are more suitable for active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z)
- Performance Analysis of Fractional Learning Algorithms [32.21539962359158]
It is unclear whether the proclaimed superiority of fractional learning algorithms over conventional ones is well-grounded or a myth, as their performance has never been extensively analyzed.
In this article, a rigorous analysis of fractional variants of the least mean squares and steepest descent algorithms is performed.
The origins of the identified shortcomings and their consequences for the performance of the learning algorithms are discussed, and straightforward remedies are proposed.
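For context, the sketch below contrasts the standard LMS update with one commonly cited fractional-order variant that adds a gradient term scaled by |w|^(1-ν)/Γ(2-ν). The exact form, and the use of the absolute value to keep the update real-valued, are assumptions taken from the broader fractional-LMS literature rather than from this paper, whose point is precisely to scrutinize such variants.

```python
# Sketch: standard LMS vs. a commonly cited fractional LMS (FLMS) update for
# system identification. The fractional term |w|^(1-nu)/Gamma(2-nu) is an
# assumption based on the general FLMS literature, not this paper's derivation.
import numpy as np
from math import gamma

rng = np.random.default_rng(1)
w_true = np.array([0.5, -1.0, 0.25, 0.8])            # unknown system to identify
n, d = 2000, len(w_true)
X = rng.standard_normal((n, d))
y = X @ w_true + 0.01 * rng.standard_normal(n)        # noisy desired signal

def lms(mu=0.01):
    w = np.zeros(d)
    for x_t, d_t in zip(X, y):
        e = d_t - x_t @ w
        w += mu * e * x_t                              # standard LMS step
    return w

def fractional_lms(mu=0.01, mu_f=0.01, nu=0.5, eps=1e-8):
    w = np.zeros(d)
    for x_t, d_t in zip(X, y):
        e = d_t - x_t @ w
        frac = (np.abs(w) + eps) ** (1.0 - nu) / gamma(2.0 - nu)
        w += mu * e * x_t + mu_f * e * x_t * frac      # extra fractional term
    return w

print("LMS  error:", np.linalg.norm(lms() - w_true))
print("FLMS error:", np.linalg.norm(fractional_lms() - w_true))
```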
arXiv Detail & Related papers (2021-10-11T12:06:44Z)
- A Framework and Benchmarking Study for Counterfactual Generating Methods on
Tabular Data [0.0]
Counterfactual explanations are viewed as an effective way to explain machine learning predictions.
There are already dozens of algorithms aiming to generate such explanations.
A benchmarking study and framework can help practitioners determine which technique and building blocks best suit their context.
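One of the simplest building blocks such frameworks can be compared against is a "nearest unlike neighbor" counterfactual: return the closest known instance that the model classifies differently. The sketch below is a naive stand-in for that idea, not any specific algorithm from the benchmarking study.

```python
# Naive counterfactual: given a query instance, return the closest training
# instance that the model classifies differently. This is only a baseline
# building block, not a method from the benchmarking study itself.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=6, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X, y)

def nearest_unlike_neighbor(x_query, X_pool):
    """Closest pool instance whose predicted class differs from x_query's."""
    target = clf.predict(x_query.reshape(1, -1))[0]
    preds = clf.predict(X_pool)
    candidates = X_pool[preds != target]
    dists = np.linalg.norm(candidates - x_query, axis=1)
    return candidates[np.argmin(dists)]

x = X[0]
cf = nearest_unlike_neighbor(x, X)
print("query prediction:         ", clf.predict(x.reshape(1, -1))[0])
print("counterfactual prediction:", clf.predict(cf.reshape(1, -1))[0])
print("feature changes:", np.round(cf - x, 3))
```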
arXiv Detail & Related papers (2021-07-09T21:06:03Z)
- Estimating leverage scores via rank revealing methods and randomization [50.591267188664666]
We study algorithms for estimating the statistical leverage scores of rectangular dense or sparse matrices of arbitrary rank.
Our approach is based on combining rank revealing methods with compositions of dense and sparse randomized dimensionality reduction transforms.
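As a point of reference for what the randomized estimators approximate: when the matrix has full column rank, the exact leverage score of row i is the squared norm of the i-th row of Q in a thin QR factorization. The sketch below computes that baseline; it is not the rank-revealing algorithm of the paper.

```python
# Baseline the randomized estimators approximate: exact statistical leverage
# scores of a tall, full-column-rank matrix A, i.e. the diagonal of the hat
# matrix A (A^T A)^{-1} A^T, computed as squared row norms of Q from thin QR.
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((1000, 20))        # tall dense matrix for illustration

Q, _ = np.linalg.qr(A, mode="reduced")     # thin QR, Q is 1000 x 20
leverage = np.sum(Q**2, axis=1)            # l_i = ||Q[i, :]||^2

print("leverage scores sum to rank(A):", leverage.sum())   # ~20
print("largest five:", np.sort(leverage)[-5:])
```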
arXiv Detail & Related papers (2021-05-23T19:21:55Z)
- DAC: Deep Autoencoder-based Clustering, a General Deep Learning Framework of
Representation Learning [0.0]
We propose DAC, Deep Autoencoder-based Clustering, a data-driven framework to learn clustering representations using deep neural networks.
Experiment results show that our approach can effectively boost the performance of the KMeans clustering algorithm on a variety of datasets.
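A minimal sketch in the spirit of this framework is shown below: an autoencoder compresses the data and KMeans runs on the learned codes. The architecture, dataset, and training schedule are placeholders, not DAC's actual design.

```python
# Minimal sketch in the spirit of DAC: learn a low-dimensional representation
# with an autoencoder, then run KMeans on the latent codes. Architecture and
# hyperparameters are placeholders, not the paper's actual configuration.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits

X = torch.tensor(load_digits().data, dtype=torch.float32) / 16.0   # 1797 x 64

encoder = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 10))
decoder = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 64))
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()),
                       lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(200):                    # short full-batch reconstruction loop
    opt.zero_grad()
    loss = loss_fn(decoder(encoder(X)), X)
    loss.backward()
    opt.step()

with torch.no_grad():
    codes = encoder(X).numpy()              # latent representations

labels = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(codes)
print("cluster sizes:", [int((labels == k).sum()) for k in range(10)])
```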
arXiv Detail & Related papers (2021-02-15T11:31:00Z)
- Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem, which estimates integer signals from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal heuristic for the underlying simplified memory-bounded A* algorithm.
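To make the underlying problem concrete, integer least squares seeks the integer vector x minimizing ||y - Ax||. The sketch below solves a tiny instance by brute-force enumeration over a small integer box, the exponential baseline that tree-search methods such as memory-bounded A* (and the learned HATS variant) are designed to avoid; it does not implement HATS itself.

```python
# Brute-force integer least squares on a tiny instance: enumerate every
# integer vector in a small box and keep the minimizer of ||y - A x||.
# This is the exponential baseline that tree-search methods try to avoid.
import itertools
import numpy as np

rng = np.random.default_rng(3)
m, n = 8, 4
A = rng.standard_normal((m, n))
x_true = rng.integers(-2, 3, size=n)               # ground-truth integer signal
y = A @ x_true + 0.05 * rng.standard_normal(m)     # noisy linear observation

best_x, best_cost = None, np.inf
for cand in itertools.product(range(-2, 3), repeat=n):   # 5^4 = 625 candidates
    cost = np.linalg.norm(y - A @ np.array(cand))
    if cost < best_cost:
        best_x, best_cost = np.array(cand), cost

print("true x:     ", x_true)
print("recovered x:", best_x, " residual:", round(float(best_cost), 4))
```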
arXiv Detail & Related papers (2021-01-07T08:00:02Z)
- Quantum-Inspired Algorithms from Randomized Numerical Linear Algebra [53.46106569419296]
We create classical (non-quantum) dynamic data structures supporting queries for recommender systems and least-squares regression.
We argue that the previous quantum-inspired algorithms for these problems are doing leverage or ridge-leverage score sampling in disguise.
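To illustrate the claim, the sketch below performs explicit leverage-score sampling for least-squares regression: sample rows with probability proportional to their leverage, rescale them, and solve the much smaller problem. This is a generic classical sketch-and-solve routine, not the dynamic data structure proposed in the paper.

```python
# Classical leverage-score sampling for least squares: sample rows of (A, b)
# with probability proportional to their leverage scores, rescale, and solve
# the smaller problem. A generic sketch-and-solve routine for illustration.
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5000, 10))
x_true = rng.standard_normal(10)
b = A @ x_true + 0.01 * rng.standard_normal(5000)

Q, _ = np.linalg.qr(A, mode="reduced")
lev = np.sum(Q**2, axis=1)
p = lev / lev.sum()                                  # sampling distribution

s = 200                                              # number of sampled rows
idx = rng.choice(len(A), size=s, replace=True, p=p)
scale = 1.0 / np.sqrt(s * p[idx])                    # unbiased rescaling
A_s = A[idx] * scale[:, None]
b_s = b[idx] * scale

x_full, *_ = np.linalg.lstsq(A, b, rcond=None)       # exact solution
x_samp, *_ = np.linalg.lstsq(A_s, b_s, rcond=None)   # sampled approximation
print("relative error:",
      np.linalg.norm(x_samp - x_full) / np.linalg.norm(x_full))
```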
arXiv Detail & Related papers (2020-11-09T01:13:07Z)
- MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
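As a toy analogue of optimal (rather than greedy) tree learning, the sketch below exhaustively searches every feature/threshold split for the depth-1 tree with the fewest training misclassifications. MurTree's dynamic programming extends this idea to deeper trees far more efficiently; the snippet does not attempt that algorithm.

```python
# Toy analogue of optimal tree learning: exhaustively search all
# feature/threshold splits for the depth-1 tree (decision stump) with the
# fewest misclassifications. MurTree's DP generalizes this to deeper trees.
import numpy as np
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True)

best = (None, None, len(y))                   # (feature, threshold, errors)
for j in range(X.shape[1]):
    for t in np.unique(X[:, j]):
        left, right = y[X[:, j] <= t], y[X[:, j] > t]
        # each leaf predicts its majority class, so errors = minority count
        errors = (min(left.sum(), len(left) - left.sum())
                  + min(right.sum(), len(right) - right.sum()))
        if errors < best[2]:
            best = (j, t, errors)

j, t, errors = best
print(f"optimal stump: feature {j} <= {t:.3f}, training errors = {errors}")
```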
arXiv Detail & Related papers (2020-07-24T17:06:55Z)
- Large-scale empirical validation of Bayesian Network structure learning
algorithms with noisy data [9.04391541965756]
This paper investigates the performance of 15 structure learning algorithms.
Each algorithm is tested over multiple case studies, sample sizes, types of noise, and assessed with multiple evaluation criteria.
Results suggest that performance on traditional synthetic benchmarks may overestimate real-world performance by anywhere between 10% and more than 50%.
arXiv Detail & Related papers (2020-05-18T18:40:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.