Benchmarking the Effectiveness of Classification Algorithms and SVM
Kernels for Dry Beans
- URL: http://arxiv.org/abs/2307.07863v1
- Date: Sat, 15 Jul 2023 18:13:29 GMT
- Title: Benchmarking the Effectiveness of Classification Algorithms and SVM
Kernels for Dry Beans
- Authors: Anant Mehta, Prajit Sengupta, Divisha Garg, Harpreet Singh, Yosi
Shacham Diamand
- Abstract summary: This study analyses different Support Vector Machine (SVM) classification algorithms, namely linear, polynomial, and radial basis function (RBF).
The analysis is performed on the Dry Bean dataset, with PCA (Principal Component Analysis) conducted as a preprocessing step for dimensionality reduction.
The RBF SVM kernel algorithm achieves the highest Accuracy of 93.34%, Precision of 92.61%, Recall of 92.35% and F1 Score of 91.40%.
- Score: 0.6263481844384227
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Plant breeders and agricultural researchers can increase crop productivity by
identifying desirable features, disease resistance, and nutritional content by
analysing the Dry Bean dataset. This study analyses and compares different
Support Vector Machine (SVM) classification algorithms, namely linear,
polynomial, and radial basis function (RBF), along with other popular
classification algorithms. The analysis is performed on the Dry Bean Dataset,
with PCA (Principal Component Analysis) conducted as a preprocessing step for
dimensionality reduction. The primary evaluation metric used is accuracy, and
the RBF SVM kernel algorithm achieves the highest Accuracy of 93.34%, Precision
of 92.61%, Recall of 92.35% and F1 Score of 91.40%. Along with adept
visualization and empirical analysis, this study offers valuable guidance by
emphasizing the importance of considering different SVM algorithms for complex
and non-linear structured datasets.
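
The comparison described in the abstract (PCA for dimensionality reduction, then linear, polynomial, and RBF SVM kernels scored on accuracy, precision, recall, and F1) can be sketched roughly as follows. This is a minimal illustration with scikit-learn, not the authors' code. The function name `compare_svm_kernels`, the feature scaling step, the 95% explained-variance threshold for PCA, the 80/20 split, macro averaging, and the default SVM hyperparameters are all assumptions that the summary does not specify. The data is a synthetic stand-in that only mimics the 16 features and 7 classes of the UCI Dry Bean dataset, so the printed numbers will not match the reported results.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_svm_kernels(X, y, n_components=0.95, seed=0):
    """Fit scaler + PCA + SVM for each kernel; return held-out metrics per kernel."""
    # Hold out 20% of the samples, keeping class proportions (assumed split).
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    results = {}
    for kernel in ("linear", "poly", "rbf"):
        # Scaling before PCA is an assumption; PCA keeps enough components
        # to explain 95% of the variance when n_components is 0.95.
        model = make_pipeline(
            StandardScaler(),
            PCA(n_components=n_components),
            SVC(kernel=kernel),
        )
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        # Macro averaging weights every class equally (assumed choice).
        precision, recall, f1, _ = precision_recall_fscore_support(
            y_test, y_pred, average="macro", zero_division=0
        )
        results[kernel] = {
            "accuracy": accuracy_score(y_test, y_pred),
            "precision": precision,
            "recall": recall,
            "f1": f1,
        }
    return results


# Synthetic stand-in data: 16 features and 7 classes, NOT the real Dry Bean data.
X, y = make_classification(
    n_samples=1400, n_features=16, n_informative=10, n_redundant=4,
    n_classes=7, n_clusters_per_class=1, random_state=0,
)
results = compare_svm_kernels(X, y)
for kernel, scores in results.items():
    print(kernel, {name: round(value, 3) for name, value in scores.items()})
```

To try this on the real data, replace the synthetic `X` and `y` with the feature matrix and class labels loaded from the Dry Bean dataset file.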
Related papers
- Electroencephalogram Emotion Recognition via AUC Maximization [0.0]
Imbalanced datasets pose significant challenges in areas including neuroscience, cognitive science, and medical diagnostics.
This study addresses the issue of class imbalance, using the 'Liking' label in the DEAP dataset as an example.
arXiv Detail & Related papers (2024-08-16T19:08:27Z)
- Centralized and Federated Heart Disease Classification Models Using UCI Dataset and their Shapley-value Based Interpretability [0.7234862895932991]
This study benchmarks machine learning algorithms for heart disease classification using the UCI dataset.
Various binary classification algorithms are trained on pooled data, with a support vector machine (SVM) achieving the highest testing accuracy of 83.3%.
arXiv Detail & Related papers (2024-08-12T14:29:54Z)
- Automated Classification of Dry Bean Varieties Using XGBoost and SVM Models [0.0]
This paper presents a comparative study on the automated classification of seven different varieties of dry beans using machine learning models.
The XGBoost and SVM models achieved overall correct classification rates of 94.00% and 94.39%, respectively.
This study contributes to the growing body of work on precision agriculture, demonstrating that automated systems can significantly support seed quality control and crop yield optimization.
arXiv Detail & Related papers (2024-08-02T13:05:33Z)
- A Weighted K-Center Algorithm for Data Subset Selection [70.49696246526199]
Subset selection is a fundamental problem that can play a key role in identifying smaller portions of the training data.
We develop a novel factor 3-approximation algorithm to compute subsets based on the weighted sum of both k-center and uncertainty sampling objective functions.
arXiv Detail & Related papers (2023-12-17T04:41:07Z)
- EKGNet: A 10.96µW Fully Analog Neural Network for Intra-Patient Arrhythmia Classification [79.7946379395238]
We present an integrated approach by combining analog computing and deep learning for electrocardiogram (ECG) arrhythmia classification.
We propose EKGNet, a hardware-efficient and fully analog arrhythmia classification architecture that achieves high accuracy with low power consumption.
arXiv Detail & Related papers (2023-10-24T02:37:49Z)
- Machine Learning-Assisted Pattern Recognition Algorithms for Estimating Ultimate Tensile Strength in Fused Deposition Modeled Polylactic Acid Specimens [0.0]
We investigate the application of supervised machine learning algorithms for estimating the Ultimate Tensile Strength (UTS) of Polylactic Acid (PLA) specimens fabricated using the Fused Deposition Modeling (FDM) process.
The primary objective was to assess the accuracy and effectiveness of four distinct supervised classification algorithms, namely Logistic Classification, Gradient Boosting Classification, Decision Tree, and K-Nearest Neighbor.
The results revealed that while the Decision Tree and K-Nearest Neighbor algorithms both achieved an F1 score of 0.71, the KNN algorithm exhibited a higher Area Under the Curve (AUC) score of 0.79, outperforming the other algorithms.
arXiv Detail & Related papers (2023-07-13T11:10:22Z)
- Making Machine Learning Datasets and Models FAIR for HPC: A Methodology and Case Study [0.0]
The FAIR Guiding Principles aim to improve the findability, accessibility, interoperability, and reusability of digital content by making them both human and machine actionable.
These principles have not yet been broadly adopted in the domain of machine learning-based program analyses and optimizations for High-Performance Computing.
We design a methodology to make HPC datasets and machine learning models FAIR after investigating existing FAIRness assessment and improvement techniques.
arXiv Detail & Related papers (2022-11-03T18:45:46Z)
- Lung Cancer Lesion Detection in Histopathology Images Using Graph-Based Sparse PCA Network [93.22587316229954]
We propose a graph-based sparse principal component analysis (GS-PCA) network for automated detection of cancerous lesions on histological lung slides stained by hematoxylin and eosin (H&E).
We evaluate the performance of the proposed algorithm on H&E slides obtained from an SVM K-rasG12D lung cancer mouse model using precision/recall rates, F-score, Tanimoto coefficient, and area under the curve (AUC) of the receiver operating characteristic (ROC).
arXiv Detail & Related papers (2021-10-27T19:28:36Z)
- Doing Great at Estimating CATE? On the Neglected Assumptions in Benchmark Comparisons of Treatment Effect Estimators [91.3755431537592]
We show that even in arguably the simplest setting, estimation under ignorability assumptions can be misleading.
We consider two popular machine learning benchmark datasets for evaluation of heterogeneous treatment effect estimators.
We highlight that the inherent characteristics of the benchmark datasets favor some algorithms over others.
arXiv Detail & Related papers (2021-07-28T13:21:27Z)
- Deep Representational Similarity Learning for analyzing neural signatures in task-based fMRI dataset [81.02949933048332]
This paper develops Deep Representational Similarity Learning (DRSL), a deep extension of Representational Similarity Analysis (RSA).
DRSL is appropriate for analyzing similarities between various cognitive tasks in fMRI datasets with a large number of subjects.
arXiv Detail & Related papers (2020-09-28T18:30:14Z)
- Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.