Behavioral analysis of support vector machine classifier with Gaussian
kernel and imbalanced data
- URL: http://arxiv.org/abs/2007.05042v1
- Date: Thu, 9 Jul 2020 19:28:25 GMT
- Title: Behavioral analysis of support vector machine classifier with Gaussian
kernel and imbalanced data
- Authors: Alaa Tharwat
- Abstract summary: The behavior of the SVM classification model is analyzed when these parameters take different values with balanced and imbalanced data.
From this analysis, we propose a novel search algorithm.
In our algorithm, we search for the optimal SVM parameters in two one-dimensional spaces instead of one two-dimensional space.
- Score: 6.028247638616058
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The parameters of support vector machines (SVMs) such as the penalty
parameter and the kernel parameters have a great impact on the classification
accuracy and the complexity of the SVM model. Therefore, the model selection in
SVM involves the tuning of these parameters. However, these parameters are
usually tuned and used as a black box, without understanding the mathematical
background or internal details. In this paper, the behavior of the SVM
classification model is analyzed when these parameters take different values
with balanced and imbalanced data. The analysis includes visualizations,
mathematical and geometrical interpretations, and illustrative numerical
examples, with the aim of providing the basics of the Gaussian and linear
kernel functions with SVM. From this analysis, we propose a novel search
algorithm in which we search for the optimal SVM parameters in two
one-dimensional spaces instead of one two-dimensional space. This reduces the
computational time significantly. Moreover, in our algorithm, the range of the
kernel parameter can be estimated from an analysis of the data, which further
shrinks the search space and hence the required computational time. Different
experiments were conducted to evaluate our search algorithm on various
balanced and imbalanced datasets. The results demonstrate that the proposed
strategy is faster and more effective than other search strategies.
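As a rough illustration of the two one-dimensional searches, the sketch below (a minimal sketch, not the authors' implementation; the dataset, the candidate grid, the fixed anchor value C=1.0, and cross-validated accuracy as the selection criterion are all assumptions) contrasts the cost of a 1D+1D search with a full 2D grid:

```python
# Hedged sketch: two 1-D searches for (C, gamma) instead of one 2-D grid.
# The anchor value, grid, and dataset are illustrative assumptions, not the
# paper's actual procedure.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Imbalanced toy data (80/20 class weights), mirroring the paper's setting.
X, y = make_classification(n_samples=300, weights=[0.8, 0.2], random_state=0)
grid = np.logspace(-3, 3, 13)  # candidate values for both C and gamma

def cv_acc(C, gamma):
    return cross_val_score(SVC(C=C, gamma=gamma, kernel="rbf"), X, y, cv=5).mean()

# Step 1: fix C at a default value and search gamma along one axis.
best_gamma = max(grid, key=lambda g: cv_acc(C=1.0, gamma=g))
# Step 2: fix the chosen gamma and search C along the other axis.
best_C = max(grid, key=lambda c: cv_acc(C=c, gamma=best_gamma))

print(f"C={best_C:.3g}, gamma={best_gamma:.3g}: "
      f"{2 * len(grid)} evaluations vs {len(grid) ** 2} for a full grid")
```

With 13 candidates per axis, the sequential search costs 26 parameter evaluations instead of 169, which is where the claimed reduction in computational time comes from.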
Related papers
- Stochastic Optimization for Non-convex Problem with Inexact Hessian
Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) methods and adaptive regularization using cubics (ARC) have proven to have very appealing theoretical properties.
We show that TR and ARC methods can simultaneously accommodate inexact computations of the Hessian, gradient, and function values.
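For intuition only, here is a minimal trust-region iteration that keeps working when the Hessian is available only inexactly; the Rosenbrock objective, the symmetric noise model, the Cauchy-point subproblem solver, and the acceptance thresholds are assumptions, not the paper's algorithm:

```python
# Hedged sketch of a trust-region (TR) step that tolerates an inexact Hessian.
import numpy as np

def f(x):  # Rosenbrock test function (an assumed objective)
    return 100.0 * (x[1] - x[0] ** 2) ** 2 + (1.0 - x[0]) ** 2

def grad(x):
    return np.array([-400.0 * x[0] * (x[1] - x[0] ** 2) - 2.0 * (1.0 - x[0]),
                     200.0 * (x[1] - x[0] ** 2)])

def inexact_hessian(x, rng, noise=0.1):
    H = np.array([[1200.0 * x[0] ** 2 - 400.0 * x[1] + 2.0, -400.0 * x[0]],
                  [-400.0 * x[0], 200.0]])
    E = rng.normal(scale=noise, size=(2, 2))
    return H + (E + E.T) / 2.0  # symmetric perturbation models the inexactness

rng = np.random.default_rng(0)
x, radius = np.array([-1.2, 1.0]), 1.0
for _ in range(500):
    g, B = grad(x), inexact_hessian(x, rng)
    gBg = g @ B @ g
    tau = 1.0 if gBg <= 0 else min(np.linalg.norm(g) ** 3 / (radius * gBg), 1.0)
    p = -tau * radius / np.linalg.norm(g) * g  # Cauchy-point step
    pred = -(g @ p) - 0.5 * (p @ B @ p)        # predicted model decrease
    rho = (f(x) - f(x + p)) / pred             # actual/predicted agreement
    if rho > 0.1:
        x = x + p                              # accept the step
    if rho > 0.75:
        radius *= 2.0                          # model is trustworthy: expand
    elif rho < 0.1:
        radius *= 0.25                         # poor agreement: shrink
print("final point:", x, "f(x) =", f(x))
```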
arXiv Detail & Related papers (2023-10-18T10:29:58Z)
- A Framework for Fast and Stable Representations of Multiparameter Persistent Homology Decompositions [2.76240219662896]
We introduce a new general representation framework that leverages recent results on decompositions of multiparameter persistent homology.
We establish theoretical stability guarantees under this framework as well as efficient algorithms for practical computation.
We validate our stability results and algorithms with numerical experiments that demonstrate statistical convergence, prediction accuracy, and fast running times on several real data sets.
arXiv Detail & Related papers (2023-06-19T21:28:53Z)
- Separability and Scatteredness (S&S) Ratio-Based Efficient SVM Regularization Parameter, Kernel, and Kernel Parameter Selection [10.66048003460524]
Support Vector Machine (SVM) is a robust machine learning algorithm with broad applications in classification, regression, and outlier detection.
This work shows that the SVM performance can be modeled as a function of separability and scatteredness (S&S) of the data.
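The paper defines its own S&S ratio; as a hedged stand-in, the sketch below scores datasets with a classic between-class/within-class scatter ratio (a Fisher-style criterion), which captures the same separability-versus-scatteredness intuition:

```python
# Hedged sketch: a scatter-matrix proxy for "separability vs. scatteredness".
# This Fisher-style ratio is an illustrative stand-in, not the paper's S&S.
import numpy as np
from sklearn.datasets import make_classification

def fisher_ratio(X, y):
    mu = X.mean(axis=0)
    between, within = 0.0, 0.0
    for c in np.unique(y):
        Xc = X[y == c]
        between += len(Xc) * np.sum((Xc.mean(axis=0) - mu) ** 2)
        within += np.sum((Xc - Xc.mean(axis=0)) ** 2)
    return between / within  # high -> separable, low -> scattered/overlapping

X, y = make_classification(n_samples=500, class_sep=2.0, random_state=0)
print("easy problem:", round(fisher_ratio(X, y), 3))
X, y = make_classification(n_samples=500, class_sep=0.5, random_state=0)
print("hard problem:", round(fisher_ratio(X, y), 3))
```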
arXiv Detail & Related papers (2023-05-17T13:51:43Z)
- New Equivalences Between Interpolation and SVMs: Kernels and Structured Features [22.231455330003328]
We present a new and flexible analysis framework for proving support vector proliferation (SVP) in an arbitrary reproducing kernel Hilbert space, with a flexible class of generative models for the labels.
We show that SVP occurs in many interesting settings not covered by prior work, and we leverage these results to prove novel generalization results for kernel SVM classification.
arXiv Detail & Related papers (2023-05-03T17:52:40Z)
- Test Set Sizing Via Random Matrix Theory [91.3755431537592]
This paper uses techniques from Random Matrix Theory to find the ideal training-testing data split for a simple linear regression.
It defines "ideal" as satisfying an integrity metric: the empirical model error equals the actual measurement noise.
This paper is the first to solve for the training and test size for any model in a way that is truly optimal.
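The paper derives the split analytically via random matrix theory; the sketch below only simulates the criterion it targets, i.e. that the empirical test error of a least-squares fit approaches the true noise variance, on synthetic data with assumed dimensions and noise level:

```python
# Hedged sketch: checking the "integrity" criterion (test error ~= true noise
# variance) across train/test splits. All sizes and the noise are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, d, sigma = 1000, 50, 0.5
X = rng.normal(size=(n, d))
w = rng.normal(size=d)
y = X @ w + rng.normal(scale=sigma, size=n)

for n_train in (100, 200, 500, 900):
    Xtr, ytr, Xte, yte = X[:n_train], y[:n_train], X[n_train:], y[n_train:]
    w_hat, *_ = np.linalg.lstsq(Xtr, ytr, rcond=None)  # least-squares fit
    mse = np.mean((Xte @ w_hat - yte) ** 2)
    print(f"train={n_train:4d}  test MSE={mse:.4f}  noise var={sigma**2:.4f}")
```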
arXiv Detail & Related papers (2021-12-11T13:18:33Z)
- Weight Vector Tuning and Asymptotic Analysis of Binary Linear Classifiers [82.5915112474988]
This paper proposes weight vector tuning of a generic binary linear classifier through the parameterization of a decomposition of the discriminant by a scalar.
It is also found that weight vector tuning significantly improves the performance of Linear Discriminant Analysis (LDA) under high estimation noise.
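As a hypothetical illustration of tuning a scalar that parameterizes a decomposition of a linear discriminant (the particular split into a mean-difference term and a covariance-whitened term, and selection by validation accuracy, are assumptions, not the paper's construction):

```python
# Hedged sketch: mix two components of a linear discriminant with a tuned
# scalar. The decomposition used here is hypothetical.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=600, n_informative=10, random_state=0)
Xtr, Xva, ytr, yva = train_test_split(X, y, random_state=0)

delta = Xtr[ytr == 1].mean(0) - Xtr[ytr == 0].mean(0)    # mean difference
Sigma = np.cov(Xtr, rowvar=False) + 1e-3 * np.eye(X.shape[1])
w_lda = np.linalg.solve(Sigma, delta)                    # whitened direction
mid = (Xtr[ytr == 1].mean(0) + Xtr[ytr == 0].mean(0)) / 2

def acc(w):  # validation accuracy of the linear rule sign((x - mid) @ w)
    return np.mean(((Xva - mid) @ w > 0) == (yva == 1))

alphas = np.linspace(0.0, 1.0, 21)
best = max(alphas, key=lambda a: acc((1 - a) * delta + a * w_lda))
print(f"best alpha={best:.2f}, "
      f"val acc={acc((1 - best) * delta + best * w_lda):.3f}")
```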
arXiv Detail & Related papers (2021-10-01T17:50:46Z)
- Quantum Algorithms for Data Representation and Analysis [68.754953879193]
We provide quantum procedures that speed-up the solution of eigenproblems for data representation in machine learning.
The power and practical use of these subroutines is shown through new quantum algorithms, sublinear in the input matrix's size, for principal component analysis, correspondence analysis, and latent semantic analysis.
Results show that the run-time parameters that do not depend on the input's size are reasonable and that the error on the computed model is small, allowing for competitive classification performances.
arXiv Detail & Related papers (2021-04-19T00:41:43Z)
- Analysis of Truncated Orthogonal Iteration for Sparse Eigenvector Problems [78.95866278697777]
We propose two variants of the Truncated Orthogonal Iteration to compute multiple leading eigenvectors with sparsity constraints simultaneously.
We then apply our algorithms to solve the sparse principal component analysis problem for a wide range of test datasets.
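A plain variant of such a scheme, sketched under assumptions (hard-thresholding to the s largest entries per column as the truncation rule, and a planted sparse leading eigenvector as the test problem), could look like this:

```python
# Hedged sketch: orthogonal iteration with hard-thresholding for sparse
# leading eigenvectors. Truncation rule and test matrix are assumptions.
import numpy as np

def truncated_orthogonal_iteration(A, k, s, iters=100, seed=0):
    """Top-k eigenvectors of symmetric A, keeping s entries per column."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.normal(size=(A.shape[0], k)))
    for _ in range(iters):
        Z = A @ Q                           # power step
        for j in range(k):                  # truncate: keep s largest entries
            small = np.argsort(np.abs(Z[:, j]))[:-s]
            Z[small, j] = 0.0
        Q, _ = np.linalg.qr(Z)              # re-orthogonalize
    return Q

# Planted sparse leading eigenvector: A = lam * v v^T + symmetric noise.
n, s = 200, 10
v = np.zeros(n)
v[:s] = 1.0 / np.sqrt(s)
rng = np.random.default_rng(1)
N = rng.normal(size=(n, n))
A = 5.0 * np.outer(v, v) + (N + N.T) / (2 * np.sqrt(n))
Q = truncated_orthogonal_iteration(A, k=1, s=s)
print("overlap with planted vector:", abs(v @ Q[:, 0]))
```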
arXiv Detail & Related papers (2021-03-24T23:11:32Z)
- Estimating Average Treatment Effects with Support Vector Machines [77.34726150561087]
Support vector machine (SVM) is one of the most popular classification algorithms in the machine learning literature.
We adapt SVM as a kernel-based weighting procedure that minimizes the maximum mean discrepancy between the treatment and control groups.
We characterize the bias of causal effect estimation arising from this trade-off, connecting the proposed SVM procedure to the existing kernel balancing methods.
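To make the minimized quantity concrete, the sketch below computes the squared maximum mean discrepancy (MMD) between treated and control covariates under an RBF kernel; the kernel width and the toy data are assumptions, and the SVM-based weighting itself is not reproduced:

```python
# Hedged sketch: squared MMD between treatment and control covariates.
import numpy as np

def rbf(A, B, gamma=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def mmd2(Xt, Xc, gamma=0.5):
    # Biased V-statistic estimate of the squared MMD between the two groups.
    return (rbf(Xt, Xt, gamma).mean() - 2 * rbf(Xt, Xc, gamma).mean()
            + rbf(Xc, Xc, gamma).mean())

rng = np.random.default_rng(0)
treated = rng.normal(loc=0.5, size=(100, 3))  # covariate shift between groups
control = rng.normal(loc=0.0, size=(120, 3))
print("squared MMD (shifted)  :", round(mmd2(treated, control), 4))
print("squared MMD (same dist):",
      round(mmd2(rng.normal(size=(100, 3)), rng.normal(size=(120, 3))), 4))
```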
arXiv Detail & Related papers (2021-02-23T20:22:56Z)
- AML-SVM: Adaptive Multilevel Learning with Support Vector Machines [0.0]
This paper proposes an adaptive multilevel learning framework for the nonlinear SVM.
It improves the classification quality across the refinement process, and leverages multi-threaded parallel processing for better performance.
arXiv Detail & Related papers (2020-11-05T00:17:02Z)
- Improved SVRG for quadratic functions [0.0]
We analyse an iterative algorithm to minimize quadratic functions whose Hessian matrix $H$ is the expectation of a random symmetric $d \times d$ matrix.
In several applications, including least-squares regressions, ridge regressions, linear discriminant analysis and regularized linear discriminant analysis, the running time of each iteration is proportional to $d$.
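For reference, here is vanilla SVRG on a least-squares instance of such a quadratic problem; the step size, epoch length, and problem sizes are assumptions, and the paper's improved variant is not reproduced:

```python
# Hedged sketch: vanilla SVRG on a least-squares (quadratic) objective.
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 20
A = rng.normal(size=(n, d))
b = A @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

def grad_i(x, i):  # stochastic gradient of 0.5 * (a_i @ x - b_i)^2
    return (A[i] @ x - b[i]) * A[i]

x_tilde = np.zeros(d)
eta, m = 0.01, 2 * n  # step size and inner-loop length (assumed)
for epoch in range(30):
    full_grad = A.T @ (A @ x_tilde - b) / n  # anchor gradient at the snapshot
    x = x_tilde.copy()
    for _ in range(m):
        i = rng.integers(n)
        # Variance-reduced update: correct the stochastic gradient with the
        # snapshot gradient at the same sample plus the full anchor gradient.
        x -= eta * (grad_i(x, i) - grad_i(x_tilde, i) + full_grad)
    x_tilde = x  # new snapshot

x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
print("distance to least-squares solution:", np.linalg.norm(x_tilde - x_star))
```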
arXiv Detail & Related papers (2020-06-01T15:28:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.