Feature subset selection for kernel SVM classification via mixed-integer optimization
- URL: http://arxiv.org/abs/2205.14325v1
- Date: Sat, 28 May 2022 04:01:40 GMT
- Title: Feature subset selection for kernel SVM classification via mixed-integer optimization
- Authors: Ryuta Tamura, Yuichi Takano, Ryuhei Miyashiro
- Abstract summary: We study the mixed-integer optimization (MIO) approach to feature subset selection in nonlinear kernel support vector machines (SVMs) for binary classification.
First proposed for linear regression in the 1970s, this approach has recently moved into the spotlight with advances in optimization algorithms and computer hardware.
We propose a mixed-integer linear optimization (MILO) formulation based on the kernel-target alignment for feature subset selection, and this MILO problem can be solved to optimality using optimization software.
- Score: 0.7734726150561088
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the mixed-integer optimization (MIO) approach to feature subset
selection in nonlinear kernel support vector machines (SVMs) for binary
classification. First proposed for linear regression in the 1970s, this
approach has recently moved into the spotlight with advances in optimization
algorithms and computer hardware. The goal of this paper is to establish an MIO
approach for selecting the best subset of features for kernel SVM
classification. To measure the performance of subset selection, we use the
kernel-target alignment, which is the distance between the centroids of two
response classes in a high-dimensional feature space. We propose a
mixed-integer linear optimization (MILO) formulation based on the kernel-target
alignment for feature subset selection, and this MILO problem can be solved to
optimality using optimization software. We also derive a reduced version of the
MILO problem to accelerate our MILO computations. Experimental results show
good computational efficiency for our MILO formulation with the reduced
problem. Moreover, our method can often outperform the linear-SVM-based MILO
formulation and recursive feature elimination in prediction performance,
especially when there are relatively few data instances.
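To make the selection criterion concrete, here is a minimal numpy sketch of the centroid-distance (kernel-target alignment) measure restricted to a feature subset. The Gaussian kernel, its bandwidth, and the brute-force enumeration over subsets are illustrative assumptions; the paper replaces the enumeration with a MILO formulation solved to optimality by optimization software.

```python
import itertools
import numpy as np

def rbf_kernel(X, gamma=1.0):
    # Gaussian kernel from pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def centroid_distance(X, y, subset, gamma=1.0):
    """Squared distance between the two class centroids in feature space,
    computed from the kernel restricted to the selected features."""
    K = rbf_kernel(X[:, list(subset)], gamma)
    pos, neg = y == 1, y == -1
    return (K[np.ix_(pos, pos)].mean()
            - 2.0 * K[np.ix_(pos, neg)].mean()
            + K[np.ix_(neg, neg)].mean())

# Brute-force search over all two-feature subsets (illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6))
y = np.where(X[:, 0] + 0.1 * rng.normal(size=40) > 0, 1, -1)
best = max(itertools.combinations(range(6), 2),
           key=lambda s: centroid_distance(X, y, s))
print("best 2-feature subset by centroid distance:", best)
```

The squared centroid distance expands entirely into kernel evaluations, which is what allows the criterion to be expressed over binary feature-selection variables in the MILO model.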
Related papers
- A Stochastic Approach to Bi-Level Optimization for Hyperparameter Optimization and Meta Learning [74.80956524812714]
We tackle the general differentiable meta learning problem that is ubiquitous in modern deep learning.
These problems are often formalized as Bi-Level optimizations (BLO).
We introduce a novel perspective by turning a given BLO problem into a stochastic optimization, where the inner loss function becomes a smooth distribution and the outer loss becomes an expected loss over the inner distribution.
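One way to read that reformulation, sketched in numpy under illustrative assumptions (a 1-D toy problem, a Gibbs smoothing of the inner argmin with temperature tau, and a grid-based expectation):

```python
import numpy as np

def g(theta, lam):   # inner loss, parameterised by the outer variable lam
    return (theta - lam) ** 2

def f(theta):        # outer loss, evaluated at the inner solution
    return (theta - 1.0) ** 2

def outer_objective(lam, tau=0.1):
    # Replace argmin_theta g with a smooth Gibbs distribution p ∝ exp(-g/tau);
    # the outer loss becomes an expectation under that inner distribution.
    thetas = np.linspace(-3.0, 3.0, 2001)
    w = np.exp(-g(thetas, lam) / tau)
    w /= w.sum()                      # normalised Gibbs weights
    return np.sum(w * f(thetas))      # expected outer loss

# As tau -> 0 this collapses to f(argmin_theta g), the original bilevel
# objective; the smoothing makes the outer objective differentiable in lam.
for lam in (0.0, 0.5, 1.0):
    print(lam, outer_objective(lam))
```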
arXiv Detail & Related papers (2024-10-14T12:10:06Z)
- The Stochastic Conjugate Subgradient Algorithm For Kernel Support Vector Machines [1.738375118265695]
This paper proposes an innovative method specifically designed for kernel support vector machines (SVMs).
It not only achieves faster progress per iteration but also exhibits enhanced convergence compared to conventional stochastic first-order (SFO) techniques.
Our experimental results demonstrate that the proposed algorithm not only maintains but potentially exceeds the scalability of SFO methods.
arXiv Detail & Related papers (2024-07-30T17:03:19Z)
- A Preconditioned Interior Point Method for Support Vector Machines Using an ANOVA-Decomposition and NFFT-Based Matrix-Vector Products [0.6445605125467574]
We propose an NFFT-accelerated matrix-vector product using an ANOVA decomposition for the feature space that is used within an interior point method for the overall optimization problem.
We investigate the performance of the different preconditioners as well as the accuracy of the ANOVA kernel on several large-scale datasets.
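The ANOVA-type kernel behind this approach sums low-dimensional Gaussian kernels over small groups ("windows") of features, and the low dimension of each group is what makes NFFT-based fast summation applicable. A plain numpy sketch of the kernel and its matrix-vector product, without the NFFT or interior-point machinery and with illustrative windows and bandwidth:

```python
import numpy as np

def anova_rbf_kernel(X, Y, windows, sigma=1.0):
    """ANOVA-style kernel: a sum of Gaussian kernels, each acting on a small
    group of features. Here the matrices are formed directly; the paper
    accelerates each group's contribution with NFFT-based fast summation."""
    K = np.zeros((X.shape[0], Y.shape[0]))
    for w in windows:
        Xw, Yw = X[:, w], Y[:, w]
        d2 = (np.sum(Xw ** 2, 1)[:, None] + np.sum(Yw ** 2, 1)[None, :]
              - 2.0 * Xw @ Yw.T)
        K += np.exp(-d2 / (2.0 * sigma ** 2))
    return K

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 9))
# Illustrative choice: split nine features into three 3-dimensional windows.
windows = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
v = rng.normal(size=5)
print(anova_rbf_kernel(X, X, windows) @ v)   # the matrix-vector product
```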
arXiv Detail & Related papers (2023-12-01T12:27:11Z)
- DynamoRep: Trajectory-Based Population Dynamics for Classification of Black-box Optimization Problems [0.755972004983746]
We propose a feature extraction method that describes the trajectories of optimization algorithms using simple statistics.
We demonstrate that the proposed DynamoRep features capture enough information to identify the problem class on which the optimization algorithm is running.
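A rough numpy reading of that idea: run a few generations of a toy population-based optimizer and describe the trajectory by simple per-generation statistics. The choice of statistics, the optimizer, and the test problem here are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def trajectory_features(fitness, pop_history):
    """Describe an optimizer run by simple statistics of each generation:
    min/mean/max of the objective values and of every coordinate."""
    feats = []
    for pop in pop_history:
        f = fitness(pop)
        feats += [f.min(), f.mean(), f.max()]
        feats += list(pop.min(0)) + list(pop.mean(0)) + list(pop.max(0))
    return np.array(feats)

# Toy run: an elitist random search on the sphere function.
rng = np.random.default_rng(0)
sphere = lambda P: np.sum(P ** 2, axis=1)
pop, history = rng.normal(size=(20, 3)), []
for _ in range(5):
    history.append(pop.copy())
    elite = pop[np.argsort(sphere(pop))[:5]]          # keep the best 5
    pop = elite[rng.integers(0, 5, 20)] + 0.3 * rng.normal(size=(20, 3))
print(trajectory_features(sphere, history).shape)     # fixed-length vector
```

The fixed-length feature vector can then be fed to any off-the-shelf classifier to predict the problem class.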
arXiv Detail & Related papers (2023-06-08T06:57:07Z)
- A distribution-free mixed-integer optimization approach to hierarchical modelling of clustered and longitudinal data [0.0]
We introduce an innovative algorithm that evaluates cluster effects for new data points, thereby increasing the robustness and precision of this model.
The inferential and predictive efficacy of this approach is further illustrated through its application in student scoring and protein expression.
arXiv Detail & Related papers (2023-02-06T23:34:51Z)
- Late Fusion Multi-view Clustering via Global and Local Alignment Maximization [61.89218392703043]
Multi-view clustering (MVC) optimally integrates complementary information from different views to improve clustering performance.
Most existing approaches directly fuse multiple pre-specified similarities to learn an optimal similarity matrix for clustering.
We propose late fusion MVC via alignment to address these issues.
arXiv Detail & Related papers (2022-08-02T01:49:31Z)
- Handling Imbalanced Classification Problems With Support Vector Machines via Evolutionary Bilevel Optimization [73.17488635491262]
Support vector machines (SVMs) are popular learning algorithms to deal with binary classification problems.
This article introduces EBCS-SVM: evolutionary bilevel cost-sensitive SVMs.
arXiv Detail & Related papers (2022-04-21T16:08:44Z)
- Sparse Quadratic Optimisation over the Stiefel Manifold with Application to Permutation Synchronisation [71.27989298860481]
We address the non-convex optimisation problem of finding a matrix on the Stiefel manifold that maximises a quadratic objective function.
We propose a simple yet effective sparsity-promoting algorithm for finding the dominant eigenspace matrix.
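One generic way to promote sparsity while staying on the Stiefel manifold (matrices with orthonormal columns) is to alternate a power step, soft-thresholding, and re-orthonormalisation. A toy numpy sketch in that spirit, not the paper's exact algorithm:

```python
import numpy as np

def sparse_dominant_eigenspace(A, k, lam=0.05, iters=100, seed=0):
    """Toy orthogonal iteration with soft-thresholding: approximately maximise
    trace(U^T A U) over the Stiefel manifold while nudging U toward sparsity."""
    rng = np.random.default_rng(seed)
    U = np.linalg.qr(rng.normal(size=(A.shape[0], k)))[0]
    for _ in range(iters):
        V = A @ U                                           # power step
        V = np.sign(V) * np.maximum(np.abs(V) - lam, 0.0)   # soft threshold
        U = np.linalg.qr(V)[0]                              # back onto Stiefel
    return U

rng = np.random.default_rng(1)
B = rng.normal(size=(30, 30))
A = B @ B.T                            # symmetric PSD test matrix
U = sparse_dominant_eigenspace(A, k=3)
print(np.round(U.T @ U, 6))            # columns remain orthonormal
```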
arXiv Detail & Related papers (2021-09-30T19:17:35Z)
- Parallel Stochastic Mirror Descent for MDPs [72.75921150912556]
We consider the problem of learning the optimal policy for infinite-horizon Markov decision processes (MDPs).
A variant of Mirror Descent is proposed for convex programming problems with Lipschitz-continuous functionals.
We analyze this algorithm in a general case and obtain an estimate of the convergence rate that does not accumulate errors during the operation of the method.
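For context, the entropic variant of Mirror Descent over the probability simplex (the classic setting for such convergence estimates) fits in a few lines of numpy; the quadratic objective and step size below are illustrative choices:

```python
import numpy as np

def entropic_mirror_descent(grad, x0, steps=200, eta=0.1):
    """Mirror Descent with the entropy mirror map: a multiplicative-weights
    update that keeps every iterate on the probability simplex."""
    x = x0.copy()
    for _ in range(steps):
        x = x * np.exp(-eta * grad(x))  # exponentiated gradient step
        x /= x.sum()                    # renormalise back onto the simplex
    return x

# Minimise a convex quadratic x^T Q x over the simplex.
rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4))
Q = M @ M.T + np.eye(4)
x = entropic_mirror_descent(lambda x: 2.0 * Q @ x, np.full(4, 0.25))
print(x, float(x @ Q @ x))
```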
arXiv Detail & Related papers (2021-02-27T19:28:39Z)
- On the Efficient Implementation of the Matrix Exponentiated Gradient Algorithm for Low-Rank Matrix Optimization [26.858608065417663]
Convex optimization over the spectrahedron has important applications in machine learning, signal processing and statistics.
We propose efficient implementations of MEG, which are tailored for optimization with low-rank matrices, and only use a single low-rank SVD on each iteration.
We also provide efficiently-computable certificates for the correct convergence of our methods.
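The textbook Matrix Exponentiated Gradient (MEG) update over the spectrahedron (unit-trace positive semidefinite matrices) can be sketched with a full eigendecomposition; the paper's contribution is replacing this dense step with a single low-rank SVD per iteration. The linear objective and step size below are illustrative:

```python
import numpy as np

def expm_sym(S):
    # Matrix exponential of a symmetric matrix via eigendecomposition.
    w, V = np.linalg.eigh(S)
    return (V * np.exp(w)) @ V.T

def logm_sym(S):
    # Matrix logarithm of a symmetric positive definite matrix.
    w, V = np.linalg.eigh(S)
    return (V * np.log(w)) @ V.T

def meg_step(X, G, eta=0.3):
    """One MEG step over the spectrahedron:
    X <- expm(logm(X) - eta * G), renormalised to unit trace."""
    Y = expm_sym(logm_sym(X) - eta * G)
    return Y / np.trace(Y)

# Toy objective f(X) = <C, X>, whose gradient is the constant matrix C.
rng = np.random.default_rng(0)
C = rng.normal(size=(5, 5))
C = (C + C.T) / 2.0
X = np.eye(5) / 5.0                  # maximally mixed starting point
for _ in range(30):
    X = meg_step(X, C)
print(np.trace(X), np.trace(C @ X))  # unit trace; objective decreases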
arXiv Detail & Related papers (2020-12-18T19:14:51Z)
- Multi-View Spectral Clustering with High-Order Optimal Neighborhood Laplacian Matrix [57.11971786407279]
Multi-view spectral clustering can effectively reveal the intrinsic cluster structure among data.
This paper proposes a multi-view spectral clustering algorithm that learns a high-order optimal neighborhood Laplacian matrix.
Our proposed algorithm generates the optimal Laplacian matrix by searching the neighborhood of the linear combination of both the first-order and high-order base Laplacian matrices.
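A stripped-down numpy illustration of the ingredient this summary describes: combining first-order and higher-order base Laplacians from several views into one matrix whose bottom eigenvectors give a spectral embedding. The uniform weights, random affinities, and choice of L^2 as the high-order term are illustrative assumptions; the paper instead searches a neighborhood of the combination for an optimal Laplacian.

```python
import numpy as np

def normalized_laplacian(W):
    d = W.sum(1)
    Dinv = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    return np.eye(len(W)) - Dinv @ W @ Dinv

rng = np.random.default_rng(0)
n = 20
views = [rng.random((n, n)) for _ in range(2)]
views = [(W + W.T) / 2.0 for W in views]        # symmetric affinity per view

# Base Laplacians: first-order L_v and a high-order term L_v^2 per view.
bases = []
for W in views:
    L = normalized_laplacian(W)
    bases += [L, L @ L]

# Uniform linear combination of the base Laplacians.
alpha = np.full(len(bases), 1.0 / len(bases))
L_comb = sum(a * L for a, L in zip(alpha, bases))

# Spectral embedding: eigenvectors for the smallest eigenvalues of L_comb.
w, V = np.linalg.eigh(L_comb)
embedding = V[:, :3]                            # feed this to k-means
print(embedding.shape)
```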
arXiv Detail & Related papers (2020-08-31T12:28:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.