Related papers: Efficient Learning of Minimax Risk Classifiers in High Dimensions

Efficient Learning of Minimax Risk Classifiers in High Dimensions

URL: http://arxiv.org/abs/2306.06649v1
Date: Sun, 11 Jun 2023 11:08:20 GMT
Title: Efficient Learning of Minimax Risk Classifiers in High Dimensions
Authors: Kartheek Bondugula and Santiago Mazuelas and Aritz P\'erez
Abstract summary: High-dimensional data is common in multiple areas, such as health care and genomics, where the number of features can be tens of thousands. In this paper, we leverage such methods to obtain an efficient learning algorithm for the recently proposed minimax risk classifiers. Experiments on multiple high-dimensional datasets show that the proposed algorithm is efficient in high-dimensional scenarios.
Score: 3.093890460224435
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: High-dimensional data is common in multiple areas, such as health care and genomics, where the number of features can be tens of thousands. In such scenarios, the large number of features often leads to inefficient learning. Constraint generation methods have recently enabled efficient learning of L1-regularized support vector machines (SVMs). In this paper, we leverage such methods to obtain an efficient learning algorithm for the recently proposed minimax risk classifiers (MRCs). The proposed iterative algorithm also provides a sequence of worst-case error probabilities and performs feature selection. Experiments on multiple high-dimensional datasets show that the proposed algorithm is efficient in high-dimensional scenarios. In addition, the worst-case error probability provides useful information about the classifier performance, and the features selected by the algorithm are competitive with the state-of-the-art.

Related papers

Machine Learning Training Optimization using the Barycentric Correction Procedure [0.0]
This study proposes combining machine learning algorithms with an efficient methodology known as the barycentric correction procedure (BCP) It was found that this combination provides significant benefits related to time in synthetic and real data without losing accuracy when the number of instances and dimensions increases.
arXiv Detail & Related papers (2024-03-01T13:56:36Z)
Gauge-optimal approximate learning for small data classification problems [0.0]
Small data learning problems are characterized by a discrepancy between the limited amount of response variable observations and the large feature space dimension. We propose the Gauge- Optimal Approximate Learning (GOAL) algorithm, which provides an analytically tractable joint solution to the reduction dimension, feature segmentation and classification problems. GOAL has been compared to other state-of-the-art machine learning (ML) tools on both synthetic data and challenging real-world applications from climate science and bioinformatics.
arXiv Detail & Related papers (2023-10-29T16:46:05Z)
Spectral Entry-wise Matrix Estimation for Low-Rank Reinforcement Learning [53.445068584013896]
We study matrix estimation problems arising in reinforcement learning (RL) with low-rank structure. In low-rank bandits, the matrix to be recovered specifies the expected arm rewards, and for low-rank Markov Decision Processes (MDPs), it may for example characterize the transition kernel of the MDP. We show that simple spectral-based matrix estimation approaches efficiently recover the singular subspaces of the matrix and exhibit nearly-minimal entry-wise error.
arXiv Detail & Related papers (2023-10-10T17:06:41Z)
Best-Subset Selection in Generalized Linear Models: A Fast and Consistent Algorithm via Splicing Technique [0.6338047104436422]
Best subset section has been widely regarded as the Holy Grail of problems of this type. We proposed and illustrated an algorithm for best subset recovery in mild conditions. Our implementation achieves approximately a fourfold speedup compared to popular variable selection toolkits.
arXiv Detail & Related papers (2023-08-01T03:11:31Z)
Provably Efficient Representation Learning with Tractable Planning in Low-Rank POMDP [81.00800920928621]
We study representation learning in partially observable Markov Decision Processes (POMDPs) We first present an algorithm for decodable POMDPs that combines maximum likelihood estimation (MLE) and optimism in the face of uncertainty (OFU) We then show how to adapt this algorithm to also work in the broader class of $gamma$-observable POMDPs.
arXiv Detail & Related papers (2023-06-21T16:04:03Z)
Large-scale Optimization of Partial AUC in a Range of False Positive Rates [51.12047280149546]
The area under the ROC curve (AUC) is one of the most widely used performance measures for classification models in machine learning. We develop an efficient approximated gradient descent method based on recent practical envelope smoothing technique. Our proposed algorithm can also be used to minimize the sum of some ranked range loss, which also lacks efficient solvers.
arXiv Detail & Related papers (2022-03-03T03:46:18Z)
Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem which estimates signals integer from linear models. The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning. We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal estimation for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z)
Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_2,p$-norm regularization. We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z)
Run2Survive: A Decision-theoretic Approach to Algorithm Selection based on Survival Analysis [75.64261155172856]
survival analysis (SA) naturally supports censored data and offers appropriate ways to use such data for learning distributional models of algorithm runtime. We leverage such models as a basis of a sophisticated decision-theoretic approach to algorithm selection, which we dub Run2Survive. In an extensive experimental study with the standard benchmark ASlib, our approach is shown to be highly competitive and in many cases even superior to state-of-the-art AS approaches.
arXiv Detail & Related papers (2020-07-06T15:20:17Z)
A novel embedded min-max approach for feature selection in nonlinear support vector machine classification [0.0]
We propose an embedded feature selection method based on a min-max optimization problem. By leveraging duality theory, we equivalently reformulate the min-max problem and solve it without further ado. The efficiency and usefulness of our approach are tested on several benchmark data sets.
arXiv Detail & Related papers (2020-04-21T09:40:38Z)
IVFS: Simple and Efficient Feature Selection for High Dimensional Topology Preservation [33.424663018395684]
We propose a simple and effective feature selection algorithm to enhance sample similarity preservation. The proposed algorithm is able to well preserve the pairwise distances, as well as topological patterns, of the full data.
arXiv Detail & Related papers (2020-04-02T23:05:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.