Ensemble Methods for Robust Support Vector Machines using Integer
Programming
- URL: http://arxiv.org/abs/2203.01606v1
- Date: Thu, 3 Mar 2022 10:03:54 GMT
- Title: Ensemble Methods for Robust Support Vector Machines using Integer
Programming
- Authors: Jannis Kurtz
- Abstract summary: We study binary classification problems where we assume that our training data is subject to uncertainty.
To tackle this issue, robust machine learning aims to develop models which are robust against small perturbations in the training data.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work we study binary classification problems where we assume that our
training data is subject to uncertainty, i.e., the precise data points are not
known. To tackle this issue, robust machine learning aims to develop models
which are robust against small perturbations in the training data. We study
robust support vector machines (SVMs) and extend the classical approach by an
ensemble method which iteratively solves a non-robust SVM on different
perturbations of the dataset, where the perturbations are derived from an
adversarial problem. Afterwards, to classify an unknown data point, we perform
a majority vote over all calculated SVM solutions. We study three different
variants of the adversarial problem: the exact problem, a relaxed variant and
an efficient heuristic variant. While the exact and the relaxed variants can be
modeled using integer programming formulations, the heuristic one can be
implemented by an easy and efficient algorithm. All derived methods are tested
on random and realistic datasets, and the results indicate that the derived
ensemble methods behave much more stably than the classical robust SVM model
when the protection level is changed.
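The ensemble loop described in the abstract (fit a non-robust SVM, perturb the data adversarially, refit, then take a majority vote) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the batch subgradient trainer, the hyperparameters, and above all the adversary, which simply moves each point a fixed distance `radius` toward the wrong side of the most recent hyperplane. This stands in for the paper's exact, relaxed and heuristic adversarial problems and is not a reimplementation of any of them.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=300):
    """Non-robust soft-margin linear SVM fitted by batch subgradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:  # point violates the margin
                grad_w = [g - yi * xj / n for g, xj in zip(grad_w, xi)]
                grad_b -= yi / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def adversarial_shift(X, y, w, radius):
    """ASSUMPTION: a simple stand-in adversary, not the paper's formulation.

    Each point is moved `radius` along -y * w / ||w||, i.e. toward the
    wrong side of the hyperplane with normal vector w.
    """
    norm = math.sqrt(dot(w, w)) or 1.0  # avoid division by zero
    return [[xj - radius * yi * wj / norm for xj, wj in zip(xi, w)]
            for xi, yi in zip(X, y)]


def fit_ensemble(X, y, radius=0.5, rounds=5):
    """Refit a non-robust SVM on perturbations of the original data."""
    models = [train_linear_svm(X, y)]
    for _ in range(rounds - 1):
        w, _b = models[-1]
        models.append(train_linear_svm(adversarial_shift(X, y, w, radius), y))
    return models


def predict(models, x):
    """Majority vote over all ensemble members (ties go to +1)."""
    votes = sum(1 if dot(w, x) + b >= 0 else -1 for w, b in models)
    return 1 if votes >= 0 else -1


# Toy data: two well-separated clusters with labels +1 and -1.
X = [[2.0, 2.0], [3.0, 2.0], [2.0, 3.0],
     [-2.0, -2.0], [-3.0, -2.0], [-2.0, -3.0]]
y = [1, 1, 1, -1, -1, -1]
models = fit_ensemble(X, y)
print(predict(models, [2.5, 2.5]), predict(models, [-2.5, -2.5]))
```

In this sketch `radius` plays the role of the protection level. An odd number of `rounds` avoids tied votes.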
Related papers
- Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
(2024-11-21T10:26:17Z)
- Computation-Aware Gaussian Processes: Model Selection And Linear-Time Inference [55.150117654242706]
We show that model selection for computation-aware GPs trained on 1.8 million data points can be done within a few hours on a single GPU.
As a result of this work, Gaussian processes can be trained on large-scale datasets without significantly compromising their ability to quantify uncertainty.
(2024-11-01T21:11:48Z)
- Cost-sensitive probabilistic predictions for support vector machines [1.743685428161914]
Support vector machines (SVMs) are widely used and constitute one of the best examined and used machine learning models.
We propose a novel approach to generate probabilistic outputs for the SVM.
(2023-10-09T11:00:17Z)
- Towards Robust Dataset Learning [90.2590325441068]
We propose a principled, tri-level optimization to formulate the robust dataset learning problem.
Under an abstraction model that characterizes robust vs. non-robust features, the proposed method provably learns a robust dataset.
(2022-11-19T17:06:10Z)
- An Instance Selection Algorithm for Big Data in High imbalanced datasets based on LSH [0.0]
Training machine learning models in real contexts often involves big data sets and imbalanced samples in which the class of interest is underrepresented.
This work proposes three new methods for instance selection (IS) to be able to deal with large and imbalanced data sets.
Algorithms were developed in the Apache Spark framework, guaranteeing their scalability.
(2022-10-09T17:38:41Z)
- A Novel Plug-and-Play Approach for Adversarially Robust Generalization [38.72514422694518]
We propose a robust framework that employs adversarially robust training to safeguard the ML models against perturbed testing data.
Our contributions can be seen from both computational and statistical perspectives.
(2022-08-19T17:02:55Z)
- Primal Estimated Subgradient Solver for SVM for Imbalanced Classification [0.0]
We aim to demonstrate that our cost sensitive PEGASOS SVM achieves good performance on imbalanced data sets with a Majority to Minority Ratio ranging from 8.6:1 to 130:1.
We evaluate the performance by examining the learning curves.
We benchmark the results of our cost-sensitive PEGASOS SVM against those of Ding's LINEAR SVM DECIDL method.
(2022-06-19T02:33:14Z)
- Meta-learning One-class Classifiers with Eigenvalue Solvers for Supervised Anomaly Detection [55.888835686183995]
We propose a neural network-based meta-learning method for supervised anomaly detection.
We experimentally demonstrate that the proposed method achieves better performance than existing anomaly detection and few-shot learning methods.
(2021-03-01T01:43:04Z)
- Data-Driven Robust Optimization using Unsupervised Deep Learning [0.0]
We show that a trained neural network can be integrated into a robust optimization model by formulating the adversarial problem as a convex mixed-integer program.
We find that this approach outperforms a similar approach using kernel-based support vector sets.
(2020-11-19T11:06:54Z)
- Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
(2020-07-07T18:25:25Z)
- Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments.
We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data.
Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
(2020-02-20T15:00:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.