A Mathematical Programming approach to Binary Supervised Classification
with Label Noise
- URL: http://arxiv.org/abs/2004.10170v1
- Date: Tue, 21 Apr 2020 17:25:54 GMT
- Title: A Mathematical Programming approach to Binary Supervised Classification
with Label Noise
- Authors: Víctor Blanco, Alberto Japón and Justo Puerto
- Abstract summary: We propose novel methodologies to construct Support Vector Machine-based classifiers that account for label noise in the training sample.
The first method incorporates relabeling directly in the SVM model.
A second family of methods combines clustering with classification, giving rise to a model that simultaneously applies similarity measures and SVM.
- Score: 1.2031796234206138
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we propose novel methodologies to construct Support Vector
Machine-based classifiers that take into account that label noise occurs in
the training sample. We propose different alternatives based on solving Mixed
Integer Linear and Nonlinear models by incorporating decisions on relabeling
some of the observations in the training dataset. The first method incorporates
relabeling directly in the SVM model while a second family of methods combines
clustering with classification at the same time, giving rise to a model that
simultaneously applies similarity measures and SVM. Extensive computational
experiments are reported based on a battery of standard datasets taken from the UCI
Machine Learning Repository, showing the effectiveness of the proposed
approaches.
Related papers
- Adaptive Transfer Clustering: A Unified Framework [2.3144964550307496]
We propose an adaptive transfer clustering (ATC) algorithm that automatically leverages the commonality in the presence of unknown discrepancy.
It applies to a broad class of statistical models including Gaussian mixture models, block models, and latent class models.
arXiv Detail & Related papers (2024-10-28T17:57:06Z) - Dual-Decoupling Learning and Metric-Adaptive Thresholding for Semi-Supervised Multi-Label Learning [81.83013974171364]
Semi-supervised multi-label learning (SSMLL) is a powerful framework for leveraging unlabeled data to reduce the expensive cost of collecting precise multi-label annotations.
Unlike semi-supervised learning, one cannot select the most probable label as the pseudo-label in SSMLL due to multiple semantics contained in an instance.
We propose a dual-perspective method to generate high-quality pseudo-labels.
arXiv Detail & Related papers (2024-07-26T09:33:53Z) - Methods for Class-Imbalanced Learning with Support Vector Machines: A Review and an Empirical Evaluation [22.12895887111828]
We introduce a hierarchical categorization of SVM-based models with respect to class-imbalanced learning.
We compare the performances of various representative SVM-based models in each category using benchmark imbalanced data sets.
Our findings reveal that while algorithmic methods are less time-consuming owing to no data pre-processing requirements, fusion methods, which combine both re-sampling and algorithmic approaches, generally perform the best.
arXiv Detail & Related papers (2024-06-05T15:55:08Z) - Combining X-Vectors and Bayesian Batch Active Learning: Two-Stage Active Learning Pipeline for Speech Recognition [0.0]
This paper introduces a novel two-stage active learning pipeline for automatic speech recognition (ASR).
The first stage utilizes unsupervised AL by using x-vectors clustering for diverse sample selection from unlabeled speech data.
The second stage incorporates a supervised AL strategy, with a batch AL method specifically developed for ASR.
arXiv Detail & Related papers (2024-05-03T19:24:41Z) - Rethinking Clustering-Based Pseudo-Labeling for Unsupervised
Meta-Learning [146.11600461034746]
CACTUs, a method for unsupervised meta-learning, is a clustering-based approach with pseudo-labeling.
This approach is model-agnostic and can be combined with supervised algorithms to learn from unlabeled data.
We prove that the core reason for this is lack of a clustering-friendly property in the embedding space.
arXiv Detail & Related papers (2022-09-27T19:04:36Z) - Federated Learning Aggregation: New Robust Algorithms with Guarantees [63.96013144017572]
Federated learning has been recently proposed for distributed model training at the edge.
This paper presents a complete general mathematical convergence analysis to evaluate aggregation strategies in a federated learning framework.
We derive novel aggregation algorithms which are able to modify their model architecture by differentiating client contributions according to the value of their losses.
arXiv Detail & Related papers (2022-05-22T16:37:53Z) - Ensemble Classifier Design Tuned to Dataset Characteristics for Network
Intrusion Detection [0.0]
Two new algorithms are proposed to address the class overlap issue in the dataset.
The proposed design is evaluated for both binary and multi-category classification.
arXiv Detail & Related papers (2022-05-08T21:06:42Z) - Visualizing Classifier Adjacency Relations: A Case Study in Speaker
Verification and Voice Anti-Spoofing [72.4445825335561]
We propose a simple method to derive 2D representation from detection scores produced by an arbitrary set of binary classifiers.
Based upon rank correlations, our method facilitates a visual comparison of classifiers with arbitrary scores.
While the approach is fully versatile and can be applied to any detection task, we demonstrate the method using scores produced by automatic speaker verification and voice anti-spoofing systems.
arXiv Detail & Related papers (2021-06-11T13:03:33Z) - Improving Deep Learning Sound Events Classifiers using Gram Matrix
Feature-wise Correlations [1.2891210250935146]
In our method, we analyse all the activations of a generic CNN in order to produce feature representations using Gram Matrices.
The proposed approach can be applied to any CNN and our experimental evaluation of four different architectures on two datasets demonstrated that our method consistently improves the baseline models.
arXiv Detail & Related papers (2021-02-23T16:08:02Z) - Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z) - Unsupervised Domain Adaptation for Acoustic Scene Classification Using
Band-Wise Statistics Matching [69.24460241328521]
Machine learning algorithms can be negatively affected by mismatches between training (source) and test (target) data distributions.
We propose an unsupervised domain adaptation method that consists of aligning the first- and second-order sample statistics of each frequency band of target-domain acoustic scenes to the ones of the source-domain training dataset.
We show that the proposed method outperforms the state-of-the-art unsupervised methods found in the literature in terms of both source- and target-domain classification accuracy.
arXiv Detail & Related papers (2020-04-30T23:56:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.