Robust Unsupervised Learning via L-Statistic Minimization
- URL: http://arxiv.org/abs/2012.07399v3
- Date: Thu, 18 Feb 2021 20:50:27 GMT
- Title: Robust Unsupervised Learning via L-Statistic Minimization
- Authors: Andreas Maurer, Daniela A. Parletta, Andrea Paudice, Massimiliano
Pontil
- Abstract summary: We present a general approach to this problem focusing on unsupervised learning.
The key assumption is that the perturbing distribution is characterized by larger losses relative to a given class of admissible models.
We prove uniform convergence bounds with respect to the proposed criterion for several popular models in unsupervised learning.
- Score: 38.49191945141759
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Designing learning algorithms that are resistant to perturbations of the
underlying data distribution is a problem of wide practical and theoretical
importance. We present a general approach to this problem focusing on
unsupervised learning. The key assumption is that the perturbing distribution
is characterized by larger losses relative to a given class of admissible
models. This is exploited by a general descent algorithm which minimizes an
$L$-statistic criterion over the model class, weighting small losses more. Our
analysis characterizes the robustness of the method in terms of bounds on the
reconstruction error relative to the underlying unperturbed distribution. As a
byproduct, we prove uniform convergence bounds with respect to the proposed
criterion for several popular models in unsupervised learning, a result which
may be of independent interest.Numerical experiments with kmeans clustering and
principal subspace analysis demonstrate the effectiveness of our approach.
Related papers
- On Discriminative Probabilistic Modeling for Self-Supervised Representation Learning [85.75164588939185]
We study the discriminative probabilistic modeling problem on a continuous domain for (multimodal) self-supervised representation learning.
We conduct generalization error analysis to reveal the limitation of current InfoNCE-based contrastive loss for self-supervised representation learning.
arXiv Detail & Related papers (2024-10-11T18:02:46Z) - On the KL-Divergence-based Robust Satisficing Model [2.425685918104288]
robustness satisficing framework has attracted increasing attention from academia.
We present analytical interpretations, diverse performance guarantees, efficient and stable numerical methods, convergence analysis, and an extension tailored for hierarchical data structures.
We demonstrate the superior performance of our model compared to state-of-the-art benchmarks.
arXiv Detail & Related papers (2024-08-17T10:05:05Z) - Random Matrix Analysis to Balance between Supervised and Unsupervised
Learning under the Low Density Separation Assumption [9.620832983703863]
We introduce QLDS, a linear classification model, where the low density separation assumption is implemented via quadratic margin.
We show that particular cases of our algorithm are the least-square support vector machine in the supervised case, the spectral clustering in the fully unsupervised regime, and a class of semi-supervised graph-based approaches.
arXiv Detail & Related papers (2023-10-20T11:46:12Z) - A Unified Generalization Analysis of Re-Weighting and Logit-Adjustment
for Imbalanced Learning [129.63326990812234]
We propose a technique named data-dependent contraction to capture how modified losses handle different classes.
On top of this technique, a fine-grained generalization bound is established for imbalanced learning, which helps reveal the mystery of re-weighting and logit-adjustment.
arXiv Detail & Related papers (2023-10-07T09:15:08Z) - Robust Distributed Learning: Tight Error Bounds and Breakdown Point
under Data Heterogeneity [11.2120847961379]
We consider in this paper a more realistic heterogeneity model, namely (G,B)-gradient dissimilarity, and show that it covers a larger class of learning problems than existing theory.
We also prove a new lower bound on the learning error of any distributed learning algorithm.
arXiv Detail & Related papers (2023-09-24T09:29:28Z) - Fine-grained analysis of non-parametric estimation for pairwise learning [9.676007573960383]
We are concerned with the generalization performance of non-parametric estimation for pairwise learning.
Our results can be used to handle a wide range of pairwise learning problems including ranking, AUC, pairwise regression and metric and similarity learning.
arXiv Detail & Related papers (2023-05-31T08:13:14Z) - MaxMatch: Semi-Supervised Learning with Worst-Case Consistency [149.03760479533855]
We propose a worst-case consistency regularization technique for semi-supervised learning (SSL)
We present a generalization bound for SSL consisting of the empirical loss terms observed on labeled and unlabeled training data separately.
Motivated by this bound, we derive an SSL objective that minimizes the largest inconsistency between an original unlabeled sample and its multiple augmented variants.
arXiv Detail & Related papers (2022-09-26T12:04:49Z) - Outlier-Robust Learning of Ising Models Under Dobrushin's Condition [57.89518300699042]
We study the problem of learning Ising models satisfying Dobrushin's condition in the outlier-robust setting where a constant fraction of the samples are adversarially corrupted.
Our main result is to provide the first computationally efficient robust learning algorithm for this problem with near-optimal error guarantees.
arXiv Detail & Related papers (2021-02-03T18:00:57Z) - Accounting for Unobserved Confounding in Domain Generalization [107.0464488046289]
This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets.
Part of the challenge of learning robust models lies in the influence of unobserved confounders.
We demonstrate the empirical performance of our approach on healthcare data from different modalities.
arXiv Detail & Related papers (2020-07-21T08:18:06Z) - Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $varepsilon*$, which deviates substantially from the test error of worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.