An Efficient Variant of One-Class SVM with Lifelong Online Learning Guarantees
- URL: http://arxiv.org/abs/2512.11052v1
- Date: Thu, 11 Dec 2025 19:09:58 GMT
- Title: An Efficient Variant of One-Class SVM with Lifelong Online Learning Guarantees
- Authors: Joe Suk, Samory Kpotufe
- Abstract summary: We introduce SONAR, an efficient SGD-based OCSVM solver with strongly convex regularization. We show novel theoretical guarantees on the Type I/II errors of SONAR, superior to those known for OCSVM. In the more challenging problem of adversarial non-stationary data, we show that SONAR can be used within an ensemble method.
- Score: 11.960178399478721
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study outlier (a.k.a., anomaly) detection for single-pass non-stationary streaming data. In the well-studied offline or batch outlier detection problem, traditional methods such as kernel One-Class SVM (OCSVM) are both computationally heavy and prone to large false-negative (Type II) errors under non-stationarity. To remedy this, we introduce SONAR, an efficient SGD-based OCSVM solver with strongly convex regularization. We show novel theoretical guarantees on the Type I/II errors of SONAR, superior to those known for OCSVM, and further prove that SONAR ensures favorable lifelong learning guarantees under benign distribution shifts. In the more challenging problem of adversarial non-stationary data, we show that SONAR can be used within an ensemble method and equipped with changepoint detection to achieve adaptive guarantees, ensuring small Type I/II errors on each phase of data. We validate our theoretical findings on synthetic and real-world datasets.
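The abstract describes an SGD-based solver for the one-class SVM objective with strongly convex regularization. As a rough illustration of that idea (a minimal linear, single-pass sketch under assumed hyperparameter names `lam` and `nu`, not the paper's SONAR implementation), one can run projected subgradient steps on the ν-formulation of the OCSVM objective with a 1/(λt) step size, the standard schedule for strongly convex losses:

```python
import numpy as np

def sgd_ocsvm(stream, dim, lam=0.1, nu=0.5):
    """Single-pass SGD for a linear one-class SVM (nu-formulation).

    Minimizes (lam/2)||w||^2 - rho + (1/nu) * E[max(0, rho - <w, x>)].
    A toy sketch, not the paper's SONAR solver.
    """
    w = np.zeros(dim)
    rho = 0.0
    for t, x in enumerate(stream, start=1):
        eta = 1.0 / (lam * t)  # step size suited to a strongly convex objective
        active = (rho - w @ x) > 0.0  # hinge term is active for this sample
        # subgradients of the per-sample objective
        g_w = lam * w - (x / nu if active else 0.0)
        g_rho = -1.0 + (1.0 / nu if active else 0.0)
        w -= eta * g_w
        rho -= eta * g_rho
    return w, rho

def is_outlier(w, rho, x):
    # flag points whose score falls below the learned offset
    return (w @ x) < rho
```

Each sample is touched once, so the cost is linear in the stream length, in contrast to kernel OCSVM solvers that are quadratic or worse in the sample size.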
Related papers
- Refining Decision Boundaries In Anomaly Detection Using Similarity Search Within the Feature Space [3.3202103799131795]
We introduce SDA2E, a Sparse Dual Adversarial Attention-based AutoEncoder designed to learn compact and discriminative latent representations from imbalanced, high-dimensional data. We propose a similarity-guided active learning framework that integrates three novel strategies to refine decision boundaries efficiently. We evaluate SDA2E extensively across 52 imbalanced datasets, including multiple DARPA Transparent Computing scenarios, and benchmark it against 15 state-of-the-art anomaly detection methods.
arXiv Detail & Related papers (2026-02-02T23:55:08Z) - Calibratable Disambiguation Loss for Multi-Instance Partial-Label Learning [53.9713678229744]
Multi-instance partial-label learning (MIPL) is a weakly supervised framework that addresses the challenges of inexact supervision in both instance and label spaces. Existing MIPL approaches often suffer from poor calibration, undermining reliability. We propose a plug-and-play calibratable disambiguation loss (CDL) that simultaneously improves classification accuracy and calibration performance.
arXiv Detail & Related papers (2025-12-19T16:58:31Z) - Leveraging Learning Bias for Noisy Anomaly Detection [19.23861148116995]
This paper addresses the challenge of fully unsupervised image anomaly detection (FUIAD). Conventional methods assume anomaly-free training data, but real-world contamination leads models to absorb anomalies as normal. We propose a two-stage framework that exploits inherent learning bias in models.
arXiv Detail & Related papers (2025-08-10T17:47:21Z) - Advancing Reliable Test-Time Adaptation of Vision-Language Models under Visual Variations [67.35596444651037]
Vision-language models (VLMs) exhibit remarkable zero-shot capabilities but struggle with distribution shifts in downstream tasks when labeled data is unavailable. We propose a Reliable Test-time Adaptation (ReTA) method that enhances reliability from two perspectives.
arXiv Detail & Related papers (2025-07-13T05:37:33Z) - Joint-stochastic-approximation Autoencoders with Application to Semi-supervised Learning [16.625057220045292]
We present Joint-stochastic-approximation (JSA) autoencoders, a new family of algorithms for building deep directed generative models. The JSA learning algorithm directly maximizes the data log-likelihood and simultaneously minimizes the inclusive KL divergence between the posterior and the inference model. We empirically show that JSA autoencoders with a discrete latent space achieve performance comparable to other state-of-the-art DGMs with continuous latent spaces on semi-supervised tasks.
arXiv Detail & Related papers (2025-05-24T06:52:23Z) - Noise-Adaptive Conformal Classification with Marginal Coverage [53.74125453366155]
We introduce an adaptive conformal inference method capable of efficiently handling deviations from exchangeability caused by random label noise. We validate our method through extensive numerical experiments demonstrating its effectiveness on synthetic and real data sets.
arXiv Detail & Related papers (2025-01-29T23:55:23Z) - FUN-AD: Fully Unsupervised Learning for Anomaly Detection with Noisy Training Data [1.0650780147044159]
We propose a novel learning-based approach for fully unsupervised anomaly detection with unlabeled and potentially contaminated training data.
Our method is motivated by two observations: (i) the pairwise feature distances between normal samples are, on average, likely to be smaller than those between anomaly samples or heterogeneous samples, and (ii) pairs of features mutually closest to each other are likely to be homogeneous pairs.
Building on the first observation that nearest-neighbor distances can distinguish between confident normal samples and anomalies, we propose a pseudo-labeling strategy using an iteratively reconstructed memory bank.
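The two observations above suggest a simple selection rule: treat samples that form mutual nearest-neighbor pairs at short distance as confident normals. The following is a toy illustration of that rule only (function and variable names are hypothetical), not FUN-AD's actual iteratively reconstructed memory bank:

```python
import numpy as np

def mutual_nn_pseudo_labels(features):
    """Pseudo-label confident normal samples via mutual nearest neighbors.

    A sample is kept if its nearest neighbor's nearest neighbor is the
    sample itself (a mutual pair, likely homogeneous). Isolated points,
    such as anomalies far from the normal cluster, fail this test.
    """
    n = len(features)
    # full pairwise Euclidean distance matrix
    d = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)       # exclude self-matches
    nn = d.argmin(axis=1)             # index of each sample's nearest neighbor
    mutual = np.array([nn[nn[i]] == i for i in range(n)])
    return mutual                     # True -> pseudo-labeled as confident normal
```

An outlier's nearest neighbor typically sits inside the normal cluster, where it has a closer neighbor of its own, so the pair is not mutual and the outlier is excluded from the pseudo-normal set.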
arXiv Detail & Related papers (2024-11-25T05:51:38Z) - A Channel-ensemble Approach: Unbiased and Low-variance Pseudo-labels is Critical for Semi-supervised Classification [61.473485511491795]
Semi-supervised learning (SSL) is a practical challenge in computer vision.
Pseudo-label (PL) methods, e.g., FixMatch and FreeMatch, obtain the State Of The Art (SOTA) performances in SSL.
We propose a lightweight channel-based ensemble method to consolidate multiple inferior PLs into the theoretically guaranteed unbiased and low-variance one.
arXiv Detail & Related papers (2024-03-27T09:49:37Z) - Binary Classification with Confidence Difference [100.08818204756093]
This paper delves into a novel weakly supervised binary classification problem called confidence-difference (ConfDiff) classification.
We propose a risk-consistent approach to tackle this problem and show that the estimation error bound achieves the optimal convergence rate.
We also introduce a risk correction approach to mitigate overfitting problems, whose consistency and convergence rate are also proven.
arXiv Detail & Related papers (2023-10-09T11:44:50Z) - MaxMatch: Semi-Supervised Learning with Worst-Case Consistency [149.03760479533855]
We propose a worst-case consistency regularization technique for semi-supervised learning (SSL).
We present a generalization bound for SSL consisting of the empirical loss terms observed on labeled and unlabeled training data separately.
Motivated by this bound, we derive an SSL objective that minimizes the largest inconsistency between an original unlabeled sample and its multiple augmented variants.
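The objective described above penalizes only the largest disagreement between the prediction on an unlabeled sample and the predictions on its augmented variants. A minimal sketch of that worst-case loss (using KL divergence as the inconsistency measure, which is one common choice rather than necessarily MaxMatch's exact formulation):

```python
import numpy as np

def worst_case_consistency(p_orig, p_augs, eps=1e-12):
    """Worst-case consistency loss over augmented variants.

    p_orig: predicted class probabilities for the original sample, shape (C,).
    p_augs: probabilities for K augmented variants, shape (K, C).
    Returns max_k KL(p_orig || p_aug_k), the largest inconsistency,
    which the SSL objective then minimizes.
    """
    kl = (p_orig * (np.log(p_orig + eps) - np.log(p_augs + eps))).sum(axis=1)
    return kl.max()
```

Minimizing the maximum rather than the average forces the model to be consistent on its hardest augmentation, which is what ties the objective to the worst-case generalization bound.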
arXiv Detail & Related papers (2022-09-26T12:04:49Z) - Semi-Supervised Learning with Variational Bayesian Inference and Maximum Uncertainty Regularization [62.21716612888669]
We propose two generic methods for improving semi-supervised learning (SSL).
The first integrates weight perturbation (WP) into existing "consistency regularization" (CR) based methods.
The second method proposes a novel consistency loss called "maximum uncertainty regularization" (MUR).
arXiv Detail & Related papers (2020-12-03T09:49:35Z) - Variational Auto-Encoder: not all failures are equal [0.0]
We show how sharpness learning addresses the notorious VAE blurriness issue.
The paper is backed by experiments on artificial data, MNIST, and CelebA.
arXiv Detail & Related papers (2020-03-04T09:48:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.