Online Robust and Adaptive Learning from Data Streams
- URL: http://arxiv.org/abs/2007.12160v2
- Date: Mon, 27 Sep 2021 18:36:16 GMT
- Title: Online Robust and Adaptive Learning from Data Streams
- Authors: Shintaro Fukushima and Atsushi Nitanda and Kenji Yamanishi
- Abstract summary: In online learning, it is necessary to learn robustly against outliers and to adapt quickly to changes in the underlying data-generating mechanism.
In this paper, we refer to the former attribute of online learning algorithms as robustness and to the latter as adaptivity.
We propose a novel stochastic approximation-based robustness-adaptivity algorithm (SRA) to evaluate the tradeoff.
- Score: 22.319483572757097
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In online learning from non-stationary data streams, it is necessary to learn
robustly against outliers and to adapt quickly to changes in the underlying
data-generating mechanism. In this paper, we refer to the former attribute of online
learning algorithms as robustness and to the latter as adaptivity. There is an
obvious tradeoff between the two attributes. It is a fundamental issue to
quantify and evaluate the tradeoff because it provides important information on
the data generating mechanism. However, no previous work has considered the
tradeoff quantitatively. We propose a novel algorithm called the stochastic
approximation-based robustness-adaptivity algorithm (SRA) to evaluate the
tradeoff. The key idea of SRA is to update parameters of distribution or
sufficient statistics with the biased stochastic approximation scheme, while
dropping data points with large values of the stochastic update. We address the
relation between the two parameters: one is the step size of the stochastic
approximation, and the other is the threshold parameter of the norm of the
stochastic update. The former controls the adaptivity and the latter controls the
robustness. We give a theoretical analysis for the non-asymptotic convergence
of SRA in the presence of outliers, which depends on both the step size and
threshold parameter. Because SRA is built on the majorization-minimization
principle, it is a general algorithm that subsumes many existing algorithms, such as the
online EM algorithm and stochastic gradient descent. Experiments on both
synthetic and real datasets demonstrated that SRA was superior to previous
methods.
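For intuition, the following is a minimal sketch of the SRA idea in a stochastic-gradient instantiation; the function names, the squared-error example, and the specific parameter values are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def sra_stream(stream, update, theta0, step_size=0.05, threshold=10.0):
    """Sketch: a stochastic-approximation update that drops suspected outliers.

    step_size  controls adaptivity (larger => faster tracking of changes),
    threshold  controls robustness (points with larger update norm are dropped).
    """
    theta = np.asarray(theta0, dtype=float)
    for x in stream:
        h = update(theta, x)                  # stochastic update for this point
        if np.linalg.norm(h) > threshold:     # large update => treat x as an outlier
            continue                          # drop the point (robustness)
        theta = theta - step_size * h         # otherwise apply it (adaptivity)
    return theta

# Toy example: online estimation of a mean under a regime change plus outliers,
# with the stochastic update taken as the gradient of the squared error.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0.0, 1.0, 500),   # initial regime
                       rng.normal(5.0, 1.0, 500)])  # after a change point
data[::97] += 30.0                                  # injected outliers
print(sra_stream(data, lambda th, x: th - x, theta0=0.0))
```

Increasing step_size lets the estimate track the change point faster but makes it more sensitive to the points that survive the threshold, while decreasing threshold discards more points; this is the robustness-adaptivity tradeoff the paper quantifies.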
Related papers
- Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
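The accelerated scheme itself is not reproduced here; the sketch below only shows the standard two-point zeroth-order gradient estimate that gradient-free methods of this kind build on (step size, smoothing radius, and the quadratic test function are illustrative assumptions):

```python
import numpy as np

def zeroth_order_sgd(f, x0, steps=2000, lr=0.05, h=1e-3, seed=0):
    """Gradient-free SGD: estimate the gradient from two function evaluations
    along a random unit direction (two-point estimator), then take a step."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    d = x.size
    for _ in range(steps):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)                            # random unit direction
        g = d * (f(x + h * u) - f(x - h * u)) / (2 * h) * u
        x = x - lr * g
    return x

# Minimize a simple convex quadratic using only function values.
print(zeroth_order_sgd(lambda x: np.sum((x - 1.0) ** 2), np.zeros(5)))
```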
arXiv Detail & Related papers (2024-11-21T10:26:17Z)
- Online Estimation with Rolling Validation: Adaptive Nonparametric Estimation with Streaming Data [13.069717985067937]
We propose a weighted rolling-validation procedure, an online variant of leave-one-out cross-validation.
Similar to batch cross-validation, it can boost base estimators to achieve a better, adaptive convergence rate.
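As a rough illustration of the rolling-validation idea (not the paper's weighting scheme): score each candidate online estimator on every incoming point before letting it learn from that point, keep a decayed running error, and select the candidate with the smallest error. All class names and constants below are assumptions:

```python
import numpy as np

class OnlineSGDRegressor:
    """Tiny online linear model; its learning rate is the tuning parameter
    we want to select among."""
    def __init__(self, dim, lr):
        self.w, self.lr = np.zeros(dim), lr
    def predict(self, x):
        return self.w @ x
    def update(self, x, y):
        self.w += self.lr * (y - self.w @ x) * x

def rolling_validation(stream, estimators, decay=0.99):
    """Score every candidate on the incoming point *before* updating it
    (a rolling out-of-sample error), with exponential down-weighting of
    old errors; 'decay' is an illustrative choice."""
    scores = np.zeros(len(estimators))
    for x, y in stream:
        for i, est in enumerate(estimators):
            scores[i] = decay * scores[i] + (est.predict(x) - y) ** 2
            est.update(x, y)
    return int(np.argmin(scores))

rng = np.random.default_rng(1)
w_true = np.array([1.0, -2.0, 0.5])
stream = [(x, x @ w_true + 0.1 * rng.standard_normal())
          for x in rng.standard_normal((2000, 3))]
candidates = [OnlineSGDRegressor(3, lr) for lr in (0.001, 0.01, 0.1)]
print("selected candidate:", rolling_validation(stream, candidates))
```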
arXiv Detail & Related papers (2023-10-18T17:52:57Z)
- Consensus-Adaptive RANSAC [104.87576373187426]
We propose a new RANSAC framework that learns to explore the parameter space by considering the residuals seen so far via a novel attention layer.
The attention mechanism operates on a batch of point-to-model residuals, and updates a per-point estimation state to take into account the consensus found through a lightweight one-step transformer.
arXiv Detail & Related papers (2023-07-26T08:25:46Z)
- Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression [53.15502562048627]
Recent work has built the connection between self-supervised learning and the approximation of the top eigenspace of a graph Laplacian operator.
This work delves into a statistical analysis of augmentation-based pretraining.
arXiv Detail & Related papers (2023-06-01T15:18:55Z)
- Online Learning Under A Separable Stochastic Approximation Framework [20.26530917721778]
We propose an online learning algorithm for a class of machine learning models under a separable stochastic approximation framework.
We show that the proposed algorithm produces more robust and better test performance than other popular learning algorithms.
arXiv Detail & Related papers (2023-05-12T13:53:03Z)
- One-Pass Learning via Bridging Orthogonal Gradient Descent and Recursive Least-Squares [8.443742714362521]
We develop an algorithm for one-pass learning which seeks to perfectly fit every new datapoint while changing the parameters in a direction that causes the least change to the predictions on previous datapoints.
Our algorithm uses memory efficiently by exploiting the structure of the streaming data via incremental principal component analysis (IPCA).
Our experiments show the effectiveness of the proposed method compared to the baselines.
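The IPCA-based memory scheme is not reproduced here; as background, the following is only a standard recursive least-squares (RLS) recursion of the kind the method bridges to, shown as a hedged sketch:

```python
import numpy as np

def recursive_least_squares(stream, dim, lam=1e-3):
    """One-pass RLS: after each (x, y), w minimizes the ridge-regularized
    squared error over all data seen so far, maintained with a rank-one
    (Sherman-Morrison) update instead of refitting from scratch."""
    w = np.zeros(dim)
    P = np.eye(dim) / lam            # inverse of the regularized Gram matrix
    for x, y in stream:
        Px = P @ x
        k = Px / (1.0 + x @ Px)      # gain vector
        w = w + k * (y - w @ x)      # correct the prediction on the new point
        P = P - np.outer(k, Px)      # rank-one downdate of the inverse
    return w

rng = np.random.default_rng(2)
w_true = np.array([0.5, -1.0, 2.0, 0.0])
data = [(x, x @ w_true) for x in rng.standard_normal((500, 4))]
print(recursive_least_squares(data, dim=4))
```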
arXiv Detail & Related papers (2022-07-28T02:01:31Z)
- Simple Stochastic and Online Gradient Descent Algorithms for Pairwise Learning [65.54757265434465]
Pairwise learning refers to learning tasks where the loss function depends on a pair of instances.
Online gradient descent (OGD) is a popular approach to handle streaming data in pairwise learning.
In this paper, we propose simple stochastic and online gradient descent methods for pairwise learning.
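For concreteness, a minimal online gradient descent loop for a pairwise objective might look like the sketch below; the pairwise hinge loss and the pairing scheme are illustrative choices, not the paper's exact formulation:

```python
import numpy as np

def pairwise_ogd(pairs, dim, lr=0.01):
    """Online gradient descent on a pairwise hinge loss: each example is a
    pair (x_pos, x_neg) and the loss asks that x_pos be scored above x_neg
    by a margin of 1."""
    w = np.zeros(dim)
    for x_pos, x_neg in pairs:
        if w @ (x_pos - x_neg) < 1.0:      # hinge active: take a gradient step
            w += lr * (x_pos - x_neg)
    return w

rng = np.random.default_rng(3)
pos = rng.standard_normal((1000, 5)) + 1.0
neg = rng.standard_normal((1000, 5)) - 1.0
w = pairwise_ogd(zip(pos, neg), dim=5)
print("correctly ranked pairs:", (pos @ w > neg @ w).mean())
```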
arXiv Detail & Related papers (2021-11-23T18:10:48Z)
- Fast and Robust Online Inference with Stochastic Gradient Descent via Random Scaling [0.9806910643086042]
We develop a new method of online inference for a vector of parameters estimated by the Polyak-Ruppert averaging procedure of stochastic gradient descent algorithms.
Our approach is fully operational with online data and is rigorously underpinned by a functional central limit theorem.
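The random-scaling inference statistic is not reproduced here; the sketch below shows only the Polyak-Ruppert averaging of SGD iterates that the procedure is built around (the loss and step-size schedule are illustrative assumptions):

```python
import numpy as np

def averaged_sgd(stream, dim, lr0=0.2, power=0.6):
    """SGD with Polyak-Ruppert averaging: run SGD with a slowly decaying step
    size and report the running average of the iterates."""
    w = np.zeros(dim)
    w_bar = np.zeros(dim)
    for t, (x, y) in enumerate(stream, start=1):
        lr = lr0 / t ** power              # step size ~ t^{-power}, power in (1/2, 1)
        w = w - lr * (w @ x - y) * x       # SGD step on a squared loss
        w_bar += (w - w_bar) / t           # running average of the iterates
    return w_bar

rng = np.random.default_rng(4)
w_true = np.array([1.0, -1.0])
stream = [(x, x @ w_true + 0.1 * rng.standard_normal())
          for x in rng.standard_normal((5000, 2))]
print(averaged_sgd(stream, dim=2))
```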
arXiv Detail & Related papers (2021-06-06T15:38:37Z)
- A bandit-learning approach to multifidelity approximation [7.960229223744695]
Multifidelity approximation is an important technique in scientific computation and simulation.
We introduce a bandit-learning approach for leveraging data of varying fidelities to achieve precise estimates.
arXiv Detail & Related papers (2021-03-29T05:29:35Z)
- Coded Stochastic ADMM for Decentralized Consensus Optimization with Edge Computing [113.52575069030192]
Big data, including applications with high security requirements, are often collected and stored on multiple heterogeneous devices, such as mobile devices, drones and vehicles.
Due to the limitations of communication costs and security requirements, it is of paramount importance to extract information in a decentralized manner instead of aggregating data to a fusion center.
We consider the problem of learning model parameters in a multi-agent system with data locally processed via distributed edge nodes.
A class of mini-batch alternating direction method of multipliers (ADMM) algorithms is explored to develop the distributed learning model.
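The coded, mini-batch variant is not reproduced here; as background, a plain consensus-ADMM loop for decentralized least squares looks roughly like this sketch (the closed-form local solve and all constants are illustrative assumptions):

```python
import numpy as np

def consensus_admm(local_data, dim, rho=1.0, iters=100):
    """Decentralized consensus ADMM: each agent solves a small local
    least-squares problem, then all agents agree on a shared variable z."""
    n = len(local_data)
    x = np.zeros((n, dim))               # local primal variables
    u = np.zeros((n, dim))               # scaled dual variables
    z = np.zeros(dim)                    # consensus variable
    for _ in range(iters):
        for i, (A, b) in enumerate(local_data):
            # local solve: argmin_x ||A x - b||^2 + (rho/2) ||x - z + u_i||^2
            lhs = 2 * A.T @ A + rho * np.eye(dim)
            rhs = 2 * A.T @ b + rho * (z - u[i])
            x[i] = np.linalg.solve(lhs, rhs)
        z = (x + u).mean(axis=0)         # consensus (averaging) step
        u += x - z                       # dual update
    return z

rng = np.random.default_rng(5)
w_true = np.array([1.0, 2.0, -1.0])
agents = []
for _ in range(4):                       # four edge nodes with local data
    A = rng.standard_normal((50, 3))
    agents.append((A, A @ w_true))
print(consensus_admm(agents, dim=3))
```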
arXiv Detail & Related papers (2020-10-02T10:41:59Z)
- Least Squares Regression with Markovian Data: Fundamental Limits and Algorithms [69.45237691598774]
We study the problem of least squares linear regression where the data-points are dependent and are sampled from a Markov chain.
We establish sharp information-theoretic minimax lower bounds for this problem in terms of $\tau_{\mathsf{mix}}$, the mixing time of the chain.
We propose an algorithm based on experience replay (a popular reinforcement learning technique) that achieves a significantly better error rate.
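As a hedged illustration of the experience-replay idea (the buffer size, number of replay steps, and the AR(1) covariate chain below are assumptions, not the paper's setting): sampling updates from a buffer of past observations breaks up the temporal correlation of consecutive Markovian samples.

```python
import numpy as np
from collections import deque

def sgd_with_replay(stream, dim, lr=0.05, buffer_size=500, replays=4, seed=0):
    """Least-squares SGD on a Markovian stream: instead of stepping on
    consecutive (correlated) samples, push them into a buffer and step on
    uniformly drawn past samples."""
    rng = np.random.default_rng(seed)
    buf = deque(maxlen=buffer_size)
    w = np.zeros(dim)
    for x, y in stream:
        buf.append((x, y))
        for _ in range(replays):
            xi, yi = buf[rng.integers(len(buf))]
            w -= lr * (w @ xi - yi) * xi
    return w

# Illustrative Markovian covariates: an AR(1) chain, so consecutive x_t
# are strongly correlated.
rng = np.random.default_rng(6)
w_true = np.array([2.0, -1.0])
x, stream = np.zeros(2), []
for _ in range(5000):
    x = 0.9 * x + 0.1 * rng.standard_normal(2)
    stream.append((x.copy(), x @ w_true + 0.01 * rng.standard_normal()))
print(sgd_with_replay(stream, dim=2))
```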
arXiv Detail & Related papers (2020-06-16T04:26:50Z)