Online Forgetting Process for Linear Regression Models
- URL: http://arxiv.org/abs/2012.01668v1
- Date: Thu, 3 Dec 2020 02:57:53 GMT
- Title: Online Forgetting Process for Linear Regression Models
- Authors: Yuantong Li, Chi-hua Wang, Guang Cheng
- Abstract summary: Motivated by the EU's "Right To Be Forgotten" regulation, we initiate a study of statistical data deletion problems.
We propose a deletion-aware algorithm \texttt{FIFD-OLS} for the low-dimensional case, and witness a catastrophic rank swinging phenomenon.
As a remedy, we propose the \texttt{FIFD-Adaptive Ridge} algorithm with a novel online regularization scheme.
- Score: 18.336825781223034
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Motivated by the EU's "Right To Be Forgotten" regulation, we initiate a study
of statistical data deletion problems where users' data are accessible only for
a limited period of time. This setting is formulated as an online supervised
learning task with \textit{constant memory limit}. We propose a deletion-aware
algorithm \texttt{FIFD-OLS} for the low dimensional case, and witness a
catastrophic rank swinging phenomenon due to the data deletion operation, which
leads to statistical inefficiency. As a remedy, we propose the
\texttt{FIFD-Adaptive Ridge} algorithm with a novel online regularization
scheme that effectively offsets the uncertainty from deletion. In theory, we
provide cumulative regret upper bounds for both online forgetting
algorithms. In experiments, we show that \texttt{FIFD-Adaptive Ridge}
outperforms ridge regression with a fixed regularization level, and we hope
these results shed some light on online forgetting in more complex
statistical models.
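To make the deletion setting concrete, here is a minimal sketch of a first-in-first-out (FIFO) forgetting ridge estimator, assuming a fixed memory window and estimation from running sufficient statistics. The class name, the window parameter, and the rank-based rule for choosing the regularization level are illustrative stand-ins, not the paper's exact \texttt{FIFD-Adaptive Ridge} scheme.

```python
import numpy as np

class FIFOForgettingRidge:
    """Online ridge regression under a constant memory limit: only the
    most recent `window` observations are retained; on each arrival the
    oldest observation is deleted first-in-first-out (FIFO)."""

    def __init__(self, dim, window, lam0=1.0):
        self.window = window              # constant memory limit
        self.buffer = []                  # FIFO buffer of (x, y) pairs
        self.XtX = np.zeros((dim, dim))   # Gram matrix of retained data
        self.Xty = np.zeros(dim)
        self.lam0 = lam0

    def update(self, x, y):
        # add the newest observation to the sufficient statistics
        self.buffer.append((x, y))
        self.XtX += np.outer(x, x)
        self.Xty += y * x
        # FIFO deletion: downdate once the oldest point leaves the window
        if len(self.buffer) > self.window:
            x_old, y_old = self.buffer.pop(0)
            self.XtX -= np.outer(x_old, x_old)
            self.Xty -= y_old * x_old

    def estimate(self):
        d = self.Xty.shape[0]
        # Hypothetical adaptive rule: inflate the ridge level whenever
        # deletion leaves the retained Gram matrix rank-deficient.
        lam = self.lam0 * max(1.0, d - np.linalg.matrix_rank(self.XtX))
        return np.linalg.solve(self.XtX + lam * np.eye(d), self.Xty)

# Toy stream: 3 features, memory window of 20 points
rng = np.random.default_rng(0)
model = FIFOForgettingRidge(dim=3, window=20)
theta_true = np.array([1.0, -2.0, 0.5])
for _ in range(200):
    x = rng.normal(size=3)
    model.update(x, x @ theta_true + 0.1 * rng.normal())
print(model.estimate())  # approximately theta_true
```

Replacing the adaptive `lam` with zero recovers a sliding-window OLS estimator (the \texttt{FIFD-OLS} analogue), which is exactly where the rank swinging phenomenon makes the normal-equations solve ill posed.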
Related papers
- Reshaping the Online Data Buffering and Organizing Mechanism for Continual Test-Time Adaptation [49.53202761595912]
Continual Test-Time Adaptation involves adapting a pre-trained source model to continually changing unsupervised target domains.
We analyze the challenges of this task: online environment, unsupervised nature, and the risks of error accumulation and catastrophic forgetting.
We propose an uncertainty-aware buffering approach to identify and aggregate significant samples with high certainty from the unsupervised, single-pass data stream.
arXiv Detail & Related papers (2024-07-12T15:48:40Z)
- Adaptive debiased SGD in high-dimensional GLMs with streaming data [4.704144189806667]
We introduce a novel approach to online inference in high-dimensional generalized linear models.
Our method operates in a single-pass mode, significantly reducing both time and space complexity.
We demonstrate that our method, termed the Approximated Debiased Lasso (ADL), not only mitigates the need for the bounded individual probability condition but also significantly improves numerical performance.
arXiv Detail & Related papers (2024-05-28T15:36:48Z)
- Online Tensor Inference [0.0]
Traditional offline learning, involving the storage and utilization of all data in each computational iteration, becomes impractical for high-dimensional tensor data.
Existing low-rank tensor methods lack the capability for statistical inference in an online fashion.
Our approach employs stochastic gradient descent (SGD) to enable efficient real-time data processing without extensive memory requirements.
arXiv Detail & Related papers (2023-12-28T16:37:48Z)
- Iteratively Refined Behavior Regularization for Offline Reinforcement Learning [57.10922880400715]
In this paper, we propose a new algorithm that substantially enhances behavior-regularization based on conservative policy iteration.
By iteratively refining the reference policy used for behavior regularization, the conservative policy update guarantees gradual improvement.
Experimental results on the D4RL benchmark indicate that our method outperforms previous state-of-the-art baselines in most tasks.
arXiv Detail & Related papers (2023-06-09T07:46:24Z)
- Streaming Sparse Linear Regression [1.8707139489039097]
We propose a novel online sparse linear regression framework for analyzing streaming data when data points arrive sequentially.
Our proposed method is memory efficient and requires less stringent restricted strong convexity assumptions.
arXiv Detail & Related papers (2022-11-11T07:31:55Z)
- A Regression Approach to Learning-Augmented Online Algorithms [17.803569868141647]
We introduce this approach and explore it in the context of a general online search framework.
We show nearly tight bounds on the sample complexity of this regression problem, and extend our results to the agnostic setting.
From a technical standpoint, we show that the key is to incorporate online optimization benchmarks in the design of the loss function for the regression problem.
arXiv Detail & Related papers (2022-05-18T04:29:14Z)
- Implicit Parameter-free Online Learning with Truncated Linear Models [51.71216912089413]
Parameter-free algorithms are online learning algorithms that do not require setting learning rates.
We propose new parameter-free algorithms that can take advantage of truncated linear models through a new update that has an "implicit" flavor.
Based on a novel decomposition of the regret, the new update is efficient, requires only one gradient at each step, never overshoots the minimum of the truncated model, and retains the favorable parameter-free properties.
arXiv Detail & Related papers (2022-03-19T13:39:49Z)
- Stochastic Online Linear Regression: the Forward Algorithm to Replace Ridge [24.880035784304834]
We derive high-probability regret bounds for online ridge regression and the forward algorithm.
This enables a more accurate comparison of online regression algorithms and removes boundedness assumptions on observations and predictions; a minimal sketch of the forward update appears after this list.
arXiv Detail & Related papers (2021-11-02T13:57:53Z)
- SreaMRAK a Streaming Multi-Resolution Adaptive Kernel Algorithm [60.61943386819384]
Existing implementations of kernel ridge regression (KRR) require that all the data be stored in main memory.
We propose StreaMRAK - a streaming version of KRR.
We present a showcase study on two synthetic problems and the prediction of the trajectory of a double pendulum.
arXiv Detail & Related papers (2021-08-23T21:03:09Z)
- Fast OSCAR and OWL Regression via Safe Screening Rules [97.28167655721766]
Ordered Weighted $L_1$ (OWL) regularized regression is a new regression analysis for high-dimensional sparse learning.
Proximal gradient methods are used as standard approaches to solve OWL regression.
We propose the first safe screening rule for OWL regression by exploring the order of the primal solution with the unknown order structure.
arXiv Detail & Related papers (2020-06-29T23:35:53Z)
- Least Squares Regression with Markovian Data: Fundamental Limits and Algorithms [69.45237691598774]
We study the problem of least squares linear regression where the data-points are dependent and are sampled from a Markov chain.
We establish sharp information-theoretic minimax lower bounds for this problem in terms of the mixing time $\tau_{\mathsf{mix}}$.
We propose an algorithm based on experience replay (a popular reinforcement learning technique) that achieves a significantly better error rate.
arXiv Detail & Related papers (2020-06-16T04:26:50Z)
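As referenced in the "Stochastic Online Linear Regression: the Forward Algorithm to Replace Ridge" entry above, the sketch below contrasts plain online ridge with the forward algorithm: both maintain the same sufficient statistics, but the forward algorithm folds the current feature vector into the regularized Gram matrix before predicting. The function name and data layout are illustrative assumptions, not that paper's code.

```python
import numpy as np

def online_regression(X, y, lam=1.0, forward=True):
    """Sketch: predict each y_t online, then reveal it and update.

    Plain ridge predicts with A_t = lam*I + sum_{s<t} x_s x_s^T; the
    forward algorithm additionally folds the current feature x_t into
    A_t before predicting, which is the source of its tighter regret
    guarantees.
    """
    n, d = X.shape
    A = lam * np.eye(d)   # regularized Gram matrix of past features
    b = np.zeros(d)       # sum of y_s * x_s over past rounds
    preds = np.empty(n)
    for t in range(n):
        x = X[t]
        if forward:
            # forward algorithm: include x_t x_t^T before predicting
            preds[t] = x @ np.linalg.solve(A + np.outer(x, x), b)
        else:
            # plain online ridge: predict from past data only
            preds[t] = x @ np.linalg.solve(A, b)
        # reveal y_t and update the sufficient statistics
        A += np.outer(x, x)
        b += y[t] * x
    return preds
```

At round $t$ the forward predictor inverts $\lambda I + \sum_{s \le t} x_s x_s^\top$ while ridge uses only the past terms $s < t$; this single rank-one term is the only difference between the two update rules.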
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.