Certified Minimax Unlearning with Generalization Rates and Deletion
Capacity
- URL: http://arxiv.org/abs/2312.10336v1
- Date: Sat, 16 Dec 2023 06:03:23 GMT
- Title: Certified Minimax Unlearning with Generalization Rates and Deletion
Capacity
- Authors: Jiaqi Liu, Jian Lou, Zhan Qin and Kui Ren
- Abstract summary: We study the problem of $(\epsilon,\delta)$-certified machine unlearning for minimax models.
We develop a new minimax unlearning step based on a total-Hessian complete Newton update.
- Score: 31.679075045401742
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We study the problem of $(\epsilon,\delta)$-certified machine unlearning for
minimax models. Most of the existing works focus on unlearning from standard
statistical learning models that have a single variable and their unlearning
steps hinge on the direct Hessian-based conventional Newton update. We develop
a new $(\epsilon,\delta)$-certified machine unlearning algorithm for minimax
models. The algorithm performs a minimax unlearning step consisting of a
total-Hessian-based complete Newton update and the Gaussian mechanism borrowed
from differential privacy. To obtain the unlearning certification, our method
injects calibrated Gaussian noise by carefully analyzing the "sensitivity" of
the minimax unlearning step (i.e., the closeness between the minimax unlearning
variables and the retraining-from-scratch variables). We derive the
generalization rates in terms of population strong and weak primal-dual risk
for three different cases of loss functions, i.e.,
(strongly-)convex-(strongly-)concave losses. We also provide the deletion
capacity to guarantee that a desired population risk can be maintained as long
as the number of deleted samples does not exceed the derived amount. With
training samples $n$ and model dimension $d$, it yields the order $\mathcal
O(n/d^{1/4})$, which shows a strict gap over the baseline method of
differentially private minimax learning that has $\mathcal O(n/d^{1/2})$. In
addition, our rates of generalization and deletion capacity match the
state-of-the-art rates derived previously for standard statistical learning
models.
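The unlearning step described above, a total-Hessian-based complete Newton update followed by Gaussian noise injection, can be sketched roughly as follows. This is a minimal illustration under assumptions, not the authors' implementation: the block structure of the Hessian, the function and variable names, and the noise scale `sigma` are all hypothetical placeholders for the quantities derived in the paper.

```python
import numpy as np

def total_hessian(D_tt, D_tp, D_pt, D_pp):
    """Total Hessian of the primal variable, accounting for the dual
    variable's best response: D_tt - D_tp @ inv(D_pp) @ D_pt."""
    return D_tt - D_tp @ np.linalg.solve(D_pp, D_pt)

def minimax_unlearn_step(theta, grad_deleted, D_tt, D_tp, D_pt, D_pp,
                         sigma, rng=None):
    """One (epsilon, delta)-certified minimax unlearning step (sketch).

    theta        : trained primal variable
    grad_deleted : gradient contribution of the deleted samples
    sigma        : Gaussian noise scale, calibrated to the "sensitivity"
                   of the update (closeness to retraining from scratch)
    """
    rng = rng or np.random.default_rng()
    H = total_hessian(D_tt, D_tp, D_pt, D_pp)
    # Complete Newton update removing the deleted samples' influence.
    theta_bar = theta + np.linalg.solve(H, grad_deleted)
    # Gaussian mechanism from differential privacy masks the residual
    # gap to the retraining-from-scratch solution.
    return theta_bar + rng.normal(0.0, sigma, size=theta.shape)
```

With `sigma = 0` this reduces to a plain complete Newton correction; the $(\epsilon,\delta)$ certification comes from choosing `sigma` according to the sensitivity analysis in the paper, and an analogous update (with the roles of the Hessian blocks swapped) would apply to the dual variable.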
Related papers
- Scaling Laws in Linear Regression: Compute, Parameters, and Data [86.48154162485712]
We study the theory of scaling laws in an infinite dimensional linear regression setup.
We show that the reducible part of the test error is $\Theta(M^{-(a-1)} + N^{-(a-1)/a})$.
Our theory is consistent with the empirical neural scaling laws and verified by numerical simulation.
arXiv Detail & Related papers (2024-06-12T17:53:29Z) - Robust Empirical Risk Minimization with Tolerance [24.434720137937756]
We study the fundamental paradigm of (robust) $\textit{empirical risk minimization}$ (RERM).
We show that a natural tolerant variant of RERM is indeed sufficient for $\gamma$-tolerant robust learning of VC classes over $\mathbb{R}^d$.
arXiv Detail & Related papers (2022-10-02T21:26:15Z) - KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal [70.15267479220691]
We consider and analyze the sample complexity of model-free reinforcement learning with a generative model.
Our analysis shows that it is nearly minimax-optimal for finding an $\varepsilon$-optimal policy when $\varepsilon$ is sufficiently small.
arXiv Detail & Related papers (2022-05-27T19:39:24Z) - Minimax Optimal Quantization of Linear Models: Information-Theoretic
Limits and Efficient Algorithms [59.724977092582535]
We consider the problem of quantizing a linear model learned from measurements.
We derive an information-theoretic lower bound for the minimax risk under this setting.
We show that our method and upper-bounds can be extended for two-layer ReLU neural networks.
arXiv Detail & Related papers (2022-02-23T02:39:04Z) - X-model: Improving Data Efficiency in Deep Learning with A Minimax Model [78.55482897452417]
We aim at improving data efficiency for both classification and regression setups in deep learning.
To take the power of both worlds, we propose a novel X-model.
X-model plays a minimax game between the feature extractor and task-specific heads.
arXiv Detail & Related papers (2021-10-09T13:56:48Z) - Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss.
We examine how these benign overfitting phenomena occur in a two-layer neural network setting.
We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z) - Autocalibration and Tweedie-dominance for Insurance Pricing with Machine
Learning [0.0]
It is shown that minimizing deviance involves a trade-off between the integral of weighted differences of lower partial moments and the bias measured on a specific scale.
This new method to correct for bias adds extra local GLM step to the analysis.
The convex order appears to be the natural tool to compare competing models.
arXiv Detail & Related papers (2021-03-05T12:40:30Z) - Remember What You Want to Forget: Algorithms for Machine Unlearning [37.656345901739506]
We study the problem of forgetting datapoints from a learnt model.
We provide an unlearning algorithm that can delete up to $O(n/d^{1/4})$ samples.
arXiv Detail & Related papers (2021-03-04T19:28:57Z) - Learning to extrapolate using continued fractions: Predicting the
critical temperature of superconductor materials [5.905364646955811]
In the field of Artificial Intelligence (AI) and Machine Learning (ML), the approximation of unknown target functions $y=f(\mathbf{x})$ is a common objective.
We refer to $S$ as the training set and aim to identify a low-complexity mathematical model that can effectively approximate this target function for new instances $\mathbf{x}$.
arXiv Detail & Related papers (2020-11-27T04:57:40Z) - Sharp Statistical Guarantees for Adversarially Robust Gaussian
Classification [54.22421582955454]
We provide the first result of the optimal minimax guarantees for the excess risk for adversarially robust classification.
Results are stated in terms of the Adversarial Signal-to-Noise Ratio (AdvSNR), which generalizes a similar notion for standard linear classification to the adversarial setting.
arXiv Detail & Related papers (2020-06-29T21:06:52Z) - Statistical-Query Lower Bounds via Functional Gradients [19.5924910463796]
We show that any statistical-query algorithm with tolerance $n^{-(1/\epsilon)^b}$ must use at least $2^{n^c \epsilon}$ queries for some constant $b$.
Our results rule out general (as opposed to correlational) SQ learning algorithms, which is unusual for real-valued learning problems.
arXiv Detail & Related papers (2020-06-29T05:15:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.