"Adversarial Examples" for Proof-of-Learning
- URL: http://arxiv.org/abs/2108.09454v1
- Date: Sat, 21 Aug 2021 07:56:29 GMT
- Title: "Adversarial Examples" for Proof-of-Learning
- Authors: Rui Zhang, Jian Liu, Yuan Ding, Qingbiao Wu, and Kui Ren
- Abstract summary: Jia et al. proposed a new concept/mechanism named proof-of-learning (PoL)
PoL allows a prover to demonstrate ownership of a machine learning model by proving integrity of the training procedure.
We show that PoL is vulnerable to "adversarial examples".
- Score: 32.438181794551035
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In S&P '21, Jia et al. proposed a new concept/mechanism named
proof-of-learning (PoL), which allows a prover to demonstrate ownership of a
machine learning model by proving integrity of the training procedure. It
guarantees that an adversary cannot construct a valid proof at a lower cost (in
both computation and storage) than the prover incurred in generating the
proof. A PoL proof includes a set of intermediate models recorded during
training, together with the corresponding data points used to obtain each
recorded model. Jia et al. claimed that an adversary who knows only the final
model and the training dataset cannot efficiently find a set of intermediate models
with correct data points. In this paper, however, we show that PoL is
vulnerable to "adversarial examples"! Specifically, in much the same way that an
adversarial example is optimized, we can make an arbitrarily chosen data point
"generate" a given model, and hence efficiently generate intermediate models with
correct data points. We demonstrate, both theoretically and empirically, that we
can produce a valid proof at significantly lower cost than the prover incurs in
generating one, thereby successfully breaking PoL.
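The attack idea can be sketched concretely. The snippet below is a minimal illustration under stated assumptions, not the authors' code: `model_prev` is a differentiable model loaded with a chosen intermediate checkpoint, `target_params` holds the parameters of the model to be "generated" (e.g., the next recorded checkpoint), and `loss_fn`, `y`, `lr`, and the input shape are placeholders. It optimizes an input x, much like an adversarial example, so that a single SGD step from the checkpoint approximately reproduces the target parameters.

```python
import torch

def craft_data_point(model_prev, target_params, loss_fn, y,
                     lr=0.01, steps=500, outer_lr=0.1, x_shape=(1, 3, 32, 32)):
    """Sketch: find x such that W_prev - lr * grad_W L(W_prev; x, y) ~= W_target.
    All arguments are illustrative assumptions, not the paper's interface."""
    params = list(model_prev.parameters())
    x = torch.randn(x_shape, requires_grad=True)
    opt = torch.optim.Adam([x], lr=outer_lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(model_prev(x), y)
        # Parameter gradients kept differentiable w.r.t. x (create_graph=True),
        # so we can backpropagate through the simulated training step.
        grads = torch.autograd.grad(loss, params, create_graph=True)
        # Distance between the simulated one-step update and the target model.
        mismatch = sum(((p - lr * g) - t).pow(2).sum()
                       for p, g, t in zip(params, grads, target_params))
        mismatch.backward()
        opt.step()
    return x.detach()
```

Repeating such an optimization for each recorded checkpoint yields (intermediate model, data point) pairs that resemble a legitimate training trajectory, which is how the abstract's spoofed proof can be produced at far lower cost than honest training.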
Related papers
- Membership Inference Attacks Cannot Prove that a Model Was Trained On Your Data [27.18781946018255]
Training data proofs play a key role in recent lawsuits against foundation models trained on web-scale data.
Many prior works suggest instantiating training data proofs using membership inference attacks.
We show that data extraction attacks and membership inference on special canary data can be used to create sound training data proofs.
arXiv Detail & Related papers (2024-09-29T21:49:32Z)
- Lean-STaR: Learning to Interleave Thinking and Proving [53.923617816215774]
We present Lean-STaR, a framework for training language models to produce informal thoughts prior to each step of a proof.
Lean-STaR achieves state-of-the-art results on the miniF2F-test benchmark within the Lean theorem proving environment.
arXiv Detail & Related papers (2024-07-14T01:43:07Z)
- Learn from Failure: Fine-Tuning LLMs with Trial-and-Error Data for Intuitionistic Propositional Logic Proving [41.23045212775232]
We demonstrate the benefit of training models that additionally learn from failed search paths.
Facing the lack of such trial-and-error data in existing open-source theorem-proving datasets, we curate a dataset on intuitionistic propositional logic theorems.
We compare our model trained on relatively short trial-and-error information (TrialMaster) with models trained only on the correct paths, and find that the former solves more unseen theorems while requiring fewer search trials.
arXiv Detail & Related papers (2024-04-10T23:01:45Z)
- Can Membership Inferencing be Refuted? [31.31060116447964]
We study the reliability of membership inference attacks in practice.
We show that a model owner can plausibly refute the result of a membership inference test on a data point $x$ by constructing a proof of repudiation.
Our results call for a re-evaluation of the implications of membership inference attacks in practice.
arXiv Detail & Related papers (2023-03-07T04:36:35Z)
- Consistent Diffusion Models: Mitigating Sampling Drift by Learning to be Consistent [97.64313409741614]
We propose to enforce a consistency property which states that predictions of the model on its own generated data are consistent across time.
We show that our novel training objective yields state-of-the-art results for conditional and unconditional generation in CIFAR-10 and baseline improvements in AFHQ and FFHQ.
arXiv Detail & Related papers (2023-02-17T18:45:04Z)
- Proof-of-Learning is Currently More Broken Than You Think [41.3211535926634]
We introduce the first spoofing strategies that can be reproduced across different configurations of the Proof-of-Learning (PoL) verification.
We identify key vulnerabilities of PoL and systematically analyze the underlying assumptions needed for robust verification of a proof.
We conclude that one cannot develop a provably robust PoL verification mechanism without further understanding of optimization in deep learning.
arXiv Detail & Related papers (2022-08-06T19:07:07Z)
- Generating Natural Language Proofs with Verifier-Guided Search [74.9614610172561]
We present NLProofS (Natural Language Proof Search), a novel stepwise method.
NLProofS learns to generate relevant steps conditioned on the hypothesis.
It achieves state-of-the-art performance on EntailmentBank and RuleTaker.
arXiv Detail & Related papers (2022-05-25T02:22:30Z)
- A Multi-Level Attention Model for Evidence-Based Fact Checking [58.95413968110558]
We present a simple model that can be trained on sequence structures.
Results on a large-scale dataset for Fact Extraction and VERification show that our model outperforms the graph-based approaches.
arXiv Detail & Related papers (2021-06-02T05:40:12Z)
- Proof-of-Learning: Definitions and Practice [15.585184189361486]
Training machine learning (ML) models typically involves expensive iterative optimization.
There is currently no mechanism for the entity that trained a model to prove that its parameters were indeed the result of this optimization procedure.
This paper introduces the concept of proof-of-learning in ML.
arXiv Detail & Related papers (2021-03-09T18:59:54Z)
- Variational Bayesian Unlearning [54.26984662139516]
We study the problem of approximately unlearning a Bayesian model from a small subset of the training data to be erased.
We show that it is equivalent to minimizing an evidence upper bound which trades off fully unlearning the erased data against not entirely forgetting the posterior belief.
In model training with VI, only an approximate (instead of exact) posterior belief given the full data can be obtained, which makes unlearning even more challenging.
arXiv Detail & Related papers (2020-10-24T11:53:00Z)
- Explaining and Improving Model Behavior with k Nearest Neighbor Representations [107.24850861390196]
We propose using k nearest neighbor representations to identify training examples responsible for a model's predictions.
We show that kNN representations are effective at uncovering learned spurious associations.
Our results indicate that the kNN approach makes the finetuned model more robust to adversarial inputs.
arXiv Detail & Related papers (2020-10-18T16:55:25Z)
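As a concrete illustration of the last entry, here is a minimal sketch (all names are illustrative assumptions, not the paper's code) of explaining a prediction by retrieving the k nearest training examples in a model's representation space:

```python
import numpy as np

def knn_explain(encode, train_inputs, train_labels, query, k=5):
    """Return the k training examples whose representations are closest
    (by cosine similarity) to the query's representation."""
    train_reps = np.stack([encode(x) for x in train_inputs])  # shape (N, d)
    q = encode(query)                                         # shape (d,)
    sims = train_reps @ q / (np.linalg.norm(train_reps, axis=1)
                             * np.linalg.norm(q) + 1e-12)
    top = np.argsort(-sims)[:k]
    return [(train_inputs[i], train_labels[i], float(sims[i])) for i in top]
```

Inspecting the retrieved neighbors is one way to surface learned spurious associations, in the spirit of the summary above.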
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.