Towards Optimal Statistical Watermarking
- URL: http://arxiv.org/abs/2312.07930v3
- Date: Tue, 6 Feb 2024 21:01:28 GMT
- Title: Towards Optimal Statistical Watermarking
- Authors: Baihe Huang and Hanlin Zhu and Banghua Zhu and Kannan Ramchandran and
Michael I. Jordan and Jason D. Lee and Jiantao Jiao
- Abstract summary: We study statistical watermarking by formulating it as a hypothesis testing problem.
Key to our formulation is a coupling of the output tokens and the rejection region.
We characterize the Uniformly Most Powerful (UMP) watermark in the general hypothesis testing setting.
- Score: 95.46650092476372
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study statistical watermarking by formulating it as a hypothesis testing
problem, a general framework which subsumes all previous statistical
watermarking methods. Key to our formulation is a coupling of the output tokens
and the rejection region, realized by pseudo-random generators in practice,
that allows non-trivial trade-offs between the Type I error and Type II error.
We characterize the Uniformly Most Powerful (UMP) watermark in the general
hypothesis testing setting and the minimax Type II error in the model-agnostic
setting. In the common scenario where the output is a sequence of $n$ tokens,
we establish nearly matching upper and lower bounds on the number of i.i.d.
tokens required to guarantee small Type I and Type II errors. Our rate of
$\Theta(h^{-1} \log (1/h))$ with respect to the average entropy per token $h$
highlights potentials for improvement from the rate of $h^{-2}$ in the previous
works. Moreover, we formulate the robust watermarking problem where the user is
allowed to perform a class of perturbations on the generated texts, and
characterize the optimal Type II error of robust UMP tests via a linear
programming problem. To the best of our knowledge, this is the first systematic
statistical treatment on the watermarking problem with near-optimal rates in
the i.i.d. setting, which might be of interest for future works.
Related papers
- Point Prediction for Streaming Data [27.938266762930994]
We present two new approaches for point prediction with streaming data.
One is based on the Count-Min sketch (CMS) and the other is based on Gaussian process priors with a random bias.
arXiv Detail & Related papers (2024-08-02T15:12:52Z) - Dirichlet-Based Prediction Calibration for Learning with Noisy Labels [40.78497779769083]
Learning with noisy labels can significantly hinder the generalization performance of deep neural networks (DNNs)
Existing approaches address this issue through loss correction or example selection methods.
We propose the textitDirichlet-based Prediction (DPC) method as a solution.
arXiv Detail & Related papers (2024-01-13T12:33:04Z) - Improved Convergence of Score-Based Diffusion Models via Prediction-Correction [15.772322871598085]
Score-based generative models (SGMs) are powerful tools to sample from complex data distributions.
This paper addresses the issue by considering a version of the popular predictor-corrector scheme.
We first estimate the final distribution via an inexact Langevin dynamics and then revert the process.
arXiv Detail & Related papers (2023-05-23T15:29:09Z) - Transformers meet Stochastic Block Models: Attention with Data-Adaptive
Sparsity and Cost [53.746169882193456]
Recent works have proposed various sparse attention modules to overcome the quadratic cost of self-attention.
We propose a model that resolves both problems by endowing each attention head with a mixed-membership Block Model.
Our model outperforms previous efficient variants as well as the original Transformer with full attention.
arXiv Detail & Related papers (2022-10-27T15:30:52Z) - DenseHybrid: Hybrid Anomaly Detection for Dense Open-set Recognition [1.278093617645299]
Anomaly detection can be conceived either through generative modelling of regular training data or by discriminating with respect to negative training data.
This paper presents a novel hybrid anomaly score which allows dense open-set recognition on large natural images.
Experiments evaluate our contributions on standard dense anomaly detection benchmarks as well as in terms of open-mIoU - a novel metric for dense open-set performance.
arXiv Detail & Related papers (2022-07-06T11:48:50Z) - A Conditional Randomization Test for Sparse Logistic Regression in
High-Dimension [36.00360315353985]
emphCRT-logit is an algorithm that combines a variable-distillation step and a decorrelation step.
We provide a theoretical analysis of this procedure, and demonstrate its effectiveness on simulations, along with experiments on large-scale brain-imaging and genomics datasets.
arXiv Detail & Related papers (2022-05-29T09:37:16Z) - A Universal Error Measure for Input Predictions Applied to Online Graph
Problems [57.58926849872494]
We introduce a novel measure for quantifying the error in input predictions.
The measure captures errors due to absent predicted requests as well as unpredicted actual requests.
arXiv Detail & Related papers (2022-05-25T15:24:03Z) - SLOE: A Faster Method for Statistical Inference in High-Dimensional
Logistic Regression [68.66245730450915]
We develop an improved method for debiasing predictions and estimating frequentist uncertainty for practical datasets.
Our main contribution is SLOE, an estimator of the signal strength with convergence guarantees that reduces the computation time of estimation and inference by orders of magnitude.
arXiv Detail & Related papers (2021-03-23T17:48:56Z) - Breaking the Sample Size Barrier in Model-Based Reinforcement Learning
with a Generative Model [50.38446482252857]
This paper is concerned with the sample efficiency of reinforcement learning, assuming access to a generative model (or simulator)
We first consider $gamma$-discounted infinite-horizon Markov decision processes (MDPs) with state space $mathcalS$ and action space $mathcalA$.
We prove that a plain model-based planning algorithm suffices to achieve minimax-optimal sample complexity given any target accuracy level.
arXiv Detail & Related papers (2020-05-26T17:53:18Z) - Pre-training Is (Almost) All You Need: An Application to Commonsense
Reasoning [61.32992639292889]
Fine-tuning of pre-trained transformer models has become the standard approach for solving common NLP tasks.
We introduce a new scoring method that casts a plausibility ranking task in a full-text format.
We show that our method provides a much more stable training phase across random restarts.
arXiv Detail & Related papers (2020-04-29T10:54:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.