New Evaluation Metrics Capture Quality Degradation due to LLM
Watermarking
- URL: http://arxiv.org/abs/2312.02382v1
- Date: Mon, 4 Dec 2023 22:56:31 GMT
- Title: New Evaluation Metrics Capture Quality Degradation due to LLM
Watermarking
- Authors: Karanpartap Singh, James Zou
- Abstract summary: We introduce two new easy-to-use methods for evaluating watermarking algorithms for large-language models (LLMs).
Our experiments, conducted across various datasets, reveal that current watermarking methods are detectable by even simple classifiers.
Our findings underscore the trade-off between watermark robustness and text quality and highlight the importance of having more informative metrics to assess watermarking quality.
- Score: 28.53032132891346
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the increasing use of large-language models (LLMs) like ChatGPT,
watermarking has emerged as a promising approach for tracing machine-generated
content. However, research on LLM watermarking often relies on simple
perplexity or diversity-based measures to assess the quality of watermarked
text, which can mask important limitations in watermarking. Here we introduce
two new easy-to-use methods for evaluating watermarking algorithms for LLMs: 1)
evaluation by an LLM judger following specific guidelines; and 2) binary classification
on text embeddings to distinguish between watermarked and unwatermarked text.
We apply these methods to characterize the effectiveness of current
watermarking techniques. Our experiments, conducted across various datasets,
reveal that current watermarking methods are detectable by even simple
classifiers, challenging the notion of watermarking subtlety. We also found,
through the LLM judger, that watermarking impacts text quality, especially in
degrading the coherence and depth of the response. Our findings underscore the
trade-off between watermark robustness and text quality and highlight the
importance of having more informative metrics to assess watermarking quality.
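The first method scores generations with an LLM judge operating under fixed guidelines. Below is a minimal sketch of that setup, assuming the OpenAI chat-completions API; the judge model and the guideline text are illustrative placeholders, not the paper's actual prompt or criteria.

```python
# Sketch of an LLM-judger comparison under fixed guidelines.
# Assumes the OpenAI Python client (v1+); the guidelines and judge
# model below are illustrative, not the paper's exact setup.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

GUIDELINES = (
    "You are judging two responses to the same prompt. "
    "Rate each on coherence, depth, and relevance (1-5), "
    "then state which response is better overall."
)

def judge_pair(prompt: str, response_a: str, response_b: str) -> str:
    """Ask the judge model to compare two responses under the guidelines."""
    completion = client.chat.completions.create(
        model="gpt-4",  # judge model choice is an assumption
        messages=[
            {"role": "system", "content": GUIDELINES},
            {"role": "user", "content": (
                f"Prompt:\n{prompt}\n\n"
                f"Response A:\n{response_a}\n\n"
                f"Response B:\n{response_b}"
            )},
        ],
        temperature=0,  # deterministic judging
    )
    return completion.choices[0].message.content
```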
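The second method trains a binary classifier on text embeddings: if a simple classifier separates watermarked from unwatermarked text well above chance, the watermark leaves a detectable signature. A sketch using sentence-transformers embeddings and a scikit-learn logistic regression; the paper does not specify these exact components, so treat both as assumptions.

```python
# Sketch of the embedding-based detectability test: embed both text
# sets, train a linear probe, and report held-out accuracy. The
# embedding model and classifier are assumptions, not the paper's
# necessarily exact choices.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def watermark_detectability(watermarked: list[str],
                            unwatermarked: list[str]) -> float:
    """Held-out accuracy of a linear probe on text embeddings.
    Accuracy well above 0.5 means the watermark is easily detectable."""
    embedder = SentenceTransformer("all-MiniLM-L6-v2")
    X = embedder.encode(watermarked + unwatermarked)
    y = np.array([1] * len(watermarked) + [0] * len(unwatermarked))

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=0
    )
    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    return accuracy_score(y_test, clf.predict(X_test))
```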