Fugu-MT paper translation (abstract): On the Empirical Power of Goodness-of-Fit Tests in Watermark Detection

Paper abstract: On the Empirical Power of Goodness-of-Fit Tests in Watermark Detection

arxiv url: http://arxiv.org/abs/2510.03944v1
Date: Sat, 04 Oct 2025 21:07:06 GMT
Status: translation complete
Last updated in system: 2025-10-07 16:52:59.353066
Title: On the Empirical Power of Goodness-of-Fit Tests in Watermark Detection
Title (reference translation): On the Empirical Power of Goodness-of-Fit Tests in Watermark Detection
Authors: Weiqing He, Xiang Li, Tianqi Shang, Li Shen, Weijie Su, Qi Long
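Abstract summary: Eight GoF tests were systematically evaluated across three popular watermarking schemes. The results show that GoF tests can improve both the detection power and the robustness of watermark detectors.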
Reference score (independently computed attention score): 17.920479593691255
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) raise concerns about content authenticity and integrity because they can generate human-like text at scale. Text watermarks, which embed detectable statistical signals into generated text, offer a provable way to verify content origin. Many detection methods rely on pivotal statistics that are i.i.d. under human-written text, making goodness-of-fit (GoF) tests a natural tool for watermark detection. However, GoF tests remain largely underexplored in this setting. In this paper, we systematically evaluate eight GoF tests across three popular watermarking schemes, using three open-source LLMs, two datasets, various generation temperatures, and multiple post-editing methods. We find that general GoF tests can improve both the detection power and robustness of watermark detectors. Notably, we observe that text repetition, common in low-temperature settings, gives GoF tests a unique advantage not exploited by existing methods. Our results highlight that classic GoF tests are a simple yet powerful and underused tool for watermark detection in LLMs.
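Abstract (reference translation): Large language models (LLMs) raise concerns about content authenticity and integrity because they can generate human-like text at scale. Text watermarks, which embed detectable statistical signals into generated text, offer a provable way to verify content origin. Many detection methods rely on pivotal statistics that are i.i.d. under human-written text, which makes goodness-of-fit (GoF) tests a natural tool for watermark detection. However, GoF tests remain largely unexplored in this setting. This paper systematically evaluates eight GoF tests across three popular watermarking schemes, using three open-source LLMs, two datasets, various generation temperatures, and multiple post-editing methods. General GoF tests can improve both the detection power and the robustness of watermark detectors. In particular, text repetition, which is common in low-temperature settings, gives GoF tests a distinct advantage that existing methods do not exploit. These results show that classic GoF tests are a simple yet powerful and underused tool for watermark detection in LLMs.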
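
The idea in the abstract (pivotal statistics that are i.i.d. under human-written text, checked with a goodness-of-fit test) can be illustrated with a minimal sketch. It assumes the pivotal statistics are Uniform(0, 1) under the null; the abstract states only that they are i.i.d., so the uniform null, the function name `ks_uniform_test`, and the toy data are assumptions for illustration and are not taken from the paper. The Kolmogorov-Smirnov test is used here as one classic GoF test; the abstract does not say which eight tests were evaluated.

```python
import math
import random


def ks_uniform_test(pivots):
    """One-sample Kolmogorov-Smirnov test of H0: pivots are i.i.d. Uniform(0, 1).

    Returns (D, p) where D is the KS statistic and p is an asymptotic p-value.
    Hypothetical helper for illustration; not the paper's detector.
    """
    xs = sorted(pivots)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one pivotal statistic")
    # D = sup_x |F_n(x) - x|, checked on both sides of each jump of the ECDF.
    d = max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))
    # Asymptotic Kolmogorov tail: P(sqrt(n) * D > t) ~ 2 * sum (-1)^(k-1) exp(-2 k^2 t^2).
    t = math.sqrt(n) * d
    if t < 0.2:
        # The tail probability is indistinguishable from 1 this close to zero.
        return d, 1.0
    tail = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * t * t) for k in range(1, 101))
    return d, min(1.0, max(0.0, 2.0 * tail))


# Toy data (assumption): uniform draws stand in for human-written text, and
# u ** 0.5 (skewed toward 1) stands in for a watermarked signal. Neither is
# the pivot distribution of any specific watermarking scheme.
rng = random.Random(0)
human_like = [rng.random() for _ in range(200)]
watermark_like = [rng.random() ** 0.5 for _ in range(200)]

for name, sample in (("human-like", human_like), ("watermark-like", watermark_like)):
    d_stat, p_value = ks_uniform_test(sample)
    print(f"{name}: D={d_stat:.3f}, p={p_value:.3g}")
```

A detector built this way would flag text as watermarked when the p-value falls below a chosen significance level. Because the test compares the whole empirical distribution with the null, it does not depend on one hand-picked summary statistic.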

Related papers