Fugu-MT Paper Translation (Abstract): Vaporizer: Breaking Watermarking Schemes for Large Language Model Outputs

Paper overview: Vaporizer: Breaking Watermarking Schemes for Large Language Model Outputs

arxiv url: http://arxiv.org/abs/2605.07481v1
Date: Fri, 08 May 2026 09:24:29 GMT
Status: Translation complete
Last updated in system: 2026-05-11 19:43:38.952327
Title: Vaporizer: Breaking Watermarking Schemes for Large Language Model Outputs
Title (reference translation): Vaporizer: Breaking Watermarking Schemes for Large Language Model Outputs
Authors: Jonathan Hong Jin Ng, Anh Tu Ngo, Anupam Chattopadhyay
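Abstract summary: We investigate recent state-of-the-art schemes for watermarking the outputs of large language models (LLMs). We analyse the effectiveness of these watermarking techniques against an extensive collection of modified-text attacks.
Reference score (independently computed attention score): 2.5756681494057045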
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In this paper, we investigate recent state-of-the-art schemes for watermarking large language model (LLM) outputs. These techniques are claimed to be robust, scalable and production-grade, aimed at promoting responsible usage of LLMs. We analyse the effectiveness of these watermarking techniques against an extensive collection of modified text attacks, which perform targeted semantic changes without altering the general meaning of the text content. Our approach encompasses multiple attack strategies, including lexical alterations, machine translation, and neural paraphrasing. Attack efficacy is measured against two target criteria: successful removal of the watermark and preservation of semantic content. We evaluate semantic preservation through BERT scores, text complexity measures, grammatical errors, and Flesch Reading Ease indices. The experimental results reveal varying levels of effectiveness among different watermarking models, with the same underlying result: it is possible to remove the watermark with reasonable effort. This study sheds light on the strengths and weaknesses of existing LLM watermarking systems, suggesting how they should be constructed to improve the security of available schemes.
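Abstract (reference translation): This paper investigates recent state-of-the-art schemes for watermarking the outputs of large language models (LLMs). These techniques are claimed to be robust, scalable, and production-grade, and aim to promote the responsible use of LLMs. We analyse their effectiveness against an extensive collection of modified-text attacks that make targeted semantic changes without altering the general meaning of the text. Our approach covers multiple attack strategies, including lexical alterations, machine translation, and neural paraphrasing. Attack efficacy is measured against two criteria: removal of the watermark and preservation of semantic content. Semantic preservation is evaluated using BERT scores, text complexity measures, grammatical errors, and the Flesch Reading Ease index. The experiments show that effectiveness varies across watermarking models, but the underlying result is the same: the watermark can be removed with reasonable effort. The study sheds light on the strengths and weaknesses of existing LLM watermarking systems and suggests how they should be constructed to improve the security of available schemes.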

Related papers