Fugu-MT Paper Translation (Abstract): $k$NNProxy: Efficient Training-Free Proxy Alignment for Black-Box Zero-Shot LLM-Generated Text Detection

Paper overview: $k$NNProxy: Efficient Training-Free Proxy Alignment for Black-Box Zero-Shot LLM-Generated Text Detection

arxiv url: http://arxiv.org/abs/2604.02008v1
Date: Thu, 02 Apr 2026 13:11:06 GMT
Status: Translation complete
Last updated in system: 2026-04-03 14:21:10.803358
Title: $k$NNProxy: Efficient Training-Free Proxy Alignment for Black-Box Zero-Shot LLM-Generated Text Detection
Title (reference translation): $k$NNProxy: Efficient Training-Free Proxy Alignment for Black-Box Zero-Shot LLM-Generated Text Detection
Authors: Kahim Wong, Kemou Li, Haiwei Wu, Jiantao Zhou
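Abstract summary: Existing LGT detectors fall into two broad classes: learning-based approaches and zero-shot methods. The reliability of zero-shot methods depends on the assumption that an off-the-shelf proxy LLM is well aligned with the often unknown source LLM. The authors propose the $k$-nearest neighbor proxy ($k$NNProxy), a training-free and query-efficient proxy alignment framework.
Reference score (independently computed attention metric): 19.213077720525696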
License: http://creativecommons.org/licenses/by/4.0/
Abstract: LLM-generated text (LGT) detection is essential for reliable forensic analysis and for mitigating LLM misuse. Existing LGT detectors can generally be categorized into two broad classes: learning-based approaches and zero-shot methods. Compared with learning-based detectors, zero-shot methods are particularly promising because they eliminate the need to train task-specific classifiers. However, the reliability of zero-shot methods fundamentally relies on the assumption that an off-the-shelf proxy LLM is well aligned with the often unknown source LLM, a premise that rarely holds in real-world black-box scenarios. To address this discrepancy, existing proxy alignment methods typically rely on supervised fine-tuning of the proxy or repeated interactions with commercial APIs, thereby increasing deployment costs, exposing detectors to silent API changes, and limiting robustness under domain shift. Motivated by these limitations, we propose the $k$-nearest neighbor proxy ($k$NNProxy), a training-free and query-efficient proxy alignment framework that repurposes the $k$NN language model ($k$NN-LM) retrieval mechanism as a domain adapter for a fixed proxy LLM. Specifically, a lightweight datastore is constructed once from a target-reflective LGT corpus, either via fixed-budget querying or from existing datasets. During inference, nearest-neighbor evidence induces a token-level predictive distribution that is interpolated with the proxy output, yielding an aligned prediction without proxy fine-tuning or per-token API outputs. To improve robustness under domain shift, we extend $k$NNProxy into a mixture of proxies (MoP) that routes each input to a domain-specific datastore for domain-consistent retrieval. Extensive experiments demonstrate strong detection performance of our method.
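Abstract (reference translation): Detecting LLM-generated text (LGT) is essential for reliable forensic analysis and for mitigating LLM misuse. Existing LGT detectors generally fall into two broad classes: learning-based approaches and zero-shot methods. Compared with learning-based detectors, zero-shot methods are particularly promising because they remove the need to train a task-specific classifier. However, the reliability of zero-shot methods fundamentally depends on the assumption that an off-the-shelf proxy LLM is well aligned with the often unknown source LLM, an assumption that rarely holds in real-world black-box scenarios. To address this mismatch, existing proxy alignment methods typically rely on supervised fine-tuning of the proxy or on repeated interactions with commercial APIs, which increases deployment costs, exposes detectors to silent API changes, and limits robustness under domain shift. The authors propose the $k$-nearest neighbor proxy ($k$NNProxy), a training-free and query-efficient proxy alignment framework that repurposes the retrieval mechanism of the $k$NN language model ($k$NN-LM) as a domain adapter for a fixed proxy LLM. Specifically, a lightweight datastore is built once from a target-reflective LGT corpus, obtained either through fixed-budget querying or from existing datasets. During inference, nearest-neighbor evidence induces a token-level predictive distribution that is interpolated with the proxy output, producing an aligned prediction without proxy fine-tuning or per-token API outputs. To improve robustness under domain shift, $k$NNProxy is extended into a mixture of proxies (MoP) that routes each input to a domain-specific datastore for domain-consistent retrieval. Extensive experiments demonstrate the strong detection performance of the method.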
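
The interpolation step described in the abstract can be illustrated with a minimal sketch of the standard $k$NN-LM formulation, in which the $k$ nearest datastore entries are turned into a next-token distribution and mixed with the proxy's distribution. This is not the authors' implementation: the abstract does not specify the distance metric, temperature, or mixing weight, so the squared Euclidean distance, `temperature`, `lam`, and the names `knn_distribution` and `interpolate` are all assumptions made for illustration, and the toy vectors stand in for proxy hidden states.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical datastore entry: (context key vector, next-token id).
Entry = Tuple[Sequence[float], int]


def knn_distribution(query: Sequence[float], datastore: List[Entry], k: int,
                     vocab_size: int, temperature: float = 1.0) -> List[float]:
    """Turn the k nearest datastore entries into a next-token distribution."""
    # Squared Euclidean distance from the query to every stored key (assumed metric).
    scored = sorted(
        (sum((q - x) ** 2 for q, x in zip(query, key)), token)
        for key, token in datastore
    )[:k]
    # Softmax over negative distances; subtracting the minimum avoids underflow.
    d_min = scored[0][0]
    weights = [math.exp(-(d - d_min) / temperature) for d, _ in scored]
    total = sum(weights)
    probs = [0.0] * vocab_size
    for w, (_, token) in zip(weights, scored):
        probs[token] += w / total  # neighbors sharing a token pool their mass
    return probs


def interpolate(p_proxy: Sequence[float], p_knn: Sequence[float],
                lam: float) -> List[float]:
    """Mix the retrieval distribution with the fixed proxy's distribution."""
    return [lam * pk + (1.0 - lam) * pp for pp, pk in zip(p_proxy, p_knn)]


# Toy example: a 3-token vocabulary and a 3-entry datastore (assumed values).
datastore = [((0.0, 0.0), 1), ((0.1, 0.0), 1), ((5.0, 5.0), 2)]
p_proxy = [0.5, 0.2, 0.3]
p_knn = knn_distribution((0.0, 0.0), datastore, k=2, vocab_size=3)
p_aligned = interpolate(p_proxy, p_knn, lam=0.25)
```

In this toy case both retrieved neighbors vote for token 1, so the aligned distribution shifts probability toward that token while the proxy itself stays unchanged. A zero-shot detector would then compute its score from `p_aligned` rather than from `p_proxy`.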
