Fugu-MT Paper Translation (Abstract): MELD: Multi-Task Equilibrated Learning Detector for AI-Generated Text

Paper overview: MELD: Multi-Task Equilibrated Learning Detector for AI-Generated Text

arxiv url: http://arxiv.org/abs/2605.06903v1
Date: Thu, 07 May 2026 20:05:38 GMT
Status: Translation complete
System update date: 2026-05-11 19:43:38.595136
Title: MELD: Multi-Task Equilibrated Learning Detector for AI-Generated Text
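Title (reference translation): MELD: A Multi-Task Equilibrated Learning Detector for AI-Generated Text
Authors: Chenjun Li, Cheng Wan, Johannes C. Paetzold
Abstract summary: MELD is a deployable detector for AI-generated text that enriches binary detection with auxiliary supervision. On the public RAID leaderboard, MELD is the strongest open-source detector. MELD achieves 99.9% TPR at 1% FPR on MELD-eval, while many baselines degrade sharply.
Reference score (attention score computed independently by this site): 5.175537650981894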
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models are now embedded in everyday writing workflows, making reliable AI-generated text detection important for academic integrity, content moderation, and provenance tracking. In practice, however, a detector must do more than achieve high aggregate AUROC on clean, in-distribution human and AI text: it should remain robust to attacks and adversarial rewrites, transfer to unseen generators and domains, and operate at low false-positive rates (FPR). Most existing detectors optimize a single AI/Human objective, giving the representation little incentive to learn generator, attack, or domain structure once the binary task saturates. We introduce MELD (Multi-Task Equilibrated Learning Detector), a deployable detector for AI-generated text that enriches binary detection with auxiliary supervision. MELD attaches generator-family, attack-type, and source-domain heads to a shared encoder, and balances the four losses with learned homoscedastic uncertainty weights. To improve robustness, an EMA teacher predicts on clean inputs while an attack-augmented student is distilled toward the teacher. MELD further uses a hard-negative pairwise ranking loss to enlarge the score margin between AI-generated texts and the most confusable human texts. At inference, all auxiliary heads are discarded, giving MELD the same interface and cost as a standard detector. On the public RAID leaderboard, MELD is the strongest open-source detector and is competitive with leading commercial models, especially under attack and at low FPR. Across standard held-out benchmarks, MELD matches or outperforms supervised baselines. We further introduce MELD-eval, a held-out evaluation pool built from recent chat models released by four major LLM providers. Without additional finetuning, MELD achieves 99.9% TPR at 1% FPR on MELD-eval, while many baselines degrade sharply.
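As a rough illustration of two training objectives named in the abstract, the sketch below combines per-task losses with learned uncertainty weights and computes a hard-negative pairwise ranking loss. The abstract does not give the exact formulas, so both are assumptions: the weighting uses a common simplified form, sum_i exp(-s_i) * L_i + s_i with s_i = log(sigma_i^2), and the ranking term is a hinge loss against the highest-scoring human text in a batch. The function names, the margin, and the example numbers are hypothetical.

```python
import math


def uncertainty_weighted_loss(task_losses, log_variances):
    """Combine task losses as sum_i exp(-s_i) * L_i + s_i, where s_i = log(sigma_i^2).

    Assumed form; the paper's exact weighting may differ (e.g. extra 1/2 factors).
    """
    if len(task_losses) != len(log_variances):
        raise ValueError("one log-variance per task is required")
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_variances))


def hard_negative_ranking_loss(ai_scores, human_scores, margin=1.0):
    """Hinge loss that pushes every AI score above the hardest human score by `margin`.

    The hardest negative is taken to be the human text with the highest detector score.
    """
    hardest_human = max(human_scores)
    per_example = [max(0.0, margin - (score - hardest_human)) for score in ai_scores]
    return sum(per_example) / len(per_example)


# Hypothetical per-task losses: binary, generator family, attack type, source domain.
losses = [0.2, 1.5, 0.9, 0.6]
# With all log-variances at zero, every task has weight 1 and the result is the plain sum.
print(uncertainty_weighted_loss(losses, [0.0, 0.0, 0.0, 0.0]))
# Only the AI text scoring 0.5 violates the margin against the hardest human (0.3).
print(hard_negative_ranking_loss([2.0, 0.5], [-1.0, 0.3]))
```

In a real training loop the log-variances would be trainable parameters updated together with the encoder, so a task with a persistently large loss is down-weighted automatically, at the cost of the additive s_i penalty.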
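Abstract (reference translation): Large language models are now embedded in everyday writing workflows, which makes reliable detection of AI-generated text important for academic integrity, content moderation, and provenance tracking. In practice, however, a detector must do more than achieve a high aggregate AUROC on clean, in-distribution human and AI text: it should remain robust to attacks and adversarial rewrites, transfer to unseen generators and domains, and operate at low false-positive rates (FPR). Most existing detectors optimize a single AI/Human objective, which gives the representation little incentive to learn generator, attack, or domain structure once the binary task saturates. We introduce MELD (Multi-Task Equilibrated Learning Detector), a deployable detector for AI-generated text that enriches binary detection with auxiliary supervision. MELD attaches generator-family, attack-type, and source-domain heads to a shared encoder and balances the four losses with learned homoscedastic uncertainty weights. To improve robustness, an EMA teacher predicts on clean inputs while an attack-augmented student is distilled toward the teacher. MELD also uses a hard-negative pairwise ranking loss to widen the score margin between AI-generated texts and the most confusable human texts. At inference, all auxiliary heads are discarded, so MELD has the same interface and cost as a standard detector. On the public RAID leaderboard, MELD is the strongest open-source detector and is competitive with leading commercial models, especially under attack and at low FPR. Across standard held-out benchmarks, MELD matches or outperforms supervised baselines. We further introduce MELD-eval, a held-out evaluation pool built from recent chat models released by four major LLM providers. Without additional finetuning, MELD achieves 99.9% TPR at 1% FPR on MELD-eval, while many baselines degrade sharply.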
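The headline metric, TPR at a fixed FPR, can be sketched as follows. This is a minimal illustration and not the paper's evaluation code: it assumes higher scores mean "more likely AI-generated", flags a text only when its score is strictly above the threshold, and picks the threshold from the human scores alone. The function name and the toy scores are hypothetical.

```python
def tpr_at_fpr(human_scores, ai_scores, target_fpr=0.01):
    """Return (threshold, tpr) such that the false-positive rate is at most target_fpr.

    A text is flagged as AI-generated when its score is strictly greater than the threshold.
    Both score lists are assumed to be non-empty.
    """
    ranked_humans = sorted(human_scores, reverse=True)
    # Number of human texts that may be misclassified without exceeding the target FPR.
    allowed_false_positives = int(target_fpr * len(ranked_humans))
    # Clamp so that target_fpr = 1.0 still indexes a valid element.
    index = min(allowed_false_positives, len(ranked_humans) - 1)
    # Only the human texts ranked above this one can score strictly higher than the threshold.
    threshold = ranked_humans[index]
    true_positives = sum(score > threshold for score in ai_scores)
    return threshold, true_positives / len(ai_scores)


# Toy scores: ten human texts (one of them a confusable outlier) and four AI texts.
humans = [0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.9]
ai = [0.5, 0.6, 0.95, 0.3]
threshold, tpr = tpr_at_fpr(humans, ai, target_fpr=0.1)
print(threshold, tpr)
```

Because the threshold depends only on human texts, one confusable human outlier can push it up and lower the TPR, which is why a low-FPR operating point is a stricter test than aggregate AUROC.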

Related papers