Fugu-MT Paper Translation (Abstract): Real, Fake, or Manipulated? Detecting Machine-Influenced Text

Paper abstract: Real, Fake, or Manipulated? Detecting Machine-Influenced Text

arxiv url: http://arxiv.org/abs/2509.15350v1
Date: Thu, 18 Sep 2025 18:41:57 GMT
Status: Translation complete
System update date: 2025-09-22 18:18:10.871239
Title: Real, Fake, or Manipulated? Detecting Machine-Influenced Text
Authors: Yitong Wang, Zhongping Zhang, Margherita Piana, Zheng Zhou, Peter Gerstoft, Bryan A. Plummer
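Abstract summary: We introduce a HiErarchical, length-RObust machine-influenced text detector (HERO). HERO learns to separate text samples of varying lengths from four primary types: human-written, machine-generated, machine-polished, and machine-translated.
Reference score (independently computed attention level): 56.32138057356434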
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs) can be used to write or modify documents, presenting a challenge for understanding the intent behind their use. For example, benign uses may involve using an LLM on a human-written document to improve its grammar or to translate it into another language. However, a document entirely produced by an LLM may be more likely to be used to spread misinformation than a simple translation (e.g., through use by malicious actors or simply through hallucination). Prior work in Machine Generated Text (MGT) detection mostly focuses on simply identifying whether a document was human or machine written, ignoring these fine-grained uses. In this paper, we introduce a HiErarchical, length-RObust machine-influenced text detector (HERO), which learns to separate text samples of varying lengths from four primary types: human-written, machine-generated, machine-polished, and machine-translated. HERO accomplishes this by combining predictions from length-specialist models that have been trained with Subcategory Guidance. Specifically, for categories that are easily confused (e.g., different source languages), our Subcategory Guidance module encourages separation of the fine-grained categories, boosting performance. Extensive experiments across five LLMs and six domains demonstrate the benefits of HERO, which outperforms the state of the art by 2.5-3 mAP on average.
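The abstract says that HERO combines predictions from length-specialist models but does not say how. The sketch below is only one plausible illustration, not the authors' method: it assumes each specialist is associated with a target token length and weights its four-way prediction by how close the input length is to that target. The names `length_weights`, `combine_predictions`, and the example lengths and probabilities are all hypothetical.

```python
import math

# The four primary types named in the abstract.
CATEGORIES = [
    "human-written",
    "machine-generated",
    "machine-polished",
    "machine-translated",
]


def length_weights(n_tokens, specialist_lengths, temperature=1.0):
    """Softmax weights that favor specialists whose (assumed) target length
    is closest to the input length, measured on a log scale."""
    n_tokens = max(n_tokens, 1)  # guard against log(0)
    scores = [
        -abs(math.log(n_tokens) - math.log(length)) / temperature
        for length in specialist_lengths
    ]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]


def combine_predictions(n_tokens, specialist_probs, specialist_lengths):
    """Weighted average of per-specialist class probabilities.

    specialist_probs holds one probability vector over CATEGORIES per
    specialist, in the same order as specialist_lengths.
    """
    weights = length_weights(n_tokens, specialist_lengths)
    combined = [0.0] * len(CATEGORIES)
    for weight, probs in zip(weights, specialist_probs):
        for i, p in enumerate(probs):
            combined[i] += weight * p
    return dict(zip(CATEGORIES, combined))


if __name__ == "__main__":
    # Hypothetical specialists for short, medium, and long inputs.
    lengths = [64, 256, 1024]
    probs = [
        [0.70, 0.10, 0.10, 0.10],  # short-text specialist
        [0.25, 0.25, 0.25, 0.25],  # medium-text specialist
        [0.10, 0.70, 0.10, 0.10],  # long-text specialist
    ]
    result = combine_predictions(64, probs, lengths)
    print(max(result, key=result.get))
```

With a 64-token input, the short-text specialist dominates the average, so its preferred label wins. A learned gating function or a hierarchical decision would be equally consistent with the abstract's wording.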

Related papers