Fugu-MT Paper Translation (Abstract): Machines in the Crowd? Measuring the Footprint of Machine-Generated Text on Reddit

Paper overview: Machines in the Crowd? Measuring the Footprint of Machine-Generated Text on Reddit

arxiv url: http://arxiv.org/abs/2510.07226v1
Date: Wed, 08 Oct 2025 16:57:45 GMT
Status: Translation complete
Last updated in system: 2025-10-09 16:41:20.643588
Title: Machines in the Crowd? Measuring the Footprint of Machine-Generated Text on Reddit
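Title (reference translation): Machines in the Crowd: Measuring the Footprint of Machine-Generated Text on Reddit
Authors: Lucio La Cava, Luca Maria Aiello, Andrea Tagarelli
Abstract summary: The paper presents a large-scale characterization of Machine-Generated Text (MGT) on Reddit. Using a state-of-the-art statistical method for MGT detection, the authors analyze over two years of activity (2022-2024) across 51 subreddits. Their very conservative estimate of MGT prevalence indicates that synthetic text is only marginally present on Reddit, although it can reach up to 9% in some communities.
Reference score (attention score computed by this site): 8.318350327150437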
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Generative Artificial Intelligence is reshaping online communication by enabling large-scale production of Machine-Generated Text (MGT) at low cost. While its presence is rapidly growing across the Web, little is known about how MGT integrates into social media environments. In this paper, we present the first large-scale characterization of MGT on Reddit. Using a state-of-the-art statistical method for detection of MGT, we analyze over two years of activity (2022-2024) across 51 subreddits representative of Reddit's main community types such as information seeking, social support, and discussion. We study the concentration of MGT across communities and over time, and compare MGT to human-authored text in terms of the social signals it expresses and the engagement it receives. Our very conservative estimate of MGT prevalence indicates that synthetic text is marginally present on Reddit, but it can reach peaks of up to 9% in some communities in some months. MGT is unevenly distributed across communities, more prevalent in subreddits focused on technical knowledge and social support, and often concentrated in the activity of a small fraction of users. MGT also conveys distinct social signals of warmth and status-giving typical of the language of AI assistants. Despite these stylistic differences, MGT achieves engagement levels comparable to those of human-authored content, and in a few cases even higher, suggesting that AI-generated text is becoming an organic component of online social discourse. This work offers the first perspective on the MGT footprint on Reddit, paving the way for new investigations involving platform governance, detection strategies, and community dynamics.
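Abstract (reference translation): Generative Artificial Intelligence is reshaping online communication by making it possible to produce Machine-Generated Text (MGT) at scale and at low cost. Although its presence across the Web is growing rapidly, little is known about how MGT integrates into social media environments. This paper presents the first large-scale characterization of MGT on Reddit. Using a state-of-the-art statistical method for MGT detection, the authors analyze over two years of activity (2022-2024) across 51 subreddits representative of Reddit's main community types, such as information seeking, social support, and discussion. They study the concentration of MGT across communities and over time, and compare MGT with human-authored text in terms of the social signals it expresses and the engagement it receives. Their very conservative estimate of MGT prevalence indicates that synthetic text is marginally present on Reddit, but it can reach peaks of up to 9% in some communities in some months. MGT is unevenly distributed across communities, more prevalent in subreddits focused on technical knowledge and social support, and often concentrated in the activity of a small fraction of users. MGT also conveys distinct social signals of warmth and status-giving that are typical of the language of AI assistants. Despite these stylistic differences, MGT achieves engagement levels comparable to those of human-authored content, and in a few cases even higher, suggesting that AI-generated text is becoming an organic component of online social discourse. This work offers the first perspective on the MGT footprint on Reddit, paving the way for new investigations involving platform governance, detection strategies, and community dynamics.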

Related papers