Fugu-MT Paper Translation (Abstract): Synthetic Sources?: Auditing Generative Search Engine Citations for Evidence of AI-Generated Sources

Paper overview: Synthetic Sources?: Auditing Generative Search Engine Citations for Evidence of AI-Generated Sources

arxiv url: http://arxiv.org/abs/2605.23684v1
Date: Fri, 22 May 2026 14:33:52 GMT
Status: Translation complete
System update date: 2026-05-25 17:29:20.389946
Title: Synthetic Sources?: Auditing Generative Search Engine Citations for Evidence of AI-Generated Sources
Authors: Mowafak Allaham, Nicholas Diakopoulos
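Abstract summary: It is unclear whether generative search engines can reliably avoid citing synthetic sources. This work presents an audit of four generative search engines using 712 real-world human-generated queries. The findings show evidence of AI-generated sources being cited across all four generative search engines.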
Reference score (independently computed attention measure): 3.2065257821139195
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The growing accessibility of Large Language Models via conversational interfaces capable of responding to users' questions by drawing on, synthesizing, and citing information from the web (i.e., Generative Search Engines) has simplified the information-seeking process for users. However, with the proliferation of AI-generated content on the web, it is unclear whether these engines can reliably omit citing synthetic sources (i.e., AI-generated sources). Should these engines be unable to do so, this puts users at risk of harm by treating information from AI-generated sources synthesized in responses of generative search engines as equivalent to information from authoritative or official sources. In a step towards identifying whether AI-generated sources are being cited by these engines, this work presents an audit of four generative search engines (ChatGPT, Copilot, Gemini, Perplexity) using a total of 712 real-world human-generated queries spanning domains of public importance: politics, health, and the environment. Our findings show evidence of AI-generated sources being cited across all four generative search engines (~16% of cited sources) and identify key source web domains these sources belong to that are frequently cited across these engines and topics. In addition, we observed that generative search engines include a somewhat narrow set of repeatedly cited domains while predominantly surfacing a large number of minimally cited domains in responses to users' queries. These findings contribute to the growing body of work on assessing the risks of generative search engines with the objective of increasing public awareness of their limitations and encouraging appropriate measures to improve information quality and governance of these systems.
