Generative AI Search Engines as Arbiters of Public Knowledge: An Audit of Bias and Authority
- URL: http://arxiv.org/abs/2405.14034v1
- Date: Wed, 22 May 2024 22:09:32 GMT
- Title: Generative AI Search Engines as Arbiters of Public Knowledge: An Audit of Bias and Authority
- Authors: Alice Li, Luanne Sinnamon
- Abstract summary: This paper reports on an audit study of generative AI systems (ChatGPT, Bing Chat, and Perplexity) that investigates how these new search engines construct responses.
We collected system responses using a set of 48 authentic queries for 4 topics over a 7-day period and analyzed the data using sentiment analysis, inductive coding, and source classification.
Results characterize system responses across these systems and provide evidence of sentiment bias based on the queries and topics, as well as commercial and geographic bias in sources.
- Score: 2.860575804107195
- Abstract: This paper reports on an audit study of generative AI systems (ChatGPT, Bing Chat, and Perplexity) that investigates how these new search engines construct responses and establish authority for topics of public importance. We collected system responses using a set of 48 authentic queries for 4 topics over a 7-day period and analyzed the data using sentiment analysis, inductive coding, and source classification. Results characterize system responses across these systems and provide evidence of sentiment bias based on the queries and topics, as well as commercial and geographic bias in sources. The quality of sources used to support claims is uneven, relying heavily on News and Media, Business, and Digital Media websites. Implications for system users emphasize the need to critically examine generative AI system outputs when making decisions related to public interest and personal well-being.
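To make the analysis stage concrete, here is a minimal sketch of the sentiment-scoring and source-classification steps, assuming responses have already been collected. The sample responses, topic label, and domain-to-category map are illustrative assumptions, and NLTK's VADER stands in for whatever sentiment model the study actually used.

```python
# Minimal sketch of the audit's analysis stage, under stated assumptions:
# the responses, topic labels, and source taxonomy below are illustrative,
# and VADER is a stand-in for the study's actual sentiment method.
from urllib.parse import urlparse

import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)

# Hypothetical collected data: (system, topic, response text, cited URLs).
responses = [
    ("ChatGPT", "topic-1", "The evidence strongly supports this practice.", []),
    ("Perplexity", "topic-1", "Most sources report positive outcomes.",
     ["https://www.cdc.gov/page", "https://example-news.com/story"]),
]

# Illustrative domain -> category map standing in for a real source taxonomy.
SOURCE_CATEGORIES = {
    "www.cdc.gov": "Government",
    "example-news.com": "News and Media",
}

sia = SentimentIntensityAnalyzer()
for system, topic, text, urls in responses:
    sentiment = sia.polarity_scores(text)["compound"]  # -1 (neg) .. +1 (pos)
    categories = [SOURCE_CATEGORIES.get(urlparse(u).netloc, "Unclassified")
                  for u in urls]
    print(system, topic, round(sentiment, 3), categories)
```

Aggregating the compound scores per query and topic is what would surface the kind of sentiment bias reported above; tallying category labels per system would surface the commercial and geographic skew in sources.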
Related papers
- Do RAG Systems Cover What Matters? Evaluating and Optimizing Responses with Sub-Question Coverage [74.70255719194819]
We introduce a novel framework based on sub-question coverage, which measures how well a RAG system addresses different facets of a question.
We use this framework to evaluate three commercial generative answer engines: You.com, Perplexity AI, and Bing Chat.
We find that while all answer engines cover core sub-questions more often than background or follow-up ones, they still miss around 50% of core sub-questions.
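As a rough illustration of what a coverage score can look like (the paper's exact definition may differ), the sketch below computes the fraction of sub-questions an answer addresses; the sub-questions and "addressed" judgments are hypothetical.

```python
# Illustrative sub-question coverage: fraction of sub-questions addressed.
# The data and the "addressed" judgments are hypothetical, not the paper's
# actual annotation pipeline.
def coverage(addressed: set[str], sub_questions: list[str]) -> float:
    """Fraction of sub-questions that the answer addresses."""
    if not sub_questions:
        return 0.0
    return sum(q in addressed for q in sub_questions) / len(sub_questions)

core = ["What causes X?", "How is X treated?"]
background = ["When was X first described?"]
addressed = {"What causes X?"}  # e.g., from a human or LLM judge

print(coverage(addressed, core))        # 0.5 -> half the core facets missed
print(coverage(addressed, background))  # 0.0
```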
arXiv Detail & Related papers (2024-10-20T22:59:34Z) - Web Retrieval Agents for Evidence-Based Misinformation Detection [12.807650005708911]
This paper develops an agent-based automated fact-checking approach for detecting misinformation.
We demonstrate that combining a powerful LLM agent, which does not have access to the internet for searches, with an online web search agent yields better results than when each tool is used independently.
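A hedged sketch of that division of labor follows; both helpers are hypothetical placeholders rather than the paper's implementation.

```python
# Sketch of the described setup: a web-search agent fetches evidence and an
# offline LLM (no internet access) judges the claim against it. Both
# functions are hypothetical placeholders, not the paper's implementation.
def web_search_agent(claim: str) -> list[str]:
    """Placeholder: would query a search engine and return evidence snippets."""
    return ["Health agencies warn that ingesting bleach is dangerous."]

def offline_llm_verdict(claim: str, evidence: list[str]) -> str:
    """Placeholder: would prompt an LLM to weigh the claim against evidence."""
    prompt = f"Claim: {claim}\nEvidence:\n" + "\n".join(evidence)
    # A real implementation would send `prompt` to a model; this is a stub.
    return "refuted" if evidence else "not enough information"

claim = "Drinking bleach cures the flu."
print(offline_llm_verdict(claim, web_search_agent(claim)))  # refuted
```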
arXiv Detail & Related papers (2024-08-15T15:13:16Z) - Measuring and Addressing Indexical Bias in Information Retrieval [69.7897730778898]
The PAIR framework supports automatic bias audits for ranked documents or entire IR systems.
After introducing DUO, our automatic bias metric, we run an extensive evaluation of 8 IR systems on a new corpus of 32k synthetic and 4.7k natural documents.
A human behavioral study validates our approach, showing that our bias metric can help predict when and how indexical bias will shift a reader's opinion.
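The DUO metric itself is not reproduced here; the toy rank-weighted balance score below only conveys the intuition that documents near the top of a ranking weigh more when measuring which side of a debated topic a result list favors. The stance labels are assumed inputs.

```python
# Toy rank-weighted balance score for a ranked list; NOT the paper's DUO
# metric. Each stance is +1 or -1 for the side of a debated topic that a
# document favors; higher ranks get larger weights.
def rank_weighted_balance(stances: list[int]) -> float:
    """Returns ~0 for a balanced ranking; larger magnitude = stronger skew."""
    weights = [1.0 / (rank + 1) for rank in range(len(stances))]  # top-heavy
    return sum(w * s for w, s in zip(weights, stances)) / sum(weights)

print(rank_weighted_balance([+1, +1, -1, -1]))  # ~0.44: one side dominates the top
print(rank_weighted_balance([+1, -1, +1, -1]))  # ~0.28: interleaving reduces skew
```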
arXiv Detail & Related papers (2024-06-06T17:42:37Z) - ExpertQA: Expert-Curated Questions and Attributed Answers [51.68314045809179]
We collect expert-curated questions from 484 participants across 32 fields of study, and then ask the same experts to evaluate generated responses to their own questions.
We conduct human evaluation of responses from a few representative systems along various axes of attribution and factuality.
The output of our analysis is ExpertQA, a high-quality long-form QA dataset with 2177 questions spanning 32 fields, along with verified answers and attributions for claims in the answers.
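The record below only illustrates that shape; every field name is an assumption for exposition, not the dataset's actual schema.

```python
# Hypothetical shape of one ExpertQA-style record: a question from an expert's
# field, a long-form answer, and per-claim attributions. Field names are
# assumptions, not the released dataset's schema.
record = {
    "field": "Medicine",
    "question": "What are the long-term effects of ...?",
    "answer": "Long-form answer text ...",
    "claims": [
        {
            "text": "One claim made in the answer.",
            "citations": ["https://doi.org/..."],  # attribution for the claim
            "expert_verified": True,
        },
    ],
}
print(record["field"], len(record["claims"]))
```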
arXiv Detail & Related papers (2023-09-14T16:54:34Z) - Evaluating Verifiability in Generative Search Engines [70.59477647085387]
Generative search engines directly generate responses to user queries, along with in-line citations.
We conduct human evaluation to audit four popular generative search engines.
We find that responses from existing generative search engines are fluent and appear informative, but frequently contain unsupported statements and inaccurate citations.
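Audits of this kind are often summarized with citation recall (over generated statements) and citation precision (over citations); the sketch below shows the arithmetic, with hypothetical per-item judgments standing in for human annotations.

```python
# Citation recall/precision arithmetic over hypothetical support judgments;
# in an actual audit such judgments come from human evaluators.
def citation_recall(statement_supported: list[bool]) -> float:
    """Fraction of generated statements fully supported by their citations."""
    if not statement_supported:
        return 0.0
    return sum(statement_supported) / len(statement_supported)

def citation_precision(citation_supports: list[bool]) -> float:
    """Fraction of citations that actually support their statement."""
    if not citation_supports:
        return 0.0
    return sum(citation_supports) / len(citation_supports)

# One audited response: 4 statements, 3 citations (illustrative judgments).
print(citation_recall([True, False, True, False]))  # 0.5
print(citation_precision([True, True, False]))      # ~0.67
```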
arXiv Detail & Related papers (2023-04-19T17:56:12Z) - Towards Corpus-Scale Discovery of Selection Biases in News Coverage:
Comparing What Sources Say About Entities as a Start [65.28355014154549]
This paper investigates the challenges of building scalable NLP systems for discovering patterns of media selection biases directly from news content in massive-scale news corpora.
We show the capabilities of the framework through a case study on NELA-2020, a corpus of 1.8M news articles in English from 519 news sources worldwide.
arXiv Detail & Related papers (2023-04-06T23:36:45Z) - Search-Engine-augmented Dialogue Response Generation with Cheaply
Supervised Query Production [98.98161995555485]
We propose a dialogue model that can access the vast and dynamic information from any search engine for response generation.
As the core module, a query producer is used to generate queries from a dialogue context to interact with a search engine.
Experiments show that our query producer can achieve R@1 and R@5 rates of 62.4% and 74.8% for retrieving gold knowledge.
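R@1 and R@5 are standard recall-at-k figures; here is a minimal sketch of how they are typically computed over (ranking, gold document) pairs, with illustrative data.

```python
# Recall@k: does the gold document appear in the top-k retrieved results?
# The rankings and gold labels below are illustrative.
def recall_at_k(ranked_ids: list[str], gold_id: str, k: int) -> bool:
    return gold_id in ranked_ids[:k]

queries = [(["d3", "d1", "d7"], "d3"), (["d2", "d9", "d4"], "d4")]
for k in (1, 5):
    hits = sum(recall_at_k(ranked, gold, k) for ranked, gold in queries)
    print(f"R@{k}: {hits / len(queries):.1%}")  # R@1: 50.0%, R@5: 100.0%
```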
arXiv Detail & Related papers (2023-02-16T01:58:10Z) - "There Is Not Enough Information": On the Effects of Explanations on
Perceptions of Informational Fairness and Trustworthiness in Automated
Decision-Making [0.0]
Automated decision systems (ADS) are increasingly used for consequential decision-making.
We conduct a human subject study to assess people's perceptions of informational fairness.
A comprehensive analysis of qualitative feedback sheds light on people's desiderata for explanations.
arXiv Detail & Related papers (2022-05-11T20:06:03Z) - Proposing an Interactive Audit Pipeline for Visual Privacy Research [0.0]
We argue for the use of fairness audits to discover bias and fairness issues in systems, assert the need for a responsible human-over-the-loop, and reflect on the need to examine research agendas that may have harmful societal impacts.
Our goal is to provide a systematic analysis of the machine learning pipeline for visual privacy and bias issues.
arXiv Detail & Related papers (2021-11-07T01:51:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.