Generative AI Search Engines as Arbiters of Public Knowledge: An Audit of Bias and Authority
- URL: http://arxiv.org/abs/2405.14034v1
- Date: Wed, 22 May 2024 22:09:32 GMT
- Title: Generative AI Search Engines as Arbiters of Public Knowledge: An Audit of Bias and Authority
- Authors: Alice Li, Luanne Sinnamon
- Abstract summary: This paper reports on an audit study of generative AI systems (ChatGPT, Bing Chat, and Perplexity) that investigates how these new search engines construct responses.
We collected system responses using a set of 48 authentic queries for 4 topics over a 7-day period and analyzed the data using sentiment analysis, inductive coding, and source classification.
Results characterize system responses across these systems and provide evidence of sentiment bias based on the queries and topics, as well as commercial and geographic bias in sources.
- Score: 2.860575804107195
- Abstract: This paper reports on an audit study of generative AI systems (ChatGPT, Bing Chat, and Perplexity) that investigates how these new search engines construct responses and establish authority for topics of public importance. We collected system responses using a set of 48 authentic queries for 4 topics over a 7-day period and analyzed the data using sentiment analysis, inductive coding, and source classification. Results characterize system responses across these systems and provide evidence of sentiment bias based on the queries and topics, as well as commercial and geographic bias in sources. The quality of sources used to support claims is uneven, relying heavily on News and Media, Business, and Digital Media websites. Implications for system users emphasize the need to critically examine generative AI system outputs when making decisions related to public interest and personal well-being.
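To make the analysis stage concrete, here is a minimal sketch of the sentiment-scoring and source-classification steps, assuming responses have already been collected. The sample responses, topic label, and domain-to-category map are illustrative assumptions, and NLTK's VADER stands in for whatever sentiment model the study actually used.

```python
# Minimal sketch of the audit's analysis stage, under stated assumptions:
# the responses, topic labels, and source taxonomy below are illustrative,
# and VADER is a stand-in for the study's actual sentiment method.
from urllib.parse import urlparse

import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)

# Hypothetical collected data: (system, topic, response text, cited URLs).
responses = [
    ("ChatGPT", "topic-1", "The evidence strongly supports this practice.", []),
    ("Perplexity", "topic-1", "Most sources report positive outcomes.",
     ["https://www.cdc.gov/page", "https://example-news.com/story"]),
]

# Illustrative domain -> category map standing in for a real source taxonomy.
SOURCE_CATEGORIES = {
    "www.cdc.gov": "Government",
    "example-news.com": "News and Media",
}

sia = SentimentIntensityAnalyzer()
for system, topic, text, urls in responses:
    sentiment = sia.polarity_scores(text)["compound"]  # -1 (neg) .. +1 (pos)
    categories = [SOURCE_CATEGORIES.get(urlparse(u).netloc, "Unclassified")
                  for u in urls]
    print(system, topic, round(sentiment, 3), categories)
```

Aggregating the compound scores per query and topic is what would surface the kind of sentiment bias reported above; tallying category labels per system would surface the commercial and geographic skew in sources.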
Related papers
- Do RAG Systems Cover What Matters? Evaluating and Optimizing Responses with Sub-Question Coverage [74.70255719194819]
We introduce a novel framework based on sub-question coverage, which measures how well a RAG system addresses different facets of a question.
We use this framework to evaluate three commercial generative answer engines: You.com, Perplexity AI, and Bing Chat.
We find that while all answer engines cover core sub-questions more often than background or follow-up ones, they still miss around 50% of core sub-questions.
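As a rough illustration of what a coverage score can look like (the paper's exact definition may differ), the sketch below computes the fraction of sub-questions an answer addresses; the sub-questions and "addressed" judgments are hypothetical.

```python
# Illustrative sub-question coverage: fraction of sub-questions addressed.
# The data and the "addressed" judgments are hypothetical, not the paper's
# actual annotation pipeline.
def coverage(addressed: set[str], sub_questions: list[str]) -> float:
    """Fraction of sub-questions that the answer addresses."""
    if not sub_questions:
        return 0.0
    return sum(q in addressed for q in sub_questions) / len(sub_questions)

core = ["What causes X?", "How is X treated?"]
background = ["When was X first described?"]
addressed = {"What causes X?"}  # e.g., from a human or LLM judge

print(coverage(addressed, core))        # 0.5 -> half the core facets missed
print(coverage(addressed, background))  # 0.0
```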
arXiv Detail & Related papers (2024-10-20T22:59:34Z) - Web Retrieval Agents for Evidence-Based Misinformation Detection [12.807650005708911]
This paper develops an agent-based automated fact-checking approach for detecting misinformation.
We demonstrate that combining a powerful LLM agent, which does not have access to the internet for searches, with an online web search agent yields better results than when each tool is used independently.
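A hedged sketch of that division of labor follows; both helpers are hypothetical placeholders rather than the paper's implementation.

```python
# Sketch of the described setup: a web-search agent fetches evidence and an
# offline LLM (no internet access) judges the claim against it. Both
# functions are hypothetical placeholders, not the paper's implementation.
def web_search_agent(claim: str) -> list[str]:
    """Placeholder: would query a search engine and return evidence snippets."""
    return ["Health agencies warn that ingesting bleach is dangerous."]

def offline_llm_verdict(claim: str, evidence: list[str]) -> str:
    """Placeholder: would prompt an LLM to weigh the claim against evidence."""
    prompt = f"Claim: {claim}\nEvidence:\n" + "\n".join(evidence)
    # A real implementation would send `prompt` to a model; this is a stub.
    return "refuted" if evidence else "not enough information"

claim = "Drinking bleach cures the flu."
print(offline_llm_verdict(claim, web_search_agent(claim)))  # refuted
```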
arXiv Detail & Related papers (2024-08-15T15:13:16Z) - Measuring and Addressing Indexical Bias in Information Retrieval [69.7897730778898]
The PAIR framework supports automatic bias audits for ranked documents or entire IR systems.
After introducing DUO, our automatic bias metric, we run an extensive evaluation of 8 IR systems on a new corpus of 32k synthetic and 4.7k natural documents.
A human behavioral study validates our approach, showing that our bias metric can help predict when and how indexical bias will shift a reader's opinion.
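The DUO metric itself is not reproduced here; the toy rank-weighted balance score below only conveys the intuition that documents near the top of a ranking weigh more when measuring which side of a debated topic a result list favors. The stance labels are assumed inputs.

```python
# Toy rank-weighted balance score for a ranked list; NOT the paper's DUO
# metric. Each stance is +1 or -1 for the side of a debated topic that a
# document favors; higher ranks get larger weights.
def rank_weighted_balance(stances: list[int]) -> float:
    """Returns ~0 for a balanced ranking; larger magnitude = stronger skew."""
    weights = [1.0 / (rank + 1) for rank in range(len(stances))]  # top-heavy
    return sum(w * s for w, s in zip(weights, stances)) / sum(weights)

print(rank_weighted_balance([+1, +1, -1, -1]))  # ~0.44: one side dominates the top
print(rank_weighted_balance([+1, -1, +1, -1]))  # ~0.28: interleaving reduces skew
```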
arXiv Detail & Related papers (2024-06-06T17:42:37Z) - ExpertQA: Expert-Curated Questions and Attributed Answers [51.68314045809179]
We collect expert-curated questions from 484 participants across 32 fields of study, and then ask the same experts to evaluate generated responses to their own questions.
We conduct human evaluation of responses from a few representative systems along various axes of attribution and factuality.
The output of our analysis is ExpertQA, a high-quality long-form QA dataset with 2177 questions spanning 32 fields, along with verified answers and attributions for claims in the answers.
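The record below only illustrates that shape; every field name is an assumption for exposition, not the dataset's actual schema.

```python
# Hypothetical shape of one ExpertQA-style record: a question from an expert's
# field, a long-form answer, and per-claim attributions. Field names are
# assumptions, not the released dataset's schema.
record = {
    "field": "Medicine",
    "question": "What are the long-term effects of ...?",
    "answer": "Long-form answer text ...",
    "claims": [
        {
            "text": "One claim made in the answer.",
            "citations": ["https://doi.org/..."],  # attribution for the claim
            "expert_verified": True,
        },
    ],
}
print(record["field"], len(record["claims"]))
```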
arXiv Detail & Related papers (2023-09-14T16:54:34Z) - Evaluating Verifiability in Generative Search Engines [70.59477647085387]
Generative search engines directly generate responses to user queries, along with in-line citations.
We conduct human evaluation to audit four popular generative search engines.
We find that responses from existing generative search engines are fluent and appear informative, but frequently contain unsupported statements and inaccurate citations.
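Audits of this kind are often summarized with citation recall (over generated statements) and citation precision (over citations); the sketch below shows the arithmetic, with hypothetical per-item judgments standing in for human annotations.

```python
# Citation recall/precision arithmetic over hypothetical support judgments;
# in an actual audit such judgments come from human evaluators.
def citation_recall(statement_supported: list[bool]) -> float:
    """Fraction of generated statements fully supported by their citations."""
    if not statement_supported:
        return 0.0
    return sum(statement_supported) / len(statement_supported)

def citation_precision(citation_supports: list[bool]) -> float:
    """Fraction of citations that actually support their statement."""
    if not citation_supports:
        return 0.0
    return sum(citation_supports) / len(citation_supports)

# One audited response: 4 statements, 3 citations (illustrative judgments).
print(citation_recall([True, False, True, False]))  # 0.5
print(citation_precision([True, True, False]))      # ~0.67
```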
arXiv Detail & Related papers (2023-04-19T17:56:12Z) - Towards Corpus-Scale Discovery of Selection Biases in News Coverage:
Comparing What Sources Say About Entities as a Start [65.28355014154549]
This paper investigates the challenges of building scalable NLP systems for discovering patterns of media selection biases directly from news content in massive-scale news corpora.
We show the capabilities of the framework through a case study on NELA-2020, a corpus of 1.8M news articles in English from 519 news sources worldwide.
arXiv Detail & Related papers (2023-04-06T23:36:45Z) - Search-Engine-augmented Dialogue Response Generation with Cheaply
Supervised Query Production [98.98161995555485]
We propose a dialogue model that can access the vast and dynamic information from any search engine for response generation.
As the core module, a query producer is used to generate queries from a dialogue context to interact with a search engine.
Experiments show that our query producer can achieve R@1 and R@5 rates of 62.4% and 74.8% for retrieving gold knowledge.
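R@1 and R@5 are standard recall-at-k figures; here is a minimal sketch of how they are typically computed over (ranking, gold document) pairs, with illustrative data.

```python
# Recall@k: does the gold document appear in the top-k retrieved results?
# The rankings and gold labels below are illustrative.
def recall_at_k(ranked_ids: list[str], gold_id: str, k: int) -> bool:
    return gold_id in ranked_ids[:k]

queries = [(["d3", "d1", "d7"], "d3"), (["d2", "d9", "d4"], "d4")]
for k in (1, 5):
    hits = sum(recall_at_k(ranked, gold, k) for ranked, gold in queries)
    print(f"R@{k}: {hits / len(queries):.1%}")  # R@1: 50.0%, R@5: 100.0%
```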
arXiv Detail & Related papers (2023-02-16T01:58:10Z) - "There Is Not Enough Information": On the Effects of Explanations on
Perceptions of Informational Fairness and Trustworthiness in Automated
Decision-Making [0.0]
Automated decision systems (ADS) are increasingly used for consequential decision-making.
We conduct a human subject study to assess people's perceptions of informational fairness.
A comprehensive analysis of qualitative feedback sheds light on people's desiderata for explanations.
arXiv Detail & Related papers (2022-05-11T20:06:03Z) - Proposing an Interactive Audit Pipeline for Visual Privacy Research [0.0]
We argue for the use of fairness audits to discover bias and fairness issues in systems, assert the need for a responsible human-over-the-loop, and reflect on the need to examine research agendas that may have harmful societal impacts.
Our goal is to provide a systematic analysis of the machine learning pipeline for visual privacy and bias issues.
arXiv Detail & Related papers (2021-11-07T01:51:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.