Investigating Bias in Political Search Query Suggestions by Relative Comparison with LLMs
- URL: http://arxiv.org/abs/2410.23879v1
- Date: Thu, 31 Oct 2024 12:40:38 GMT
- Title: Investigating Bias in Political Search Query Suggestions by Relative Comparison with LLMs
- Authors: Fabian Haak, Björn Engelmann, Christin Katharina Kreutz, Philipp Schaer
- Abstract summary: Bias in search query suggestions can lead to exposure to biased search results and can impact opinion formation.
We use a multi-step approach to identify and quantify bias in English search query suggestions.
We apply our approach to the U.S. political news domain and compare bias in Google and Bing.
- Score: 1.5356574175312299
- Abstract: Search query suggestions affect users' interactions with search engines, which then influences the information they encounter. Thus, bias in search query suggestions can lead to exposure to biased search results and can impact opinion formation. This is especially critical in the political domain. Detecting and quantifying bias in web search engines is difficult due to its topic dependency, complexity, and subjectivity. The lack of context and phrasality of query suggestions emphasizes this problem. In a multi-step approach, we combine the benefits of large language models, pairwise comparison, and Elo-based scoring to identify and quantify bias in English search query suggestions. We apply our approach to the U.S. political news domain and compare bias in Google and Bing.
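The abstract's combination of pairwise LLM comparison and Elo-based scoring can be illustrated with a minimal sketch. Here the `judge` callable stands in for the pairwise LLM call, and the starting rating of 1000 and K-factor of 32 are conventional assumptions, not values taken from the paper:

```python
import itertools

def expected_score(r_a, r_b):
    """Expected win probability of item A against item B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update(r_a, r_b, outcome, k=32):
    """Update two ratings after one pairwise comparison.
    outcome: 1.0 if A is judged more biased, 0.0 if B is, 0.5 for a tie."""
    e_a = expected_score(r_a, r_b)
    return r_a + k * (outcome - e_a), r_b + k * ((1 - outcome) - (1 - e_a))

def rank_suggestions(suggestions, judge, start=1000.0, k=32):
    """Run the pairwise judge over all suggestion pairs and return Elo scores.
    judge(a, b) -> 1.0 / 0.5 / 0.0 is a stand-in for the LLM comparison."""
    ratings = {s: start for s in suggestions}
    for a, b in itertools.combinations(suggestions, 2):
        outcome = judge(a, b)
        ratings[a], ratings[b] = update(ratings[a], ratings[b], outcome, k)
    return ratings
```

The appeal of the Elo formulation is that only relative judgments are needed: each comparison is local and subjective, but repeated comparisons converge to a global ordering.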
Related papers
- Auditing Google's Search Algorithm: Measuring News Diversity Across Brazil, the UK, and the US [0.0]
This study examines the influence of Google's search algorithm on news diversity by analyzing search results in Brazil, the UK, and the US.
It explores how Google's system preferentially favors a limited number of news outlets.
Findings indicate a slight leftward bias in search outcomes and a preference for popular, often national outlets.
arXiv Detail & Related papers (2024-10-31T11:49:16Z) - Overview of PerpectiveArg2024: The First Shared Task on Perspective Argument Retrieval [56.66761232081188]
We present a novel dataset covering demographic and socio-cultural (socio) variables, such as age, gender, and political attitude, representing minority and majority groups in society.
We find substantial challenges in incorporating perspectivism, especially when aiming for personalization based solely on the text of arguments without explicitly providing socio profiles.
While we bootstrap perspective argument retrieval, further research is essential to optimize retrieval systems to facilitate personalization and reduce polarization.
arXiv Detail & Related papers (2024-07-29T03:14:57Z) - Fairness and Bias in Multimodal AI: A Survey [0.20971479389679337]
The importance of addressing fairness and bias in artificial intelligence (AI) systems cannot be over-emphasized.
We fill a gap regarding the relatively minimal study of fairness and bias in Large Multimodal Models (LMMs) compared to Large Language Models (LLMs).
We provide 50 examples of datasets and models related to both types of AI along with the challenges of bias affecting them.
arXiv Detail & Related papers (2024-06-27T11:26:17Z) - CLARINET: Augmenting Language Models to Ask Clarification Questions for Retrieval [52.134133938779776]
We present CLARINET, a system that asks informative clarification questions by choosing questions whose answers would maximize certainty in the correct candidate.
Our approach works by augmenting a large language model (LLM) to condition on a retrieval distribution, finetuning end-to-end to generate the question that would have maximized the rank of the true candidate at each turn.
arXiv Detail & Related papers (2024-04-28T18:21:31Z) - What Evidence Do Language Models Find Convincing? [94.90663008214918]
We build a dataset that pairs controversial queries with a series of real-world evidence documents that contain different facts.
We use this dataset to perform sensitivity and counterfactual analyses to explore which text features most affect LLM predictions.
Overall, we find that current models rely heavily on the relevance of a website to the query, while largely ignoring stylistic features that humans find important.
arXiv Detail & Related papers (2024-02-19T02:15:34Z) - Generative Echo Chamber? Effects of LLM-Powered Search Systems on Diverse Information Seeking [49.02867094432589]
Large language models (LLMs) powered conversational search systems have already been used by hundreds of millions of people.
We investigate whether and how LLMs with opinion biases that either reinforce or challenge the user's view change the effect.
arXiv Detail & Related papers (2024-02-08T18:14:33Z) - Navigating the Thin Line: Examining User Behavior in Search to Detect Engagement and Backfire Effects [0.0]
We investigate whether different levels of bias metrics and search results presentation can affect the stance diversity consumption and search behavior of opinionated users.
Our results show that exposing participants to (counter-attitudinally) biased search results increases their consumption of attitude-opposing content.
We also found that bias was associated with a trend toward overall fewer interactions within the search page.
arXiv Detail & Related papers (2024-01-20T10:28:25Z) - Query Expansion Using Contextual Clue Sampling with Language Models [69.51976926838232]
We propose a combination of an effective filtering strategy and fusion of the retrieved documents based on the generation probability of each context.
Our lexical-matching-based approach achieves similar top-5/top-20 retrieval accuracy and higher top-100 accuracy compared with the well-established dense retrieval model DPR.
For end-to-end QA, the reader model also benefits from our method and achieves the highest Exact-Match score against several competitive baselines.
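The fusion step described above, which weights retrieved documents by the generation probability of the context that produced them, could be sketched roughly as follows. The run format and the simple sum-of-probabilities rule are assumptions for illustration, not the paper's exact method:

```python
from collections import defaultdict

def fuse(runs):
    """Fuse ranked lists from multiple sampled contexts.

    runs: list of (generation_prob, ranked_doc_ids) pairs, one per context.
    A document's score is the sum of the generation probabilities of the
    contexts that retrieved it, so documents supported by several likely
    contexts rise to the top.
    """
    scores = defaultdict(float)
    for prob, docs in runs:
        for doc in docs:
            scores[doc] += prob
    return sorted(scores, key=scores.get, reverse=True)
```

For example, a document retrieved under two contexts with probabilities 0.6 and 0.4 would outrank one retrieved under only the 0.6-probability context.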
arXiv Detail & Related papers (2022-10-13T15:18:04Z) - Analysing Mixed Initiatives and Search Strategies during Conversational Search [31.63357369175702]
We present a model for conversational search -- from which we instantiate different observed conversational search strategies, where the agent elicits: (i) Feedback-First, or (ii) Feedback-After.
Our analysis reveals that there is no superior or dominant combination; instead, query clarifications are better when asked first, while query suggestions are better when asked after presenting results.
arXiv Detail & Related papers (2021-09-13T13:30:10Z) - The Matter of Chance: Auditing Web Search Results Related to the 2020 U.S. Presidential Primary Elections Across Six Search Engines [68.8204255655161]
We look at the text search results for "us elections", "donald trump", "joe biden" and "bernie sanders" queries on Google, Baidu, Bing, DuckDuckGo, Yahoo, and Yandex.
Our findings indicate substantial differences in the search results between search engines and multiple discrepancies within the results generated for different agents.
arXiv Detail & Related papers (2021-05-03T11:18:19Z) - Discovering and Categorising Language Biases in Reddit [5.670038395203354]
This paper proposes a data-driven approach to automatically discover language biases encoded in the vocabulary of online discourse communities on Reddit.
We use word embeddings to transform text into high-dimensional dense vectors and capture semantic relations between words.
We successfully discover gender bias, religion bias, and ethnic bias in different Reddit communities.
arXiv Detail & Related papers (2020-08-06T16:42:10Z)
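An embedding-based bias probe of the kind the Reddit paper describes can be sketched by comparing a word vector's mean cosine similarity to two attribute sets (a WEAT-style association test, which may differ from the paper's exact formulation; the tiny vectors below are toy values):

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def association(word_vec, attr_a, attr_b):
    """Mean similarity to attribute set A minus mean similarity to set B.
    A positive result means the word leans toward A; negative, toward B."""
    sim_a = sum(cosine(word_vec, v) for v in attr_a) / len(attr_a)
    sim_b = sum(cosine(word_vec, v) for v in attr_b) / len(attr_b)
    return sim_a - sim_b
```

In practice the vectors would come from embeddings trained on a specific community's posts, so the sign and magnitude of the association reflect that community's discourse rather than language in general.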
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.