Examining Racial Stereotypes in YouTube Autocomplete Suggestions
- URL: http://arxiv.org/abs/2410.03102v1
- Date: Fri, 4 Oct 2024 02:53:25 GMT
- Title: Examining Racial Stereotypes in YouTube Autocomplete Suggestions
- Authors: Eunbin Ha, Haein Kong, Shagun Jhaver
- Abstract summary: We examine how YouTube autocompletes serve as an information source for users exploring information about race.
Using critical discourse analysis, we identify five major sociocultural contexts in which racial biases manifest.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Autocomplete is a popular search feature that predicts queries based on user input and guides users to a set of potentially relevant suggestions. In this study, we examine how YouTube autocompletes serve as an information source for users exploring information about race. We perform an algorithm output audit of autocomplete suggestions for input queries about four racial groups and examine the stereotypes they embody. Using critical discourse analysis, we identify five major sociocultural contexts in which racial biases manifest -- Appearance, Ability, Culture, Social Equity, and Manner. Our results show evidence of aggregated discrimination and interracial tensions in the autocompletes we collected and highlight their potential risks in othering racial minorities. We call for urgent innovations in content moderation policy design and enforcement to address these biases in search outputs.
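The abstract describes an algorithm output audit: templated input queries are expanded with subject terms, submitted to the autocomplete system, and the returned suggestions are collected for discourse analysis. A minimal sketch of that workflow is below; the endpoint URL, query templates, and placeholder subject terms are assumptions for illustration, not the authors' actual protocol or study terms.

```python
# Sketch of an autocomplete output audit (assumed workflow, not the
# paper's exact method): expand query templates with subject terms,
# fetch suggestions, and log them for later qualitative analysis.
import json
import urllib.parse
import urllib.request

# Unofficial suggestion endpoint commonly used to retrieve YouTube
# autocompletes; this URL is an assumption and may change or be blocked.
SUGGEST_URL = "https://suggestqueries.google.com/complete/search"

def build_queries(subjects, templates):
    """Expand each template with each subject term."""
    return [t.format(subject=s) for s in subjects for t in templates]

def parse_suggestions(payload):
    """Extract suggestion strings from the endpoint's JSON response.

    With client=firefox the response is a JSON array of the form
    [query, [suggestion, ...], ...].
    """
    data = json.loads(payload)
    return list(data[1])

def fetch_suggestions(query):
    """Fetch live suggestions for one query (network call)."""
    params = urllib.parse.urlencode(
        {"client": "firefox", "ds": "yt", "q": query}
    )
    with urllib.request.urlopen(f"{SUGGEST_URL}?{params}") as resp:
        return parse_suggestions(resp.read().decode("utf-8"))

# Placeholder audit plan (neutral stand-in subjects, not the study's terms).
queries = build_queries(
    subjects=["group a", "group b"],
    templates=["why do {subject}", "{subject} are"],
)
sample_payload = '["why do group a", ["why do group a dance"]]'
print(queries)
print(parse_suggestions(sample_payload))
```

In a real audit the collected suggestions would then be coded by human analysts, as the paper does with critical discourse analysis.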
Related papers
- A comparison of online search engine autocompletion in Google and Baidu
We study the characteristics of search auto-completions in two different linguistic and cultural contexts: Baidu and Google.
We find differences between the two search engines in the way they suppress or modify original queries.
Our study highlights the need for more refined, culturally sensitive moderation strategies in current language technologies.
(arXiv, 2024-05-03)
- Sequential Decision-Making for Inline Text Autocomplete
We study the problem of improving inline autocomplete suggestions in text entry systems.
We use reinforcement learning to learn suggestion policies through repeated interactions with a target user.
(arXiv, 2024-03-21)
- Are Personalized Stochastic Parrots More Dangerous? Evaluating Persona Biases in Dialogue Systems
We study "persona biases", which we define to be the sensitivity of dialogue models' harmful behaviors contingent upon the personas they adopt.
We categorize persona biases into biases in harmful expression and harmful agreement, and establish a comprehensive evaluation framework to measure persona biases in five aspects: Offensiveness, Toxic Continuation, Regard, Stereotype Agreement, and Toxic Agreement.
(arXiv, 2023-10-08)
- Seasonality Based Reranking of E-commerce Autocomplete Using Natural Language Queries
Query autocomplete (QAC), also known as typeahead, suggests a list of complete queries as the user types a prefix in the search box.
One of the goals of typeahead is to suggest queries that are seasonally relevant to users.
We propose a neural network based natural language processing (NLP) algorithm to incorporate seasonality as a signal.
(arXiv, 2023-08-03)
- Evaluating Verifiability in Generative Search Engines
Generative search engines directly generate responses to user queries, along with in-line citations.
We conduct human evaluation to audit four popular generative search engines.
We find that responses from existing generative search engines are fluent and appear informative, but frequently contain unsupported statements and inaccurate citations.
(arXiv, 2023-04-19)
- The Matter of Chance: Auditing Web Search Results Related to the 2020 U.S. Presidential Primary Elections Across Six Search Engines
We look at the text search results for "us elections", "donald trump", "joe biden" and "bernie sanders" queries on Google, Baidu, Bing, DuckDuckGo, Yahoo, and Yandex.
Our findings indicate substantial differences in the search results between search engines and multiple discrepancies within the results generated for different agents.
(arXiv, 2021-05-03)
- WordBias: An Interactive Visual Tool for Discovering Intersectional Biases Encoded in Word Embeddings
We present WordBias, an interactive visual tool designed to explore biases against intersectional groups encoded in word embeddings.
Given a pretrained static word embedding, WordBias computes each word's association with different subgroups defined by race, age, etc.
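The WordBias summary does not spell out its scoring, but word-group association is commonly computed as similarity to group attribute words in embedding space. Below is a hedged, WEAT-style sketch using mean cosine similarity and tiny made-up 3-d vectors; the attribute sets, target word, and all vector values are illustrative assumptions.

```python
# WEAT-style association sketch (an assumed scoring, not necessarily
# WordBias's exact method): a word's association with a group is its
# mean cosine similarity to that group's attribute words.
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def association(word_vec, group_words, embeddings):
    """Mean cosine similarity between a word and a group's attribute words."""
    sims = [cosine(word_vec, embeddings[w]) for w in group_words]
    return sum(sims) / len(sims)

# Toy embeddings, invented purely for illustration.
emb = {
    "she": [0.9, 0.1, 0.0],
    "her": [0.8, 0.2, 0.0],
    "he": [0.1, 0.9, 0.0],
    "him": [0.2, 0.8, 0.0],
    "nurse": [0.7, 0.3, 0.1],
}
score_f = association(emb["nurse"], ["she", "her"], emb)
score_m = association(emb["nurse"], ["he", "him"], emb)
# A positive difference means the word leans toward the first group.
print(round(score_f - score_m, 3))
```

The same score can be computed along any axis (race, age, religion) by swapping in different attribute word sets, which is the intersectional exploration WordBias supports interactively.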
(arXiv, 2021-03-05)
- One Label, One Billion Faces: Usage and Consistency of Racial Categories in Computer Vision
We study the racial system encoded by computer vision datasets supplying categorical race labels for face images.
We find that each dataset encodes a substantially distinct racial system, despite nominally equivalent racial categories.
We find evidence that racial categories encode stereotypes, and exclude ethnic groups from categories on the basis of nonconformity to stereotypes.
(arXiv, 2021-02-03)
- What Makes a Good Summary? Reconsidering the Focus of Automatic Summarization
We find that the current focus of the field does not fully align with participants' wishes.
Based on our findings, we argue that it is important to adopt a broader perspective on automatic summarization.
(arXiv, 2020-12-14)
- On the Social and Technical Challenges of Web Search Autosuggestion Moderation
Autosuggestions are typically generated by machine learning (ML) systems trained on a corpus of search logs and document representations.
While current search engines have become increasingly proficient at suppressing problematic suggestions, persistent issues remain.
We discuss several dimensions of problematic suggestions, difficult issues along the pipeline, and why our discussion applies to the increasing number of applications beyond web search.
(arXiv, 2020-07-09)
- Using Noisy Self-Reports to Predict Twitter User Demographics
We present a method to identify self-reports of race and ethnicity from Twitter profile descriptions.
Despite errors inherent in automated supervision, we produce models with good performance when measured on gold standard self-report survey data.
(arXiv, 2020-05-01)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.