Multilingual Content Moderation: A Case Study on Reddit
- URL: http://arxiv.org/abs/2302.09618v1
- Date: Sun, 19 Feb 2023 16:36:33 GMT
- Title: Multilingual Content Moderation: A Case Study on Reddit
- Authors: Meng Ye, Karan Sikka, Katherine Atwell, Sabit Hassan, Ajay Divakaran,
Malihe Alikhani
- Abstract summary: We propose to study the challenges of content moderation by introducing a multilingual dataset of 1.8 million Reddit comments.
We perform extensive experimental analysis to highlight the underlying challenges and suggest related research problems.
Our dataset and analysis can help better prepare for the challenges and opportunities of auto moderation.
- Score: 23.949429463013796
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Content moderation is the process of flagging content based on pre-defined
platform rules. There has been a growing need for AI moderators to safeguard
users as well as protect the mental health of human moderators from traumatic
content. While prior works have focused on identifying hateful/offensive
language, they are not adequate for meeting the challenges of content
moderation since 1) moderation decisions are based on violation of rules, which
subsumes detection of offensive speech, and 2) such rules often differ across
communities which entails an adaptive solution. We propose to study the
challenges of content moderation by introducing a multilingual dataset of 1.8
million Reddit comments spanning 56 subreddits in English, German, Spanish and
French. We perform extensive experimental analysis to highlight the underlying
challenges and suggest related research problems such as cross-lingual
transfer, learning under label noise (human biases), transfer of moderation
models, and predicting the violated rule. Our dataset and analysis can help
better prepare for the challenges and opportunities of auto moderation.
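
Several of the research problems the abstract names (cross-lingual transfer, predicting the violated rule) reduce to fine-tuning a multilingual encoder. Below is a minimal sketch of the zero-shot cross-lingual setup, assuming a binary violates/complies label scheme and the off-the-shelf xlm-roberta-base checkpoint; it is an illustration, not the paper's released code.

```python
# Sketch of zero-shot cross-lingual moderation: fine-tune on English labels,
# then score German/Spanish/French comments with no further supervision.
# Model choice and the binary label scheme are assumptions, not the paper's.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "xlm-roberta-base"  # multilingual encoder covering EN/DE/ES/FR
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)
model.eval()

def violation_prob(comments: list[str]) -> torch.Tensor:
    """P(comment violates a subreddit rule), language-agnostic by design."""
    batch = tokenizer(comments, padding=True, truncation=True,
                      return_tensors="pt")
    with torch.no_grad():
        logits = model(**batch).logits
    return logits.softmax(dim=-1)[:, 1]

# After fine-tuning on English comments only, the same call evaluates
# transfer to the other three languages:
print(violation_prob(["You are welcome here.", "Das ist in Ordnung."]))
```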
Related papers
- Venire: A Machine Learning-Guided Panel Review System for Community Content Moderation [17.673993032146527]
We develop Venire, an ML-backed system for panel review on Reddit.
Venire uses a machine learning model trained on log data to identify the cases where moderators are most likely to disagree.
We show that Venire is able to improve decision consistency and surface latent disagreements.
arXiv Detail & Related papers (2024-10-30T20:39:34Z) - Explainability and Hate Speech: Structured Explanations Make Social Media Moderators Faster [72.84926097773578]
- Explainability and Hate Speech: Structured Explanations Make Social Media Moderators Faster [72.84926097773578]
We investigate the effect of explanations on the speed of real-world moderators.
Our experiments show that while generic explanations do not affect moderators' speed and are often ignored, structured explanations lower their decision-making time by 7.4%.
arXiv Detail & Related papers (2024-06-06T14:23:10Z) - Algorithmic Arbitrariness in Content Moderation [1.4849645397321183]
We show how content moderation tools can arbitrarily classify samples as toxic.
We discuss these findings in terms of human rights set out by the International Covenant on Civil and Political Rights (ICCPR).
Our study underscores the need to identify and increase the transparency of arbitrariness in content moderation applications.
arXiv Detail & Related papers (2024-02-26T19:27:00Z) - SADAS: A Dialogue Assistant System Towards Remediating Norm Violations
- SADAS: A Dialogue Assistant System Towards Remediating Norm Violations in Bilingual Socio-Cultural Conversations [56.31816995795216]
Socially-Aware Dialogue Assistant System (SADAS) is designed to ensure that conversations unfold with respect and understanding.
Our system's novel architecture includes: (1) identifying the categories of norms present in the dialogue, (2) detecting potential norm violations, (3) evaluating the severity of these violations, and (4) implementing targeted remedies to rectify the breaches; a schematic sketch of this pipeline follows the entry.
arXiv Detail & Related papers (2024-01-29T08:54:21Z) - Content Moderation on Social Media in the EU: Insights From the DSA
- Content Moderation on Social Media in the EU: Insights From the DSA Transparency Database [0.0]
The Digital Services Act (DSA) requires large social media platforms in the EU to provide clear and specific information whenever they restrict access to certain content.
Statements of Reasons (SoRs) are collected in the DSA Transparency Database to ensure transparency and scrutiny of content moderation decisions.
We empirically analyze 156 million SoRs within an observation period of two months to provide an early look at content moderation decisions of social media platforms in the EU.
arXiv Detail & Related papers (2023-12-07T16:56:19Z) - Why Should This Article Be Deleted? Transparent Stance Detection in
Multilingual Wikipedia Editor Discussions [47.944081120226905]
We construct a novel dataset of Wikipedia editor discussions along with their reasoning in three languages.
The dataset contains the stances of the editors (keep, delete, merge, comment), along with the stated reason and the relevant content moderation policy for each edit decision.
We demonstrate that stance and corresponding reason (policy) can be predicted jointly with a high degree of accuracy, adding transparency to the decision-making process.
arXiv Detail & Related papers (2023-10-09T15:11:02Z) - User Attitudes to Content Moderation in Web Search [49.1574468325115]
- User Attitudes to Content Moderation in Web Search [49.1574468325115]
We examine the levels of support for different moderation practices applied to potentially misleading and/or potentially offensive content in web search.
We find that the most supported practice is informing users about potentially misleading or offensive content, and the least supported one is the complete removal of search results.
More conservative users and users with lower levels of trust in web search results are more likely to be against content moderation in web search.
arXiv Detail & Related papers (2023-10-05T10:57:15Z) - Analyzing Norm Violations in Live-Stream Chat [49.120561596550395]
We present the first NLP study dedicated to detecting norm violations in conversations on live-streaming platforms.
We define norm violation categories in live-stream chats and annotate 4,583 moderated comments from Twitch.
Our results show that appropriate contextual information can boost moderation performance by 35%.
arXiv Detail & Related papers (2023-05-18T05:58:27Z) - Countering Malicious Content Moderation Evasion in Online Social
- Countering Malicious Content Moderation Evasion in Online Social Networks: Simulation and Detection of Word Camouflage [64.78260098263489]
Twisting and camouflaging keywords are among the most common techniques used to evade platform content moderation systems.
This article contributes to countering malicious information by developing multilingual tools to simulate and detect new methods of content moderation evasion.
arXiv Detail & Related papers (2022-12-27T16:08:49Z) - Moderation Challenges in Voice-based Online Communities on Discord [24.417653462255448]
- Moderation Challenges in Voice-based Online Communities on Discord [24.417653462255448]
Findings suggest that the affordances of voice-based online communities change what it means to moderate content and interactions.
Voice introduces new ways to break rules that moderators of text-based communities find unfamiliar, such as disruptive noise and voice raiding.
Moderation strategies for these violations are limited and often based on hearsay and first impressions, resulting in problems ranging from unsuccessful moderation to false accusations.
arXiv Detail & Related papers (2021-01-13T18:43:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.