DeMod: A Holistic Tool with Explainable Detection and Personalized Modification for Toxicity Censorship
- URL: http://arxiv.org/abs/2411.01844v1
- Date: Mon, 04 Nov 2024 06:38:43 GMT
- Title: DeMod: A Holistic Tool with Explainable Detection and Personalized Modification for Toxicity Censorship
- Authors: Yaqiong Li, Peng Zhang, Hansu Gu, Tun Lu, Siyuan Qiao, Yubo Shu, Yiyang Shao, Ning Gu
- Abstract summary: This study investigates people's diverse needs in toxicity censorship and builds a ChatGPT-based censorship tool named DeMod accordingly.
DeMod is equipped with the features of explainable Detection and personalized Modification, providing fine-grained detection results, detailed explanations, and personalized modification suggestions.
The results suggest several strengths of DeMod, such as its rich functionality, accurate censorship, and ease of use.
- Score: 16.55929590079875
- Abstract: Although there are automated approaches and tools supporting toxicity censorship for social posts, most of them focus on detection. Toxicity censorship is a complex process in which detection is just an initial task, and a user can have further needs such as understanding the rationale and modifying the content. To address this problem, we conduct a needfinding study to investigate people's diverse needs in toxicity censorship and then build a ChatGPT-based censorship tool named DeMod accordingly. DeMod is equipped with the features of explainable Detection and personalized Modification, providing fine-grained detection results, detailed explanations, and personalized modification suggestions. We implemented the tool and recruited 35 Weibo users for an evaluation. The results suggest several strengths of DeMod, such as its rich functionality, accurate censorship, and ease of use. Based on these findings, we further propose several insights into the design of content censorship systems.
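As a rough illustration of how such a detect-explain-modify pipeline could be wired together (the paper does not publish its implementation; the OpenAI Python SDK, the model name, the prompt wording, and the JSON schema below are all illustrative assumptions), here is a minimal sketch:

```python
# Minimal sketch of a detect-explain-modify loop in the style of DeMod.
# Assumptions (not from the paper): the OpenAI Python SDK, the gpt-4o-mini
# model, and the prompt/JSON schema are illustrative choices.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = """You are a toxicity censorship assistant for social media posts.
Given a draft post, return JSON with:
  "toxic": true or false,
  "spans": a list of the specific toxic phrases (fine-grained detection),
  "explanation": a short rationale for each flagged span,
  "rewrite": a polite rewrite that preserves the author's intent.
Post: {post}"""

def censor(post: str) -> dict:
    """Run one detect-explain-modify pass over a draft post."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": PROMPT.format(post=post)}],
        response_format={"type": "json_object"},
    )
    return json.loads(resp.choices[0].message.content)

result = censor("Example draft post goes here.")
print(result["toxic"], result.get("rewrite"))
```

The `rewrite` field stands in for the personalized Modification step; a faithful reproduction would presumably also condition the rewrite on the user's own posting history, which this sketch omits.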
Related papers
- Understanding Routing-Induced Censorship Changes Globally [5.79183660559872]
We investigate the extent to which Equal-cost Multi-path (ECMP) routing is the cause of inconsistencies in censorship results.
We find ECMP routing significantly changes observed censorship across protocols, censor mechanisms, and in 17 countries.
Our work points to methods for improving future studies, reducing inconsistencies and increasing repeatability.
arXiv Detail & Related papers (2024-06-27T16:21:31Z)
- Explainability and Hate Speech: Structured Explanations Make Social Media Moderators Faster [72.84926097773578]
We investigate the effect of explanations on the speed of real-world moderators.
Our experiments show that while generic explanations do not affect their speed and are often ignored, structured explanations lower moderators' decision-making time by 7.4%.
arXiv Detail & Related papers (2024-06-06T14:23:10Z)
- User Attitudes to Content Moderation in Web Search [49.1574468325115]
We examine the levels of support for different moderation practices applied to potentially misleading and/or potentially offensive content in web search.
We find that the most supported practice is informing users about potentially misleading or offensive content, and the least supported one is the complete removal of search results.
More conservative users and users with lower levels of trust in web search results are more likely to be against content moderation in web search.
arXiv Detail & Related papers (2023-10-05T10:57:15Z)
- Depression detection in social media posts using affective and social norm features [84.12658971655253]
We propose a deep architecture for depression detection from social media posts.
We incorporate profanity and morality features of posts and words in our architecture using a late fusion scheme.
The inclusion of the proposed features yields state-of-the-art results in both settings.
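As a generic sketch of what a late fusion scheme looks like (the layer sizes, dimensions, and two-branch layout below are illustrative assumptions, not the paper's architecture): separate branches encode the text and the auxiliary profanity/morality features, and their representations are fused only at the classification head.

```python
# Generic late-fusion sketch (illustrative only; not the paper's architecture).
# Text embeddings and auxiliary profanity/morality features pass through
# separate branches whose outputs are concatenated just before the classifier.
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    def __init__(self, text_dim=768, aux_dim=16, hidden=64):
        super().__init__()
        self.text_branch = nn.Sequential(nn.Linear(text_dim, hidden), nn.ReLU())
        self.aux_branch = nn.Sequential(nn.Linear(aux_dim, hidden), nn.ReLU())
        self.head = nn.Linear(2 * hidden, 2)  # fuse late: concat, then classify

    def forward(self, text_emb, aux_feats):
        fused = torch.cat(
            [self.text_branch(text_emb), self.aux_branch(aux_feats)], dim=-1
        )
        return self.head(fused)

model = LateFusionClassifier()
logits = model(torch.randn(4, 768), torch.randn(4, 16))  # batch of 4 posts
```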
arXiv Detail & Related papers (2023-03-24T21:26:27Z)
- Verifying the Robustness of Automatic Credibility Assessment [79.08422736721764]
Text classification methods have been widely investigated as a way to detect content of low credibility.
In some cases insignificant changes in input text can mislead the models.
We introduce BODEGA: a benchmark for testing both victim models and attack methods on misinformation detection tasks.
arXiv Detail & Related papers (2023-03-14T16:11:47Z)
- Multilingual Content Moderation: A Case Study on Reddit [23.949429463013796]
We propose to study the challenges of content moderation by introducing a multilingual dataset of 1.8 million Reddit comments.
We perform extensive experimental analysis to highlight the underlying challenges and suggest related research problems.
Our dataset and analysis can help better prepare for the challenges and opportunities of auto moderation.
arXiv Detail & Related papers (2023-02-19T16:36:33Z)
- Augmenting Rule-based DNS Censorship Detection at Scale with Machine Learning [38.00013408742201]
Censorship of the domain name system (DNS) is a key mechanism used across different countries.
In this paper, we explore how machine learning (ML) models can help streamline the detection process.
We find that unsupervised models, trained solely on uncensored instances, can identify new instances and variations of censorship missed by existing probes.
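A minimal sketch of this kind of one-class setup (the feature set, the model choice, and all numbers below are illustrative assumptions, not the paper's method): an anomaly detector is fit only on known-uncensored responses, so anything it flags as an outlier becomes a candidate censorship event.

```python
# Illustrative one-class setup: train only on known-uncensored DNS responses,
# then flag anomalous responses as candidate censorship. Feature names and
# model choice are assumptions, not taken from the paper.
import numpy as np
from sklearn.ensemble import IsolationForest

# Toy features per DNS response: [response_time_ms, answer_count, ttl].
uncensored = np.random.default_rng(0).normal([50, 2, 300], [10, 1, 60], (500, 3))
model = IsolationForest(random_state=0).fit(uncensored)

new_responses = np.array([[48, 2, 310],   # looks like normal resolution
                          [5, 1, 1]])     # fast, short-TTL injected answer
print(model.predict(new_responses))       # 1 = inlier, -1 = candidate censorship
```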
arXiv Detail & Related papers (2023-02-03T23:36:30Z)
- Countering Malicious Content Moderation Evasion in Online Social Networks: Simulation and Detection of Word Camouflage [64.78260098263489]
Keyword twisting and camouflage are among the most widely used techniques to evade platform content moderation systems.
This article contributes significantly to countering malicious information by developing multilingual tools to simulate and detect new methods of content moderation evasion.
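A toy sketch of what simulating word camouflage can look like (the substitution table and function below are illustrative assumptions, not the article's tools): characters are probabilistically replaced with visually similar symbols, producing evasion variants that a detector can then be trained or tested against.

```python
# Toy simulation of word camouflage via character substitution (leetspeak).
# Illustrative only: the article's tools and substitution tables are not shown here.
import random

LEET = {"a": ["4", "@"], "e": ["3"], "i": ["1", "!"], "o": ["0"], "s": ["5", "$"]}

def camouflage(word: str, p: float = 0.7, seed: int = 0) -> str:
    """Replace each substitutable character with probability p."""
    rng = random.Random(seed)
    return "".join(
        rng.choice(LEET[c.lower()]) if c.lower() in LEET and rng.random() < p else c
        for c in word
    )

print(camouflage("censorship"))  # e.g. "c3n50r5h1p"
```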
arXiv Detail & Related papers (2022-12-27T16:08:49Z)
- How We Express Ourselves Freely: Censorship, Self-censorship, and Anti-censorship on a Chinese Social Media [4.408128846525362]
We identify the metrics of censorship and self-censorship, find the influence factors, and construct a mediation model to measure their relationship.
Based on these findings, we discuss implications for democratic social media design and future censorship research.
arXiv Detail & Related papers (2022-11-24T18:28:16Z)
- On the Social and Technical Challenges of Web Search Autosuggestion Moderation [118.47867428272878]
Autosuggestions are typically generated by machine learning (ML) systems trained on a corpus of search logs and document representations.
While current search engines have become increasingly proficient at suppressing such problematic suggestions, persistent issues remain.
We discuss several dimensions of problematic suggestions, difficult issues along the pipeline, and why our discussion applies to the increasing number of applications beyond web search.
arXiv Detail & Related papers (2020-07-09T19:22:00Z)