Towards a comprehensive taxonomy of online abusive language informed by machine leaning
- URL: http://arxiv.org/abs/2504.17653v1
- Date: Thu, 24 Apr 2025 15:23:47 GMT
- Title: Towards a comprehensive taxonomy of online abusive language informed by machine leaning
- Authors: Samaneh Hosseini Moghaddam, Kelly Lyons, Cheryl Regehr, Vivek Goel, Kaitlyn Regehr
- Abstract summary: This paper presents a taxonomy for distinguishing key characteristics of abusive language within online text. It classifies various facets of online abuse, including context, target, intensity, directness, and theme of abuse.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The proliferation of abusive language in online communications has posed significant risks to the health and wellbeing of individuals and communities. The growing concern regarding online abuse and its consequences necessitates methods for identifying and mitigating harmful content and facilitating continuous monitoring, moderation, and early intervention. This paper presents a taxonomy for distinguishing key characteristics of abusive language within online text. Our approach uses a systematic method for taxonomy development, integrating classification systems of 18 existing multi-label datasets to capture key characteristics relevant to online abusive language classification. The resulting taxonomy is hierarchical and faceted, comprising 5 categories and 17 dimensions. It classifies various facets of online abuse, including context, target, intensity, directness, and theme of abuse. This shared understanding can lead to more cohesive efforts, facilitate knowledge exchange, and accelerate progress in the field of online abuse detection and mitigation among researchers, policy makers, online platform owners, and other stakeholders.
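As a rough illustration of what a faceted label could look like in practice, the sketch below records one value per facet for a single message and compares two annotations facet by facet. Only the five facet names (context, target, intensity, directness, theme) come from the abstract; the class name, the helper function, and every example value are hypothetical and are not taken from the paper's 17 dimensions.

```python
from dataclasses import dataclass, asdict

# Facet names as listed in the abstract. Everything else in this sketch
# (class name, helper, example values) is a hypothetical illustration.
FACETS = ("context", "target", "intensity", "directness", "theme")


@dataclass(frozen=True)
class FacetedLabel:
    """One annotation of a message, with one value per facet."""
    context: str
    target: str
    intensity: str
    directness: str
    theme: str


def facet_agreement(a: FacetedLabel, b: FacetedLabel) -> dict:
    """Return, for each facet, whether two annotations assign the same value."""
    da, db = asdict(a), asdict(b)
    return {facet: da[facet] == db[facet] for facet in FACETS}


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the shape of the data.
    first = FacetedLabel("reply thread", "individual", "high", "explicit", "gender")
    second = FacetedLabel("reply thread", "group", "high", "implicit", "gender")
    print(facet_agreement(first, second))
```

Keeping each facet as a separate field, rather than collapsing everything into one label, is what lets datasets with different labeling schemes be compared on the facets they share.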
Related papers
- A survey of textual cyber abuse detection using cutting-edge language models and large language models [0.0]
We present a comprehensive analysis of the different forms of abuse prevalent in social media. We focus on how emerging technologies, such as Language Models (LMs) and Large Language Models (LLMs), are reshaping both the detection and generation of abusive content.
arXiv Detail & Related papers (2025-01-09T18:55:50Z)
- Demarked: A Strategy for Enhanced Abusive Speech Moderation through Counterspeech, Detoxification, and Message Management [71.99446449877038]
We propose a more comprehensive approach called Demarcation, which scores abusive speech based on four aspects: (i) severity scale; (ii) presence of a target; (iii) context scale; (iv) legal scale.
Our work aims to inform future strategies for effectively addressing abusive speech online.
arXiv Detail & Related papers (2024-06-27T21:45:33Z)
- CrisisSense-LLM: Instruction Fine-Tuned Large Language Model for Multi-label Social Media Text Classification in Disaster Informatics [49.2719253711215]
This study introduces a novel approach to disaster text classification by enhancing a pre-trained Large Language Model (LLM). Our methodology involves creating a comprehensive instruction dataset from disaster-related tweets, which is then used to fine-tune an open-source LLM. This fine-tuned model can classify multiple aspects of disaster-related information simultaneously, such as the type of event, informativeness, and involvement of human aid.
arXiv Detail & Related papers (2024-06-16T23:01:10Z)
- The Unappreciated Role of Intent in Algorithmic Moderation of Social Media Content [2.2618341648062477]
This paper examines the role of intent in content moderation systems.
We review state-of-the-art detection models and benchmark training datasets for online abuse to assess their awareness of intent and their ability to capture it.
arXiv Detail & Related papers (2024-05-17T18:05:13Z)
- Countering Malicious Content Moderation Evasion in Online Social Networks: Simulation and Detection of Word Camouflage [64.78260098263489]
Twisting and camouflaging keywords are among the most used techniques to evade platform content moderation systems.
This article contributes significantly to countering malicious information by developing multilingual tools to simulate and detect new methods of evasion of content.
arXiv Detail & Related papers (2022-12-27T16:08:49Z)
- Learning to Adapt Domain Shifts of Moral Values via Instance Weighting [74.94940334628632]
Classifying moral values in user-generated text from social media is critical to understanding community cultures.
Moral values and language usage can change across social movements.
We propose a neural adaptation framework via instance weighting to improve cross-domain classification tasks.
arXiv Detail & Related papers (2022-04-15T18:15:41Z)
- A New Generation of Perspective API: Efficient Multilingual Character-level Transformers [66.9176610388952]
We present the fundamentals behind the next version of the Perspective API from Google Jigsaw.
At the heart of the approach is a single multilingual token-free Charformer model.
We demonstrate that by forgoing static vocabularies, we gain flexibility across a variety of settings.
arXiv Detail & Related papers (2022-02-22T20:55:31Z)
- Fragments of the Past: Curating Peer Support with Perpetrators of Domestic Violence [88.37416552778178]
We report on a ten-month study where we worked with six support workers and eighteen perpetrators in the design and deployment of Fragments of the Past.
We share how crafting digitally-augmented artefacts - 'fragments' - of experiences of desisting from violence can translate messages for motivation and rapport between peers.
These insights provide the basis for practical considerations for future network design with challenging populations.
arXiv Detail & Related papers (2021-07-09T22:57:43Z)
- The User behind the Abuse: A Position on Ethics and Explainability [25.791014642037585]
We discuss the role that modeling of users and online communities plays in abuse detection.
We then explore the ethical challenges of incorporating user and community information.
We propose properties that an explainable method should aim to exhibit.
arXiv Detail & Related papers (2021-03-31T16:20:37Z)
- Towards Ethics by Design in Online Abusive Content Detection [7.163723138100273]
The research effort has spread out across several closely related sub-areas, such as detection of hate speech, toxicity, cyberbullying, etc.
We bring ethical issues to the forefront and propose a unified framework as a two-step process.
The novel framework is guided by the Ethics by Design principle and is a step towards building more accurate and trusted models.
arXiv Detail & Related papers (2020-10-28T13:10:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.