Chinese Cyberbullying Detection: Dataset, Method, and Validation
- URL: http://arxiv.org/abs/2505.20654v1
- Date: Tue, 27 May 2025 03:03:55 GMT
- Title: Chinese Cyberbullying Detection: Dataset, Method, and Validation
- Authors: Yi Zhu, Xin Zou, Xindong Wu
- Abstract summary: We propose a novel annotation method to construct a cyberbullying dataset organized by incidents. The constructed CHNCI is the first Chinese cyberbullying incident detection dataset, consisting of 220,676 comments across 91 incidents. Experimental results demonstrate that the constructed dataset can serve as a benchmark for the tasks of cyberbullying detection and incident prediction.
- Score: 19.261209838897
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing cyberbullying detection benchmarks are organized by the polarity of speech, such as "offensive" and "non-offensive", which makes them essentially hate speech detection. In the real world, however, cyberbullying often attracts widespread social attention through incidents. To address this problem, we propose a novel annotation method to construct a cyberbullying dataset organized by incidents. The constructed CHNCI is the first Chinese cyberbullying incident detection dataset, consisting of 220,676 comments across 91 incidents. Specifically, we first combine three cyberbullying detection methods based on explanation generation into an ensemble that generates pseudo labels, and then have human annotators judge these labels. We then propose evaluation criteria for validating whether a case constitutes a cyberbullying incident. Experimental results demonstrate that the constructed dataset can serve as a benchmark for the tasks of cyberbullying detection and incident prediction. To the best of our knowledge, this is the first study of the Chinese cyberbullying incident detection task.
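As a rough illustration of the annotation pipeline described in the abstract, the sketch below pseudo-labels a comment by majority vote over three detectors and routes any disagreement to human annotators. The detector functions and label names are placeholders, not the authors' implementation.

```python
# Hypothetical sketch of ensemble pseudo-labelling with human review,
# assuming detectors that map a comment to a label string.
from collections import Counter
from typing import Callable, List

Detector = Callable[[str], str]  # comment -> "bullying" / "not_bullying"

def pseudo_label(comment: str, detectors: List[Detector]):
    """Return (majority_label, needs_human_review)."""
    votes = Counter(d(comment) for d in detectors)
    label, count = votes.most_common(1)[0]
    # Unanimous votes are accepted as pseudo labels; any disagreement
    # is queued for a human annotator, mirroring the paper's pipeline.
    return label, count < len(detectors)

# Toy usage with three stand-in detectors.
detectors = [
    lambda c: "bullying" if "idiot" in c else "not_bullying",
    lambda c: "bullying" if len(c) > 80 else "not_bullying",
    lambda c: "not_bullying",
]
label, needs_review = pseudo_label("example comment", detectors)
```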
Related papers
- Detecting harassment and defamation in cyberbullying with emotion-adaptive training [10.769252194833625]
Cyberbullying encompasses various forms, such as denigration and harassment, which celebrities frequently face. We first develop a celebrity cyberbullying dataset that covers two distinct types of incidents: harassment and defamation. We then propose an emotion-adaptive training framework (EAT) that transfers knowledge from the domain of emotion detection to the domain of cyberbullying detection.
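A minimal PyTorch sketch of the transfer idea summarized above: pretrain a shared text encoder on emotion labels, then reuse it with a fresh head for cyberbullying detection. The architecture, head sizes, and freezing strategy are assumptions, not the authors' EAT implementation.

```python
# Sketch of emotion-to-cyberbullying transfer, assuming a simple
# bag-of-words encoder; EAT itself is more elaborate.
import torch.nn as nn

class SharedEncoder(nn.Module):
    def __init__(self, vocab_size=10000, dim=128):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, dim)  # averages token embeddings
    def forward(self, token_ids):
        return self.embed(token_ids)

encoder = SharedEncoder()
emotion_head = nn.Linear(128, 6)   # pretraining task: e.g. six basic emotions
bullying_head = nn.Linear(128, 2)  # target task: harassment vs. defamation

# Phase 1: train encoder + emotion_head on emotion-labelled data.
# Phase 2: reuse the encoder with bullying_head; freezing it is one
# common choice for preserving the transferred knowledge.
for p in encoder.parameters():
    p.requires_grad = False
```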
arXiv Detail & Related papers (2025-01-28T13:15:07Z)
- Explain Thyself Bully: Sentiment Aided Cyberbullying Detection with Explanation [52.3781496277104]
Cyberbullying has become a serious issue with the growing popularity of social media networks and online communication apps.
Recent laws, such as the "right to explanation" in the General Data Protection Regulation, have spurred research into developing interpretable models.
We develop the first interpretable multi-task model, mExCB, for automatic cyberbullying detection in code-mixed languages.
arXiv Detail & Related papers (2024-01-17T07:36:22Z)
- Verifying the Robustness of Automatic Credibility Assessment [50.55687778699995]
We show that meaning-preserving changes in input text can mislead the models.
We also introduce BODEGA: a benchmark for testing both victim models and attack methods on misinformation detection tasks.
Our experimental results show that modern large language models are often more vulnerable to attacks than previous, smaller solutions.
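A toy illustration of the kind of meaning-preserving perturbation BODEGA evaluates: swapping letters for visually identical Unicode homoglyphs leaves the text readable to humans but can change a model's tokenization and output. The `victim_model` callable is hypothetical, not part of BODEGA's API, and the benchmark's actual attacks are more sophisticated.

```python
# Homoglyph substitution as a minimal meaning-preserving attack.
HOMOGLYPHS = {"a": "\u0430", "e": "\u0435", "o": "\u043e"}  # Cyrillic lookalikes

def perturb(text: str) -> str:
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)

original = "the vaccine is safe and effective"
attacked = perturb(original)

# A successful attack flips the prediction while meaning is preserved:
# assert victim_model(original) != victim_model(attacked)
```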
arXiv Detail & Related papers (2023-03-14T16:11:47Z)
- Fact-Saboteurs: A Taxonomy of Evidence Manipulation Attacks against Fact-Verification Systems [80.3811072650087]
We show that it is possible to subtly modify claim-salient snippets in the evidence and generate diverse and claim-aligned evidence.
The attacks are also robust against post-hoc modifications of the claim.
These attacks can have harmful implications for inspectable and human-in-the-loop usage scenarios.
arXiv Detail & Related papers (2022-09-07T13:39:24Z)
- DISARM: Detecting the Victims Targeted by Harmful Memes [49.12165815990115]
DISARM is a framework that uses named entity recognition and person identification to detect harmful memes.
We show that DISARM significantly outperforms ten unimodal and multimodal systems.
It lowers the relative error rate for harmful target identification by up to 9 absolute points compared with several strong multimodal rivals.
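Since the summary names named entity recognition as one ingredient, here is a minimal sketch of that step using spaCy: extracting person and organization mentions from meme text as candidate harm targets. The filtering heuristic is an assumption; DISARM's actual pipeline is multimodal and more involved.

```python
# NER-based candidate target extraction (illustrative only).
import spacy

nlp = spacy.load("en_core_web_sm")  # requires: python -m spacy download en_core_web_sm

def candidate_targets(meme_text: str):
    """Extract person/organization mentions as candidate harm targets."""
    doc = nlp(meme_text)
    return [ent.text for ent in doc.ents if ent.label_ in {"PERSON", "ORG"}]

print(candidate_targets("John Smith ruined everything, typical of Acme Corp."))
```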
arXiv Detail & Related papers (2022-05-11T19:14:26Z)
- Zero-Query Transfer Attacks on Context-Aware Object Detectors [95.18656036716972]
Adversarial attacks perturb images such that a deep neural network produces incorrect classification results.
A promising approach to defend against adversarial attacks on natural multi-object scenes is to impose a context-consistency check.
We present the first approach for generating context-consistent adversarial attacks that can evade the context-consistency check.
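A simplified version of the context-consistency defence that the paper's attack evades: flag a scene as suspicious when the detected object labels rarely co-occur. The co-occurrence table and acceptance rule here are invented for illustration; real defences learn such priors from data.

```python
# Toy co-occurrence-based context-consistency check.
from itertools import combinations

# Pairs of labels that plausibly co-occur in natural scenes (toy prior).
CONSISTENT_PAIRS = {frozenset(p) for p in [
    ("car", "stop sign"), ("car", "person"), ("boat", "person"),
]}

def context_consistent(labels: list[str]) -> bool:
    """Accept a scene only if every pair of detected labels is plausible."""
    return all(frozenset(pair) in CONSISTENT_PAIRS or pair[0] == pair[1]
               for pair in combinations(labels, 2))

print(context_consistent(["car", "stop sign"]))   # True: plausible street scene
print(context_consistent(["boat", "stop sign"]))  # False: flagged as inconsistent
```

A context-consistent attack, as described above, must perturb the image so that all induced misdetections remain mutually plausible under such a check.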
arXiv Detail & Related papers (2022-03-29T04:33:06Z)
- In the Service of Online Order: Tackling Cyber-Bullying with Machine Learning and Affect Analysis [13.092135222168324]
PTA (Parent-Teacher Association) members have started Online Patrol to spot malicious content on Web forums and blogs.
In practice, Online Patrol requires reading through entire Web contents, a task that is difficult to perform manually.
We aim to develop a set of tools that can automatically detect malicious entries and report them to PTA members.
arXiv Detail & Related papers (2022-03-04T03:13:45Z)
- Comparative Performance of Machine Learning Algorithms in Cyberbullying Detection: Using Turkish Language Preprocessing Techniques [0.0]
The aim of this study is to compare the performance of different machine learning algorithms in detecting Turkish messages containing cyberbullying.
The Light Gradient Boosting Machine (LGBM) algorithm showed the best performance, with 90.788% accuracy and a 90.949% F1 score.
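A hedged sketch of the kind of pipeline such a comparison typically involves: TF-IDF features feeding a LightGBM classifier. The data, Turkish-specific preprocessing, and hyperparameters behind the reported 90.788% accuracy are not reproduced here; the examples below are toys.

```python
# TF-IDF + LightGBM text classification sketch (illustrative only).
from lightgbm import LGBMClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

texts = ["seni seviyorum", "aptal herif"]  # toy Turkish examples
labels = [0, 1]                            # 0 = benign, 1 = cyberbullying

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),   # word unigrams and bigrams
    LGBMClassifier(n_estimators=200),
)
model.fit(texts, labels)
print(model.predict(["aptal"]))
```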
arXiv Detail & Related papers (2021-01-29T18:28:44Z)
- Enhancing the Identification of Cyberbullying through Participant Roles [1.399948157377307]
This paper proposes a novel approach to enhancing cyberbullying detection through role modeling.
We utilise a dataset from ASKfm to perform multi-class classification to detect participant roles.
arXiv Detail & Related papers (2020-10-13T19:13:07Z)
- Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching [56.280018325419896]
Data Poisoning attacks modify training data to maliciously control a model trained on such data.
We analyze a particularly malicious poisoning attack that is both "from scratch" and "clean label".
We show that it is the first poisoning method to cause targeted misclassification in modern deep networks trained from scratch on a full-sized, poisoned ImageNet dataset.
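A condensed PyTorch sketch of the gradient-matching objective that gives the paper its name: perturb poison images so that training on them pushes the model's parameters in the same direction as misclassifying the target. The function below is a simplification under assumed inputs, not the paper's released code.

```python
# Gradient-matching loss: 1 - cosine similarity between the adversarial
# target gradient and the training gradient on the poison batch.
import torch
import torch.nn.functional as F

def gradient_matching_loss(model, poison_x, poison_y, target_x, adv_y):
    params = [p for p in model.parameters() if p.requires_grad]
    # Direction that would make the model misclassify the target as adv_y.
    g_adv = torch.autograd.grad(F.cross_entropy(model(target_x), adv_y), params)
    # Direction induced by training on the perturbed poison batch; keep the
    # graph so the loss stays differentiable w.r.t. the perturbation.
    g_poison = torch.autograd.grad(
        F.cross_entropy(model(poison_x), poison_y), params, create_graph=True)
    flat_adv = torch.cat([g.detach().flatten() for g in g_adv])
    flat_poison = torch.cat([g.flatten() for g in g_poison])
    return 1 - F.cosine_similarity(flat_adv, flat_poison, dim=0)

# Minimizing this loss over a small (e.g. L-infinity bounded) perturbation
# of poison_x yields "clean label" poisons, per the paper's approach.
```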
arXiv Detail & Related papers (2020-09-04T16:17:54Z)