Automated Detection of Cyberbullying Against Women and Immigrants and
Cross-domain Adaptability
- URL: http://arxiv.org/abs/2012.02565v1
- Date: Fri, 4 Dec 2020 13:12:31 GMT
- Title: Automated Detection of Cyberbullying Against Women and Immigrants and
Cross-domain Adaptability
- Authors: Thushari Atapattu, Mahen Herath, Georgia Zhang, Katrina Falkner
- Abstract summary: This paper focuses on advancing automated cyberbullying detection using state-of-the-art NLP techniques.
We use a Twitter dataset from SemEval 2019 - Task 5 (HatEval) on hate speech against women and immigrants.
Our best-performing ensemble model, based on DistilBERT, achieves F1 scores of 0.73 and 0.74 in classifying hate speech (Task A) and aggressiveness and target (Task B), respectively.
- Score: 2.294014185517203
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cyberbullying is a prevalent and growing social problem due to the surge of
social media technology usage. Minorities, women, and adolescents are among the
common victims of cyberbullying. Despite the advancement of NLP technologies,
automated cyberbullying detection remains challenging. This paper focuses
on advancing the technology using state-of-the-art NLP techniques. We use a
Twitter dataset from SemEval 2019 - Task 5 (HatEval) on hate speech against
women and immigrants. Our best-performing ensemble model, based on DistilBERT,
achieves F1 scores of 0.73 and 0.74 in classifying hate speech (Task A) and
aggressiveness and target (Task B), respectively. We adapt the ensemble model
developed for Task A to classify offensive language in external datasets and
achieve an F1 score of ~0.7 on three benchmark datasets, showing promising
cross-domain adaptability. We conduct a qualitative
analysis of misclassified tweets to provide insightful recommendations for
future cyberbullying research.
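
The abstract reports F1 scores for an ensemble of classifiers but does not say how the member predictions are combined or how the F1 score is averaged. As a rough illustration only, the sketch below assumes a simple majority vote over three hypothetical binary classifiers and a macro-averaged F1; the names `majority_vote`, `macro_f1`, and the toy labels are invented for this example and are not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one list by majority vote.

    Assumption: an odd number of models with binary labels, so no ties occur.
    """
    combined = []
    for votes in zip(*predictions_per_model):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def f1_for_label(y_true, y_pred, label):
    """F1 score treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-label F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


# Toy data (hypothetical): 1 = hateful, 0 = not hateful.
y_true = [1, 0, 1, 1, 0, 0]
model_a = [1, 0, 0, 1, 0, 1]
model_b = [1, 1, 1, 1, 0, 0]
model_c = [0, 0, 1, 0, 0, 1]

ensemble = majority_vote([model_a, model_b, model_c])
score = macro_f1(y_true, ensemble)
print(ensemble, round(score, 3))
```

On this toy data the ensemble makes one false-positive error, which lowers the F1 of both classes; macro averaging weights the two classes equally regardless of how many tweets each contains.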
Related papers
- Identifying Cyberbullying Roles in Social Media [3.5568310805420427]
It is critical to accurately detect the roles of individuals involved in cyberbullying incidents to effectively address the issue on a large scale.
This study explores the use of machine learning models to detect the roles involved in cyberbullying interactions.
arXiv Detail & Related papers (2024-12-21T00:46:48Z)
- A Federated Approach to Few-Shot Hate Speech Detection for Marginalized Communities [43.37824420609252]
Hate speech online remains an understudied issue for marginalized communities.
In this paper, we aim to provide marginalized communities living in societies where the dominant language is low-resource with a privacy-preserving tool to protect themselves from hate speech on the internet.
arXiv Detail & Related papers (2024-12-06T11:00:05Z)
- Sentiment Analysis of Cyberbullying Data in Social Media [0.0]
Our work focuses on leveraging deep learning and natural language understanding techniques to detect traces of bullying in social media posts.
One approach utilizes BERT embeddings, while the other replaces the embeddings layer with the recently released embeddings API from OpenAI.
We conducted a performance comparison between these two approaches to evaluate their effectiveness in sentiment analysis of Formspring Cyberbullying data.
arXiv Detail & Related papers (2024-11-08T20:41:04Z)
- ThangDLU at #SMM4H 2024: Encoder-decoder models for classifying text data on social disorders in children and adolescents [49.00494558898933]
This paper describes our participation in Task 3 and Task 5 of the #SMM4H (Social Media Mining for Health) 2024 Workshop.
Task 3 is a multi-class classification task centered on tweets discussing the impact of outdoor environments on symptoms of social anxiety.
Task 5 involves a binary classification task focusing on tweets reporting medical disorders in children.
We applied transfer learning from pre-trained encoder-decoder models such as BART-base and T5-small to identify the labels of a set of given tweets.
arXiv Detail & Related papers (2024-04-30T17:06:20Z)
- Explain Thyself Bully: Sentiment Aided Cyberbullying Detection with Explanation [52.3781496277104]
Cyberbullying has become a big issue with the popularity of different social media networks and online communication apps.
Recent laws, such as the "right to explanation" in the General Data Protection Regulation, have spurred research into interpretable models.
We develop the first interpretable multi-task model, called mExCB, for automatic cyberbullying detection from code-mixed languages.
arXiv Detail & Related papers (2024-01-17T07:36:22Z)
- Understanding writing style in social media with a supervised contrastively pre-trained transformer [57.48690310135374]
Online Social Networks serve as fertile ground for harmful behavior, ranging from hate speech to the dissemination of disinformation.
We introduce the Style Transformer for Authorship Representations (STAR), trained on a large corpus derived from public sources of 4.5 x 10^6 authored texts.
Using a support base of 8 documents of 512 tokens, we can discern authors from sets of up to 1616 authors with at least 80% accuracy.
arXiv Detail & Related papers (2023-10-17T09:01:17Z)
- AlexU-AIC at Arabic Hate Speech 2022: Contrast to Classify [2.9220076568786326]
We present our submission to the Arabic Hate Speech 2022 Shared Task Workshop (OSACT5 2022) using the associated Arabic Twitter dataset.
For offensive Tweets, sub-task B focuses on detecting whether the tweet is hate speech or not.
For hate speech Tweets, sub-task C focuses on detecting the fine-grained type of hate speech among six different classes.
arXiv Detail & Related papers (2022-07-18T12:33:51Z)
- Analysing Cyberbullying using Natural Language Processing by Understanding Jargon in Social Media [4.932130498861987]
In our work, we explore binary classification by using a combination of datasets from various social media platforms.
We experiment with multiple models such as Bi-LSTM, GloVe, and state-of-the-art models like BERT, and apply a unique preprocessing technique by introducing a slang-abusive corpus.
arXiv Detail & Related papers (2021-04-23T04:20:19Z)
- Enhancing the Identification of Cyberbullying through Participant Roles [1.399948157377307]
This paper proposes a novel approach to enhancing cyberbullying detection through role modeling.
We utilise a dataset from ASKfm to perform multi-class classification to detect participant roles.
arXiv Detail & Related papers (2020-10-13T19:13:07Z)
- Cross-ethnicity Face Anti-spoofing Recognition Challenge: A Review [79.49390241265337]
The Chalearn Face Anti-spoofing Attack Detection Challenge consists of single-modal (e.g., RGB) and multi-modal (e.g., RGB, Depth, Infrared (IR)) tracks.
This paper presents an overview of the challenge, including its design, evaluation protocol and a summary of results.
arXiv Detail & Related papers (2020-04-23T06:43:08Z)
- Characterizing Speech Adversarial Examples Using Self-Attention U-Net Enhancement [102.48582597586233]
We present a U-Net based attention model, U-Net$_{At}$, to enhance adversarial speech signals.
We conduct experiments on the automatic speech recognition (ASR) task with adversarial audio attacks.
arXiv Detail & Related papers (2020-03-31T02:16:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.