Analysing Cyberbullying using Natural Language Processing by
Understanding Jargon in Social Media
- URL: http://arxiv.org/abs/2107.08902v1
- Date: Fri, 23 Apr 2021 04:20:19 GMT
- Title: Analysing Cyberbullying using Natural Language Processing by
Understanding Jargon in Social Media
- Authors: Bhumika Bhatia, Anuj Verma, Anjum, Rahul Katarya
- Abstract summary: In our work, we explore binary classification by using a combination of datasets from various social media platforms.
We experiment with multiple models such as Bi-LSTM, GloVe, and state-of-the-art models like BERT, and apply a unique preprocessing technique by introducing a slang-abusive corpus.
- Score: 4.932130498861987
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cyberbullying is extremely prevalent today. Online hate comments,
toxicity, and cyberbullying among children and other vulnerable groups have
only grown with the shift to online classes and increased access to social
platforms, especially post COVID-19. It is paramount to ensure minors' safety
across social platforms so that any violence or hate crime is automatically
detected and strict action is taken against it. In our work, we explore binary
classification using a combination of datasets from various social media
platforms that cover a wide range of cyberbullying, such as sexism, racism,
abusive language, and hate speech. We experiment with multiple models, such as
Bi-LSTM, GloVe, and state-of-the-art models like BERT, and apply a unique
preprocessing technique by introducing a slang-abusive corpus, achieving
higher precision than models without slang preprocessing.
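The slang-aware preprocessing the abstract describes can be sketched roughly as follows. This is a minimal illustration only: the lexicon entries and the function name are hypothetical assumptions, not the paper's actual slang-abusive corpus.

```python
import re

# Hypothetical slang-abusive lexicon mapping slang tokens to canonical forms.
# The paper's real corpus is not reproduced here.
SLANG_LEXICON = {
    "u": "you",
    "stfu": "shut up",
    "gtfo": "get out",
}

def normalize_slang(text, lexicon=SLANG_LEXICON):
    """Lowercase the text, split into word/punctuation tokens,
    and expand any token found in the slang lexicon."""
    tokens = re.findall(r"\w+|\S", text.lower())
    return " ".join(lexicon.get(tok, tok) for tok in tokens)
```

Normalized text like this would then be fed to the embedding layer (GloVe) or tokenizer (BERT), so that slang variants collapse onto vocabulary the pretrained models already cover.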
Related papers
- The Use of a Large Language Model for Cyberbullying Detection [0.0]
Cyberbullying (CB) is among the most prevalent phenomena in today's cyber world.
It is a severe threat to the mental and physical health of citizens.
This opens the need to develop a robust system to prevent bullying content from online forums, blogs, and social media platforms.
arXiv Detail & Related papers (2024-02-06T15:46:31Z)
- Explain Thyself Bully: Sentiment Aided Cyberbullying Detection with Explanation [52.3781496277104]
Cyberbullying has become a big issue with the popularity of different social media networks and online communication apps.
Recent laws like "right to explanations" of General Data Protection Regulation have spurred research in developing interpretable models.
We develop the first interpretable multi-task model, mExCB, for automatic cyberbullying detection in code-mixed languages.
arXiv Detail & Related papers (2024-01-17T07:36:22Z) - Deep Learning Based Cyberbullying Detection in Bangla Language [0.0]
This study demonstrates a deep learning strategy for identifying cyberbullying in Bengali.
A two-layer bidirectional long short-term memory (Bi-LSTM) model has been built to identify cyberbullying.
arXiv Detail & Related papers (2024-01-07T04:58:59Z) - Understanding writing style in social media with a supervised
contrastively pre-trained transformer [57.48690310135374]
Online Social Networks serve as fertile ground for harmful behavior, ranging from hate speech to the dissemination of disinformation.
We introduce the Style Transformer for Authorship Representations (STAR), trained on a large corpus derived from public sources of 4.5 × 10^6 authored texts.
Using a support base of 8 documents of 512 tokens, we can discern authors from sets of up to 1616 authors with at least 80% accuracy.
arXiv Detail & Related papers (2023-10-17T09:01:17Z) - Countering Malicious Content Moderation Evasion in Online Social
Networks: Simulation and Detection of Word Camouflage [64.78260098263489]
Twisting and camouflaging keywords are among the most used techniques to evade platform content moderation systems.
This article contributes to countering malicious information by developing multilingual tools to simulate and detect new methods of content-moderation evasion.
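As an illustration of the kind of evasion such tools target (not the article's own method), a minimal de-camouflage normalizer might undo common character substitutions before matching tokens against a blocklist:

```python
# Map common "leetspeak" substitutions back to letters so that
# camouflaged tokens (e.g. "h4t3") can be matched by keyword filters.
# The substitution table is illustrative, not exhaustive.
LEET_MAP = str.maketrans({
    "4": "a", "3": "e", "1": "i", "0": "o",
    "5": "s", "@": "a", "$": "s",
})

def decamouflage(token):
    """Lowercase a token and reverse common character substitutions."""
    return token.lower().translate(LEET_MAP)
```

Real evasion techniques are far more varied (homoglyphs, spacing, word splitting), which is why the article builds dedicated simulation and detection tools rather than relying on fixed tables like this one.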
arXiv Detail & Related papers (2022-12-27T16:08:49Z)
- Addressing the Challenges of Cross-Lingual Hate Speech Detection [115.1352779982269]
In this paper we focus on cross-lingual transfer learning to support hate speech detection in low-resource languages.
We leverage cross-lingual word embeddings to train our neural network systems on the source language and apply them to the target language.
We investigate the issue of label imbalance of hate speech datasets, since the high ratio of non-hate examples compared to hate examples often leads to low model performance.
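One standard remedy for such label imbalance (a common technique, not necessarily the one the paper adopts) is to weight each class inversely to its frequency when computing the training loss:

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Return per-class weights n / (k * count), where n is the number
    of examples and k the number of classes, so rare classes (e.g. hate
    examples) contribute more to the loss than frequent ones."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}
```

With a 9:1 ratio of non-hate to hate examples, the minority class receives a weight of 5.0 versus about 0.56 for the majority class, counteracting the model's bias toward predicting non-hate.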
arXiv Detail & Related papers (2022-01-15T20:48:14Z)
- Identification of Twitter Bots based on an Explainable ML Framework: the US 2020 Elections Case Study [72.61531092316092]
This paper focuses on the design of a novel system for identifying Twitter bots based on labeled Twitter data.
A supervised machine learning (ML) framework is adopted, using an Extreme Gradient Boosting (XGBoost) algorithm.
Our study also deploys Shapley Additive Explanations (SHAP) for explaining the ML model predictions.
arXiv Detail & Related papers (2021-12-08T14:12:24Z)
- Automated Detection of Cyberbullying Against Women and Immigrants and Cross-domain Adaptability [2.294014185517203]
This paper focuses on advancing the technology using state-of-the-art NLP techniques.
We use a Twitter dataset from SemEval 2019 - Task 5 (HatEval) on hate speech against women and immigrants.
Our best performing ensemble model, based on DistilBERT, achieved F1 scores of 0.73 and 0.74 in the task of classifying hate speech.
arXiv Detail & Related papers (2020-12-04T13:12:31Z)
- Sampling Attacks: Amplification of Membership Inference Attacks by Repeated Queries [74.59376038272661]
We introduce the sampling attack, a novel membership inference technique that, unlike other standard membership adversaries, works under the severe restriction of having no access to the victim model's scores.
We show that a victim model that only publishes the labels is still susceptible to sampling attacks and the adversary can recover up to 100% of its performance.
For defense, we choose differential privacy in the form of gradient perturbation during the training of the victim model as well as output perturbation at prediction time.
arXiv Detail & Related papers (2020-09-01T12:54:54Z)
- Aggressive, Repetitive, Intentional, Visible, and Imbalanced: Refining Representations for Cyberbullying Classification [4.945634077636197]
We study the nuanced problem of cyberbullying using five explicit factors to represent its social and linguistic aspects.
These results demonstrate the importance of representing and modeling cyberbullying as a social phenomenon.
arXiv Detail & Related papers (2020-04-04T00:35:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.