Assessing Text Classification Methods for Cyberbullying Detection on Social Media Platforms
- URL: http://arxiv.org/abs/2412.19928v1
- Date: Fri, 27 Dec 2024 21:22:28 GMT
- Title: Assessing Text Classification Methods for Cyberbullying Detection on Social Media Platforms
- Authors: Adamu Gaston Philipo, Doreen Sebastian Sarwatt, Jianguo Ding, Mahmoud Daneshmand, Huansheng Ning
- Abstract summary: This research aims to adapt and evaluate existing text classification techniques within the cyberbullying detection domain.
It focuses on leveraging and assessing large language models, including BERT, RoBERTa, XLNet, DistilBERT, and GPT-2.0.
The results show that BERT strikes a balance between performance, time efficiency, and computational resources.
- Score: 3.235558067839701
- Abstract: Cyberbullying significantly contributes to mental health issues in communities by negatively impacting the psychology of victims. It is a prevalent problem on social media platforms, necessitating effective, real-time detection and monitoring systems to identify harmful messages. However, current cyberbullying detection systems face challenges related to performance, dataset quality, time efficiency, and computational costs. This research aims to conduct a comparative study by adapting and evaluating existing text classification techniques within the cyberbullying detection domain. The study specifically evaluates the effectiveness and performance of these techniques in identifying cyberbullying instances on social media platforms. It focuses on leveraging and assessing large language models, including BERT, RoBERTa, XLNet, DistilBERT, and GPT-2.0, for their suitability in this domain. The results show that BERT strikes a balance between performance, time efficiency, and computational resources: Accuracy of 95%, Precision of 95%, Recall of 95%, F1 Score of 95%, Error Rate of 5%, Inference Time of 0.053 seconds, RAM Usage of 35.28 MB, CPU/GPU Usage of 0.4%, and Energy Consumption of 0.000263 kWh. The findings demonstrate that generative AI models, while powerful, do not consistently outperform fine-tuned models on the tested benchmarks. However, state-of-the-art performance can still be achieved through strategic adaptation and fine-tuning of existing models for specific datasets and tasks.
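The reported Accuracy, Precision, Recall, F1 Score, and Error Rate follow standard definitions, and the 5% Error Rate is consistent with one minus the 95% Accuracy. The sketch below shows how such metrics can be computed from predicted and true labels. It assumes a binary task with label 1 as the positive (cyberbullying) class; the abstract does not state the label scheme or any averaging strategy, and the names `BinaryMetrics` and `binary_metrics` are hypothetical, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float
    error_rate: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute standard binary classification metrics from two label lists.

    Assumption: a binary task where `positive` marks the class of interest.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    # Error rate is the complement of accuracy.
    return BinaryMetrics(accuracy, precision, recall, f1, 1.0 - accuracy)


if __name__ == "__main__":
    # Toy labels for illustration only; these are not data from the paper.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
    print(binary_metrics(y_true, y_pred))
```

On a multi-class or imbalanced dataset, per-class metrics would need to be averaged (for example macro or weighted averaging), which this sketch does not cover.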
Related papers
- Identifying Cyberbullying Roles in Social Media [3.5568310805420427]
It is critical to accurately detect the roles of individuals involved in cyberbullying incidents to effectively address the issue on a large scale.
This study explores the use of machine learning models to detect the roles involved in cyberbullying interactions.
arXiv Detail & Related papers (2024-12-21T00:46:48Z) - Scalable and Effective Negative Sample Generation for Hyperedge Prediction [55.9298019975967]
Hyperedge prediction is crucial for understanding complex multi-entity interactions in web-based applications.
Traditional methods often face difficulties in generating high-quality negative samples due to imbalance between positive and negative instances.
We present the scalable and effective negative sample generation for Hyperedge Prediction (SEHP) framework, which utilizes diffusion models to tackle these challenges.
arXiv Detail & Related papers (2024-11-19T09:16:25Z) - The Surprising Effectiveness of Test-Time Training for Abstract Reasoning [64.36534512742736]
We investigate the effectiveness of test-time training (TTT) as a mechanism for improving models' reasoning capabilities.
TTT significantly improves performance on ARC tasks, achieving up to 6x improvement in accuracy compared to base fine-tuned models.
Our findings suggest that explicit symbolic search is not the only path to improved abstract reasoning in neural language models.
arXiv Detail & Related papers (2024-11-11T18:59:45Z) - Optimizing Transformer based on high-performance optimizer for predicting employment sentiment in American social media content [9.49688045612671]
This article improves the Transformer model using a swarm intelligence optimization algorithm, aiming to predict the emotions of employment-related text content on American social media.
During the training process, the accuracy of the model gradually increased from 49.27% to 82.83%, while the loss value decreased from 0.67 to 0.35.
The improved model proposed in this article not only improves the accuracy of sentiment recognition in employment-related texts on social media, but also has important practical significance.
arXiv Detail & Related papers (2024-10-09T03:14:05Z) - How to Train Your Fact Verifier: Knowledge Transfer with Multimodal Open Models [95.44559524735308]
Large language or multimodal model based verification has been proposed to scale up online policing mechanisms for mitigating the spread of false and harmful content.
We test the limits of improving foundation model performance without continual updating through an initial study of knowledge transfer.
Our results on two recent multi-modal fact-checking benchmarks, Mocheg and Fakeddit, indicate that knowledge transfer strategies can improve Fakeddit performance over the state-of-the-art by up to 1.7% and Mocheg performance by up to 2.9%.
arXiv Detail & Related papers (2024-06-29T08:39:07Z) - Deep Learning Approaches for Detecting Adversarial Cyberbullying and Hate Speech in Social Networks [0.0]
This paper focuses on detecting cyberbullying in adversarial attack content within social networking site text data, specifically emphasizing hate speech.
An LSTM model with a fixed epoch of 100 demonstrated remarkable performance, achieving high accuracy, precision, recall, F1-score, and AUC-ROC scores of 87.57%, 88.73%, 87.57%, 88.15%, and 91% respectively.
arXiv Detail & Related papers (2024-05-30T21:44:15Z) - One-Shot Learning for Periocular Recognition: Exploring the Effect of Domain Adaptation and Data Bias on Deep Representations [59.17685450892182]
We investigate the behavior of deep representations in widely used CNN models under extreme data scarcity for One-Shot periocular recognition.
We improved state-of-the-art results that made use of networks trained with biometric datasets with millions of images.
Traditional algorithms like SIFT can outperform CNNs in situations with limited data.
arXiv Detail & Related papers (2023-07-11T09:10:16Z) - Robust Trajectory Prediction against Adversarial Attacks [84.10405251683713]
Trajectory prediction using deep neural networks (DNNs) is an essential component of autonomous driving systems.
These methods are vulnerable to adversarial attacks, leading to serious consequences such as collisions.
In this work, we identify two key ingredients to defend trajectory prediction models against adversarial attacks.
arXiv Detail & Related papers (2022-07-29T22:35:05Z) - DAPPER: Label-Free Performance Estimation after Personalization for Heterogeneous Mobile Sensing [95.18236298557721]
We present DAPPER (Domain AdaPtation Performance EstimatoR) that estimates the adaptation performance in a target domain with unlabeled target data.
Our evaluation with four real-world sensing datasets compared against six baselines shows that DAPPER outperforms the state-of-the-art baseline by 39.8% in estimation accuracy.
arXiv Detail & Related papers (2021-11-22T08:49:33Z) - Bayesian Active Learning for Wearable Stress and Affect Detection [0.7106986689736827]
Stress detection using on-device deep learning algorithms has been on the rise owing to advancements in pervasive computing.
In this paper, we propose a framework with capabilities to represent model uncertainties through approximations in Bayesian Neural Networks.
Our proposed framework achieves a considerable efficiency boost during inference, with a substantially low number of acquired pool points.
arXiv Detail & Related papers (2020-12-04T16:19:37Z) - Multi-Stage Optimized Machine Learning Framework for Network Intrusion Detection [8.26773636337474]
This paper proposes a novel multi-stage optimized ML-based NIDS framework.
It reduces computational complexity while maintaining its detection performance.
The proposed framework significantly reduces the required training sample size (up to 74%) and feature set size (up to 50%).
arXiv Detail & Related papers (2020-08-09T03:18:00Z)