Fine-Tuning Llama 2 Large Language Models for Detecting Online Sexual
Predatory Chats and Abusive Texts
- URL: http://arxiv.org/abs/2308.14683v1
- Date: Mon, 28 Aug 2023 16:18:50 GMT
- Title: Fine-Tuning Llama 2 Large Language Models for Detecting Online Sexual
Predatory Chats and Abusive Texts
- Authors: Thanh Thi Nguyen, Campbell Wilson, Janis Dalins
- Abstract summary: This paper proposes an approach to detection of online sexual predatory chats and abusive language using the open-source pretrained Llama 2 7B-parameter model.
We fine-tune the LLM using datasets with different sizes, imbalance degrees, and languages (i.e., English, Roman Urdu and Urdu).
Experimental results show a strong performance of the proposed approach, which performs proficiently and consistently across three distinct datasets.
- Score: 2.406214748890827
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Detecting online sexual predatory behaviours and abusive language on social
media platforms has become a critical area of research due to the growing
concerns about online safety, especially for vulnerable populations such as
children and adolescents. Researchers have been exploring various techniques
and approaches to develop effective detection systems that can identify and
mitigate these risks. Recent development of large language models (LLMs) has
opened a new opportunity to address this problem more effectively. This paper
proposes an approach to detection of online sexual predatory chats and abusive
language using the open-source pretrained Llama 2 7B-parameter model, recently
released by Meta GenAI. We fine-tune the LLM using datasets with different
sizes, imbalance degrees, and languages (i.e., English, Roman Urdu and Urdu).
Based on the power of LLMs, our approach is generic and automated without a
manual search for a synergy between feature extraction and classifier design
steps like conventional methods in this domain. Experimental results show a
strong performance of the proposed approach, which performs proficiently and
consistently across three distinct datasets with five sets of experiments. This
study's outcomes indicate that the proposed method can be implemented in
real-world applications (even with non-English languages) for flagging sexual
predators, offensive or toxic content, hate speech, and discriminatory language
in online discussions and comments to maintain respectful internet or digital
communities. Furthermore, it can be employed for solving text classification
problems with other potential applications such as sentiment analysis, spam and
phishing detection, sorting legal documents, fake news detection, language
identification, user intent recognition, text-based product categorization,
medical record analysis, and resume screening.
Related papers
- Enhanced Online Grooming Detection Employing Context Determination and Message-Level Analysis [2.424910201171407]
Online grooming (OG) is a prevalent threat facing predominately children online, with groomers using deceptive methods to prey on the vulnerability of children on social media/messaging platforms.
Existing solutions focus on the signature analysis of child abuse media, which does not effectively address real-time OG detection.
This paper proposes that OG attacks are complex, requiring the identification of specific communication patterns between adults and children.
arXiv Detail & Related papers (2024-09-12T11:37:34Z)
- Towards Possibilities & Impossibilities of AI-generated Text Detection: A Survey [97.33926242130732]
Large Language Models (LLMs) have revolutionized the domain of natural language processing (NLP) with remarkable capabilities of generating human-like text responses.
Despite these advancements, several works in the existing literature have raised serious concerns about the potential misuse of LLMs.
To address these concerns, a consensus among the research community is to develop algorithmic solutions to detect AI-generated text.
arXiv Detail & Related papers (2023-10-23T18:11:32Z)
- On the application of Large Language Models for language teaching and assessment technology [18.735612275207853]
We look at the potential for incorporating large language models in AI-driven language teaching and assessment systems.
We find that larger language models offer improvements over previous models in text generation.
For automated grading and grammatical error correction, tasks whose progress is checked on well-known benchmarks, early investigations indicate that large language models on their own do not improve on state-of-the-art results.
arXiv Detail & Related papers (2023-07-17T11:12:56Z)
- MGTBench: Benchmarking Machine-Generated Text Detection [54.81446366272403]
This paper proposes the first benchmark framework for MGT detection against powerful large language models (LLMs).
We show that a larger number of words generally leads to better performance and that most detection methods can achieve similar performance with far fewer training samples.
Our findings indicate that the model-based detection methods still perform well in the text attribution task.
arXiv Detail & Related papers (2023-03-26T21:12:36Z)
- Countering Malicious Content Moderation Evasion in Online Social Networks: Simulation and Detection of Word Camouflage [64.78260098263489]
Twisting and camouflaging keywords are among the most used techniques to evade platform content moderation systems.
This article contributes significantly to countering malicious information by developing multilingual tools to simulate and detect new methods of evasion of content.
arXiv Detail & Related papers (2022-12-27T16:08:49Z)
- Language Generation Models Can Cause Harm: So What Can We Do About It? An Actionable Survey [50.58063811745676]
This work provides a survey of practical methods for addressing potential threats and societal harms from language generation models.
We draw on several prior works on language model risks to present a structured overview of strategies for detecting and ameliorating different kinds of risks/harms of language generators.
arXiv Detail & Related papers (2022-10-14T10:43:39Z)
- A New Generation of Perspective API: Efficient Multilingual Character-level Transformers [66.9176610388952]
We present the fundamentals behind the next version of the Perspective API from Google Jigsaw.
At the heart of the approach is a single multilingual token-free Charformer model.
We demonstrate that by forgoing static vocabularies, we gain flexibility across a variety of settings.
arXiv Detail & Related papers (2022-02-22T20:55:31Z)
- COLD: A Benchmark for Chinese Offensive Language Detection [54.60909500459201]
We use COLDataset, a Chinese offensive language dataset with 37k annotated sentences.
We also propose COLDetector to study output offensiveness of popular Chinese language models.
Our resources and analyses are intended to help detoxify the Chinese online communities and evaluate the safety performance of generative language models.
arXiv Detail & Related papers (2022-01-16T11:47:23Z)
- Joint Modelling of Emotion and Abusive Language Detection [26.18171134454037]
We present the first joint model of emotion and abusive language detection, experimenting in a multi-task learning framework.
Our results demonstrate that incorporating affective features leads to significant improvements in abuse detection performance across datasets.
arXiv Detail & Related papers (2020-05-28T14:08:40Z)