Large Language Models for Automatic Detection of Sensitive Topics
- URL: http://arxiv.org/abs/2409.00940v1
- Date: Mon, 2 Sep 2024 04:50:42 GMT
- Title: Large Language Models for Automatic Detection of Sensitive Topics
- Authors: Ruoyu Wen, Stephanie Elena Crowe, Kunal Gupta, Xinyue Li, Mark Billinghurst, Simon Hoermann, Dwain Allan, Alaeddin Nassani, Thammathip Piumsomboon
- Abstract summary: Large language models (LLMs) are known for their capability to understand and process natural language.
This study explores the capabilities of five LLMs for detecting sensitive messages in the mental well-being domain.
The best-performing model, GPT-4o, achieved an average accuracy of 99.5% and an F1-score of 0.99.
- Score: 20.929598260734995
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sensitive information detection is crucial in content moderation to maintain safe online communities. Assisting in this traditionally manual process could relieve human moderators from overwhelming and tedious tasks, allowing them to focus solely on flagged content that may pose potential risks. Rapidly advancing large language models (LLMs) are known for their capability to understand and process natural language and so present a potential solution to support this process. This study explores the capabilities of five LLMs for detecting sensitive messages in the mental well-being domain within two online datasets and assesses their performance in terms of accuracy, precision, recall, F1 scores, and consistency. Our findings indicate that LLMs have the potential to be integrated into the moderation workflow as a convenient and precise detection tool. The best-performing model, GPT-4o, achieved an average accuracy of 99.5% and an F1-score of 0.99. We discuss the advantages and potential challenges of using LLMs in the moderation workflow and suggest that future research should address the ethical considerations of utilising this technology.
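The workflow the paper evaluates reduces, in essence, to prompting an LLM to label each message and then scoring the predictions against human annotations. Below is a minimal sketch of that loop; the prompt wording, the gpt-4o model choice via the OpenAI client, and the `classify`/`scores` helpers are illustrative assumptions, not the authors' exact protocol.

```python
# Hypothetical sketch: prompt an LLM to label each message as sensitive or not,
# then score the predictions against human labels.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "You are a content-moderation assistant. Label the following message as "
    "'sensitive' if it relates to self-harm, abuse, or acute mental-health risk, "
    "otherwise as 'not sensitive'. Reply with the label only.\n\nMessage: {message}"
)

def classify(message: str, model: str = "gpt-4o") -> int:
    """Return 1 for sensitive, 0 for not sensitive."""
    resp = client.chat.completions.create(
        model=model,
        temperature=0,  # deterministic decoding helps consistency across runs
        messages=[{"role": "user", "content": PROMPT.format(message=message)}],
    )
    label = resp.choices[0].message.content.strip().lower()
    return 0 if label.startswith("not") else 1

def scores(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return acc, prec, rec, f1
```

Repeating the labelling pass several times and comparing agreement between runs would give the consistency measure the abstract mentions.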
Related papers
- Evaluating Cultural and Social Awareness of LLM Web Agents [113.49968423990616]
We introduce CASA, a benchmark designed to assess large language models' sensitivity to cultural and social norms.
Our approach evaluates LLM agents' ability to detect and appropriately respond to norm-violating user queries and observations.
Experiments show that current LLMs perform significantly better in non-agent environments.
arXiv Detail & Related papers (2024-10-30T17:35:44Z)
- Evaluating the Usability of LLMs in Threat Intelligence Enrichment [0.30723404270319693]
Large Language Models (LLMs) have the potential to significantly enhance threat intelligence.
However, concerns about their reliability, accuracy, and potential for generating inaccurate information persist.
This study conducts a comprehensive usability evaluation of five LLMs: ChatGPT, Gemini, Cohere, Copilot, and Meta AI.
arXiv Detail & Related papers (2024-09-23T14:44:56Z)
- Zero-Shot Learning and Key Points Are All You Need for Automated Fact-Checking [10.788661063801703]
This work introduces a framework based on Zero-Shot Learning and Key Points (ZSL-KeP) for automated fact-checking.
It performs well on the AVeriTeC shared task dataset by robustly improving the baseline and achieving 10th place.
arXiv Detail & Related papers (2024-08-15T19:57:42Z)
- Towards Explainable Network Intrusion Detection using Large Language Models [3.8436076642278745]
Large Language Models (LLMs) have revolutionised natural language processing tasks, particularly as chat agents.
This paper examines the feasibility of employing LLMs as a Network Intrusion Detection System (NIDS).
Preliminary exploration shows that LLMs are unfit for the detection of malicious NetFlows.
Most promisingly, they exhibit significant potential as complementary agents in NIDS, particularly for providing explanations and aiding in threat response when integrated with Retrieval Augmented Generation (RAG) and function calling capabilities.
arXiv Detail & Related papers (2024-08-08T09:59:30Z)
- Exploring Automatic Cryptographic API Misuse Detection in the Era of LLMs [60.32717556756674]
This paper introduces a systematic evaluation framework to assess Large Language Models in detecting cryptographic misuses.
Our in-depth analysis of 11,940 LLM-generated reports highlights that the inherent instabilities in LLMs can lead to over half of the reports being false positives.
The optimized approach achieves a remarkable detection rate of nearly 90%, surpassing traditional methods and uncovering previously unknown misuses in established benchmarks.
arXiv Detail & Related papers (2024-07-23T15:31:26Z)
- The Human Factor in Detecting Errors of Large Language Models: A Systematic Literature Review and Future Research Directions [0.0]
The launch of ChatGPT by OpenAI in November 2022 marked a pivotal moment for Artificial Intelligence.
Large Language Models (LLMs) demonstrate remarkable conversational capabilities across various domains.
These models are susceptible to errors such as "hallucinations" and omissions, generating incorrect or incomplete information.
arXiv Detail & Related papers (2024-03-13T21:39:39Z)
- GPT as Psychologist? Preliminary Evaluations for GPT-4V on Visual Affective Computing [74.68232970965595]
Multimodal large language models (MLLMs) are designed to process and integrate information from multiple sources, such as text, speech, images, and videos.
This paper assesses the application of MLLMs with 5 crucial abilities for affective computing, spanning visual affective tasks and reasoning tasks.
arXiv Detail & Related papers (2024-03-09T13:56:25Z)
- Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension [63.330262740414646]
We study how to characterize and predict the truthfulness of texts generated by large language models (LLMs).
We suggest investigating internal activations and quantifying LLMs' truthfulness using the local intrinsic dimension (LID) of model activations; a minimal sketch of an LID estimator is given after this entry.
arXiv Detail & Related papers (2024-02-28T04:56:21Z)
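For readers unfamiliar with the LID quantity mentioned above, a common way to estimate it is the Levina-Bickel maximum-likelihood estimator over nearest-neighbour distances; the paper's exact estimator may differ. The sketch below applies it to a generic activation matrix; extracting activations from a specific LLM is model-dependent and omitted, and the `activations` array, `k` value, and toy example are assumptions rather than the paper's procedure.

```python
# Hypothetical sketch: Levina-Bickel maximum-likelihood estimate of local
# intrinsic dimension (LID) over an (n_samples, hidden_dim) activation matrix.
import numpy as np

def local_intrinsic_dimension(activations: np.ndarray, k: int = 20) -> np.ndarray:
    """Return one LID estimate per row of `activations`."""
    # Pairwise Euclidean distances via the ||a||^2 + ||b||^2 - 2ab identity.
    sq = np.sum(activations ** 2, axis=1)
    gram = activations @ activations.T
    dists = np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2.0 * gram, 0.0))
    np.fill_diagonal(dists, np.inf)        # exclude each point from its own neighbours
    knn = np.sort(dists, axis=1)[:, :k]    # distances to the k nearest neighbours
    # MLE: lid = [ mean_j log(T_k / T_j) ]^{-1} over the k-1 inner neighbours
    return 1.0 / np.mean(np.log(knn[:, -1:] / knn[:, :-1]), axis=1)

# Toy check: points on a 2-D subspace embedded in 64 dimensions should give LID close to 2.
rng = np.random.default_rng(0)
plane = rng.normal(size=(500, 2)) @ rng.normal(size=(2, 64))
print(local_intrinsic_dimension(plane).mean())
```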
- Understanding the Effectiveness of Large Language Models in Detecting Security Vulnerabilities [12.82645410161464]
We evaluate the effectiveness of 16 pre-trained Large Language Models on 5,000 code samples from five diverse security datasets.
Overall, LLMs show modest effectiveness in detecting vulnerabilities, obtaining an average accuracy of 62.8% and F1 score of 0.71 across datasets.
We find that advanced prompting strategies that involve step-by-step analysis significantly improve the performance of LLMs on real-world datasets in terms of F1 score (by up to 0.18 on average).
arXiv Detail & Related papers (2023-11-16T13:17:20Z)
- Federated Fine-Tuning of LLMs on the Very Edge: The Good, the Bad, the Ugly [62.473245910234304]
This paper takes a hardware-centric approach to explore how Large Language Models can be brought to modern edge computing systems.
We provide a micro-level hardware benchmark, compare the model FLOP utilization to a state-of-the-art data center GPU, and study the network utilization in realistic conditions.
arXiv Detail & Related papers (2023-10-04T20:27:20Z)
- Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs [60.61002524947733]
Previous confidence elicitation methods rely on white-box access to internal model information or model fine-tuning.
This leads to a growing need to explore the untapped area of black-box approaches for uncertainty estimation.
We define a systematic framework with three components: prompting strategies for eliciting verbalized confidence, sampling methods for generating multiple responses, and aggregation techniques for computing consistency; a minimal sketch of this pipeline is given after this entry.
arXiv Detail & Related papers (2023-06-22T17:31:44Z)
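As a rough illustration of the framework summarised above, the sketch below combines a verbalized-confidence prompt, repeated sampling, and a simple majority-vote consistency score. The prompt format, the OpenAI client usage, and the aggregation rule are illustrative assumptions, not the paper's prescribed method.

```python
# Hypothetical sketch of the three black-box components: (1) a prompt eliciting
# a verbalized confidence, (2) sampling several responses, (3) aggregating
# agreement across samples into a consistency score.
from collections import Counter
from openai import OpenAI

client = OpenAI()

PROMPT = (
    "Answer the question, then state how confident you are from 0 to 100.\n"
    "Use exactly this format:\nAnswer: <answer>\nConfidence: <number>\n\n"
    "Question: {question}"
)

def ask_once(question: str, model: str = "gpt-4o"):
    resp = client.chat.completions.create(
        model=model,
        temperature=1.0,  # non-zero temperature so repeated samples can differ
        messages=[{"role": "user", "content": PROMPT.format(question=question)}],
    )
    answer, conf = None, None
    for line in resp.choices[0].message.content.splitlines():
        if line.lower().startswith("answer:"):
            answer = line.split(":", 1)[1].strip().lower()
        elif line.lower().startswith("confidence:"):
            conf = float(line.split(":", 1)[1].strip().rstrip("%"))
    return answer, conf

def elicit(question: str, n_samples: int = 5) -> dict:
    """Sample the model several times and aggregate answers and confidences."""
    samples = [ask_once(question) for _ in range(n_samples)]
    answers = [a for a, _ in samples if a is not None]
    top_answer, votes = Counter(answers).most_common(1)[0]  # assumes >= 1 parsable answer
    verbal = [c for a, c in samples if a == top_answer and c is not None]
    return {
        "answer": top_answer,
        "consistency": votes / len(answers),  # agreement rate across samples
        "verbalized_confidence": sum(verbal) / len(verbal) if verbal else None,
    }
```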
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.