Large Language Models for Automatic Detection of Sensitive Topics
- URL: http://arxiv.org/abs/2409.00940v1
- Date: Mon, 2 Sep 2024 04:50:42 GMT
- Title: Large Language Models for Automatic Detection of Sensitive Topics
- Authors: Ruoyu Wen, Stephanie Elena Crowe, Kunal Gupta, Xinyue Li, Mark Billinghurst, Simon Hoermann, Dwain Allan, Alaeddin Nassani, Thammathip Piumsomboon,
- Abstract summary: Large language models (LLMs) are known for their capability to understand and process natural language.
This study explores the capabilities of five LLMs for detecting sensitive messages in the mental well-being domain.
The best-performing model, GPT-4o, achieved an average accuracy of 99.5% and an F1-score of 0.99.
- Score: 20.929598260734995
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sensitive information detection is crucial in content moderation to maintain safe online communities. Assisting in this traditionally manual process could relieve human moderators from overwhelming and tedious tasks, allowing them to focus solely on flagged content that may pose potential risks. Rapidly advancing large language models (LLMs) are known for their capability to understand and process natural language and so present a potential solution to support this process. This study explores the capabilities of five LLMs for detecting sensitive messages in the mental well-being domain within two online datasets and assesses their performance in terms of accuracy, precision, recall, F1 scores, and consistency. Our findings indicate that LLMs have the potential to be integrated into the moderation workflow as a convenient and precise detection tool. The best-performing model, GPT-4o, achieved an average accuracy of 99.5\% and an F1-score of 0.99. We discuss the advantages and potential challenges of using LLMs in the moderation workflow and suggest that future research should address the ethical considerations of utilising this technology.
Related papers
- A Soft Sensor Method with Uncertainty-Awareness and Self-Explanation Based on Large Language Models Enhanced by Domain Knowledge Retrieval [17.605817344542345]
We propose a framework called Few-shot Uncertainty-aware and self-Explaining Soft Sensor (LLM-FUESS)
LLM-FUESS includes the Zero-shot Auxiliary Variable Selector (LLM-ZAVS) and the Uncertainty-aware Few-shot Soft Sensor (LLM-UFSS)
Our method achieved state-of-the-art predictive performance, strong robustness, and flexibility, effectively mitigates training instability found in traditional methods.
arXiv Detail & Related papers (2025-01-06T11:43:29Z) - Evaluating the Performance of Large Language Models in Scientific Claim Detection and Classification [0.0]
This study evaluates the efficacy of Large Language Models (LLMs) as innovative solutions for mitigating misinformation on platforms like Twitter.
LLMs offer a pre-trained, adaptable approach that bypasses the extensive training and overfitting issues associated with traditional machine learning models.
We present a comparative analysis of LLMs' performance using a specialized dataset and propose a framework for their application in public health communication.
arXiv Detail & Related papers (2024-12-21T05:02:26Z) - Reasoner Outperforms: Generative Stance Detection with Rationalization for Social Media [12.479554210753664]
This study adopts a generative approach, where stance predictions include explicit, interpretable rationales.
We find that incorporating reasoning into stance detection enables the smaller model (FlanT5) to outperform GPT-3.5's zero-shot performance.
arXiv Detail & Related papers (2024-12-13T16:34:39Z) - Evaluating Cultural and Social Awareness of LLM Web Agents [113.49968423990616]
We introduce CASA, a benchmark designed to assess large language models' sensitivity to cultural and social norms.
Our approach evaluates LLM agents' ability to detect and appropriately respond to norm-violating user queries and observations.
Experiments show that current LLMs perform significantly better in non-agent environments.
arXiv Detail & Related papers (2024-10-30T17:35:44Z) - Evaluating the Usability of LLMs in Threat Intelligence Enrichment [0.30723404270319693]
Large Language Models (LLMs) have the potential to significantly enhance threat intelligence.
However, concerns about their reliability, accuracy, and potential for generating inaccurate information persist.
This study conducts a comprehensive usability evaluation of five LLMs ChatGPT, Gemini, Cohere, Copilot, and Meta AI.
arXiv Detail & Related papers (2024-09-23T14:44:56Z) - Exploring Automatic Cryptographic API Misuse Detection in the Era of LLMs [60.32717556756674]
This paper introduces a systematic evaluation framework to assess Large Language Models in detecting cryptographic misuses.
Our in-depth analysis of 11,940 LLM-generated reports highlights that the inherent instabilities in LLMs can lead to over half of the reports being false positives.
The optimized approach achieves a remarkable detection rate of nearly 90%, surpassing traditional methods and uncovering previously unknown misuses in established benchmarks.
arXiv Detail & Related papers (2024-07-23T15:31:26Z) - The Human Factor in Detecting Errors of Large Language Models: A Systematic Literature Review and Future Research Directions [0.0]
Launch of ChatGPT by OpenAI in November 2022 marked a pivotal moment for Artificial Intelligence.
Large Language Models (LLMs) demonstrate remarkable conversational capabilities across various domains.
These models are susceptible to errors - "hallucinations" and omissions, generating incorrect or incomplete information.
arXiv Detail & Related papers (2024-03-13T21:39:39Z) - GPT as Psychologist? Preliminary Evaluations for GPT-4V on Visual Affective Computing [74.68232970965595]
Multimodal large language models (MLLMs) are designed to process and integrate information from multiple sources, such as text, speech, images, and videos.
This paper assesses the application of MLLMs with 5 crucial abilities for affective computing, spanning from visual affective tasks and reasoning tasks.
arXiv Detail & Related papers (2024-03-09T13:56:25Z) - Characterizing Truthfulness in Large Language Model Generations with
Local Intrinsic Dimension [63.330262740414646]
We study how to characterize and predict the truthfulness of texts generated from large language models (LLMs)
We suggest investigating internal activations and quantifying LLM's truthfulness using the local intrinsic dimension (LID) of model activations.
arXiv Detail & Related papers (2024-02-28T04:56:21Z) - Federated Fine-Tuning of LLMs on the Very Edge: The Good, the Bad, the Ugly [62.473245910234304]
This paper takes a hardware-centric approach to explore how Large Language Models can be brought to modern edge computing systems.
We provide a micro-level hardware benchmark, compare the model FLOP utilization to a state-of-the-art data center GPU, and study the network utilization in realistic conditions.
arXiv Detail & Related papers (2023-10-04T20:27:20Z) - Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs [60.61002524947733]
Previous confidence elicitation methods rely on white-box access to internal model information or model fine-tuning.
This leads to a growing need to explore the untapped area of black-box approaches for uncertainty estimation.
We define a systematic framework with three components: prompting strategies for eliciting verbalized confidence, sampling methods for generating multiple responses, and aggregation techniques for computing consistency.
arXiv Detail & Related papers (2023-06-22T17:31:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.