Related papers: Large Language Models for Automatic Detection of Sensitive Topics

Large Language Models for Automatic Detection of Sensitive Topics

URL: http://arxiv.org/abs/2409.00940v1
Date: Mon, 2 Sep 2024 04:50:42 GMT
Title: Large Language Models for Automatic Detection of Sensitive Topics
Authors: Ruoyu Wen, Stephanie Elena Crowe, Kunal Gupta, Xinyue Li, Mark Billinghurst, Simon Hoermann, Dwain Allan, Alaeddin Nassani, Thammathip Piumsomboon,
Abstract summary: Large language models (LLMs) are known for their capability to understand and process natural language. This study explores the capabilities of five LLMs for detecting sensitive messages in the mental well-being domain. The best-performing model, GPT-4o, achieved an average accuracy of 99.5% and an F1-score of 0.99.
Score: 20.929598260734995
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Sensitive information detection is crucial in content moderation to maintain safe online communities. Assisting in this traditionally manual process could relieve human moderators from overwhelming and tedious tasks, allowing them to focus solely on flagged content that may pose potential risks. Rapidly advancing large language models (LLMs) are known for their capability to understand and process natural language and so present a potential solution to support this process. This study explores the capabilities of five LLMs for detecting sensitive messages in the mental well-being domain within two online datasets and assesses their performance in terms of accuracy, precision, recall, F1 scores, and consistency. Our findings indicate that LLMs have the potential to be integrated into the moderation workflow as a convenient and precise detection tool. The best-performing model, GPT-4o, achieved an average accuracy of 99.5\% and an F1-score of 0.99. We discuss the advantages and potential challenges of using LLMs in the moderation workflow and suggest that future research should address the ethical considerations of utilising this technology.

Related papers

Teaching Language Models To Gather Information Proactively [53.85419549904644]
Large language models (LLMs) are increasingly expected to function as collaborative partners.<n>In this work, we introduce a new task paradigm: proactive information gathering.<n>We design a scalable framework that generates partially specified, real-world tasks, masking key information.<n>Within this setup, our core innovation is a reinforcement finetuning strategy that rewards questions that elicit genuinely new, implicit user information.
arXiv Detail & Related papers (2025-07-28T23:50:09Z)
Advancing Harmful Content Detection in Organizational Research: Integrating Large Language Models with Elo Rating System [0.0]
Large language models (LLMs) offer promising opportunities for organizational research.<n>Their built-in moderation systems can create problems when researchers try to analyze harmful content.<n>This paper introduces an Elo rating-based method that significantly improves LLM performance for harmful content analysis.
arXiv Detail & Related papers (2025-06-19T20:01:12Z)
LLM Post-Training: A Deep Dive into Reasoning Large Language Models [131.10969986056]
Large Language Models (LLMs) have transformed the natural language processing landscape and brought to life diverse applications. Post-training methods enable LLMs to refine their knowledge, improve reasoning, enhance factual accuracy, and align more effectively with user intents and ethical considerations.
arXiv Detail & Related papers (2025-02-28T18:59:54Z)
Self-Evolving Critique Abilities in Large Language Models [59.861013614500024]
This paper explores enhancing critique abilities of Large Language Models (LLMs)<n>We introduce SCRIT, a framework that trains LLMs with self-generated data to evolve their critique abilities.<n>Our analysis reveals that SCRIT's performance scales positively with data and model size.
arXiv Detail & Related papers (2025-01-10T05:51:52Z)
A Soft Sensor Method with Uncertainty-Awareness and Self-Explanation Based on Large Language Models Enhanced by Domain Knowledge Retrieval [17.605817344542345]
We propose a framework called Few-shot Uncertainty-aware and self-Explaining Soft Sensor (LLM-FUESS) LLM-FUESS includes the Zero-shot Auxiliary Variable Selector (LLM-ZAVS) and the Uncertainty-aware Few-shot Soft Sensor (LLM-UFSS) Our method achieved state-of-the-art predictive performance, strong robustness, and flexibility, effectively mitigates training instability found in traditional methods.
arXiv Detail & Related papers (2025-01-06T11:43:29Z)
Evaluating the Performance of Large Language Models in Scientific Claim Detection and Classification [0.0]
This study evaluates the efficacy of Large Language Models (LLMs) as innovative solutions for mitigating misinformation on platforms like Twitter. LLMs offer a pre-trained, adaptable approach that bypasses the extensive training and overfitting issues associated with traditional machine learning models. We present a comparative analysis of LLMs' performance using a specialized dataset and propose a framework for their application in public health communication.
arXiv Detail & Related papers (2024-12-21T05:02:26Z)
Reasoner Outperforms: Generative Stance Detection with Rationalization for Social Media [12.479554210753664]
This study adopts a generative approach, where stance predictions include explicit, interpretable rationales. We find that incorporating reasoning into stance detection enables the smaller model (FlanT5) to outperform GPT-3.5's zero-shot performance.
arXiv Detail & Related papers (2024-12-13T16:34:39Z)
Evaluating Cultural and Social Awareness of LLM Web Agents [113.49968423990616]
We introduce CASA, a benchmark designed to assess large language models' sensitivity to cultural and social norms. Our approach evaluates LLM agents' ability to detect and appropriately respond to norm-violating user queries and observations. Experiments show that current LLMs perform significantly better in non-agent environments.
arXiv Detail & Related papers (2024-10-30T17:35:44Z)
Evaluating the Usability of LLMs in Threat Intelligence Enrichment [0.30723404270319693]
Large Language Models (LLMs) have the potential to significantly enhance threat intelligence. However, concerns about their reliability, accuracy, and potential for generating inaccurate information persist. This study conducts a comprehensive usability evaluation of five LLMs ChatGPT, Gemini, Cohere, Copilot, and Meta AI.
arXiv Detail & Related papers (2024-09-23T14:44:56Z)
Zero-Shot Learning and Key Points Are All You Need for Automated Fact-Checking [10.788661063801703]
This work introduces a framework based on Zero-Shot Learning and Key Points (ZSL-KeP) for automated fact-checking. It performs well on the AVeriTeC shared task dataset by robustly improving the baseline and achieving 10th place.
arXiv Detail & Related papers (2024-08-15T19:57:42Z)
Towards Explainable Network Intrusion Detection using Large Language Models [3.8436076642278745]
Large Language Models (LLMs) have revolutionised natural language processing tasks, particularly as chat agents. This paper examines the feasibility of employing LLMs as a Network Intrusion Detection System (NIDS) Preliminary exploration shows that LLMs are unfit for the detection of Malicious NetFlows. Most promisingly, these exhibit significant potential as complementary agents in NIDS, particularly in providing explanations and aiding in threat response when integrated with Retrieval Augmented Generation (RAG) and function calling capabilities.
arXiv Detail & Related papers (2024-08-08T09:59:30Z)
Exploring Automatic Cryptographic API Misuse Detection in the Era of LLMs [60.32717556756674]
This paper introduces a systematic evaluation framework to assess Large Language Models in detecting cryptographic misuses. Our in-depth analysis of 11,940 LLM-generated reports highlights that the inherent instabilities in LLMs can lead to over half of the reports being false positives. The optimized approach achieves a remarkable detection rate of nearly 90%, surpassing traditional methods and uncovering previously unknown misuses in established benchmarks.
arXiv Detail & Related papers (2024-07-23T15:31:26Z)
The Human Factor in Detecting Errors of Large Language Models: A Systematic Literature Review and Future Research Directions [0.0]
Launch of ChatGPT by OpenAI in November 2022 marked a pivotal moment for Artificial Intelligence. Large Language Models (LLMs) demonstrate remarkable conversational capabilities across various domains. These models are susceptible to errors - "hallucinations" and omissions, generating incorrect or incomplete information.
arXiv Detail & Related papers (2024-03-13T21:39:39Z)
GPT as Psychologist? Preliminary Evaluations for GPT-4V on Visual Affective Computing [74.68232970965595]
Multimodal large language models (MLLMs) are designed to process and integrate information from multiple sources, such as text, speech, images, and videos. This paper assesses the application of MLLMs with 5 crucial abilities for affective computing, spanning from visual affective tasks and reasoning tasks.
arXiv Detail & Related papers (2024-03-09T13:56:25Z)
Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension [63.330262740414646]
We study how to characterize and predict the truthfulness of texts generated from large language models (LLMs) We suggest investigating internal activations and quantifying LLM's truthfulness using the local intrinsic dimension (LID) of model activations.
arXiv Detail & Related papers (2024-02-28T04:56:21Z)
Understanding the Effectiveness of Large Language Models in Detecting Security Vulnerabilities [12.82645410161464]
We evaluate the effectiveness of 16 pre-trained Large Language Models on 5,000 code samples from five diverse security datasets. Overall, LLMs show modest effectiveness in detecting vulnerabilities, obtaining an average accuracy of 62.8% and F1 score of 0.71 across datasets. We find that advanced prompting strategies that involve step-by-step analysis significantly improve performance of LLMs on real-world datasets in terms of F1 score (by upto 0.18 on average)
arXiv Detail & Related papers (2023-11-16T13:17:20Z)
Federated Fine-Tuning of LLMs on the Very Edge: The Good, the Bad, the Ugly [62.473245910234304]
This paper takes a hardware-centric approach to explore how Large Language Models can be brought to modern edge computing systems. We provide a micro-level hardware benchmark, compare the model FLOP utilization to a state-of-the-art data center GPU, and study the network utilization in realistic conditions.
arXiv Detail & Related papers (2023-10-04T20:27:20Z)
Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs [60.61002524947733]
Previous confidence elicitation methods rely on white-box access to internal model information or model fine-tuning. This leads to a growing need to explore the untapped area of black-box approaches for uncertainty estimation. We define a systematic framework with three components: prompting strategies for eliciting verbalized confidence, sampling methods for generating multiple responses, and aggregation techniques for computing consistency.
arXiv Detail & Related papers (2023-06-22T17:31:44Z)

This list is automatically generated from the titles and abstracts of the papers in this site.