Bridging Expertise Gaps: The Role of LLMs in Human-AI Collaboration for Cybersecurity
- URL: http://arxiv.org/abs/2505.03179v1
- Date: Tue, 06 May 2025 04:47:52 GMT
- Title: Bridging Expertise Gaps: The Role of LLMs in Human-AI Collaboration for Cybersecurity
- Authors: Shahroz Tariq, Ronal Singh, Mohan Baruwal Chhetri, Surya Nepal, Cecile Paris
- Abstract summary: This study investigates whether large language models (LLMs) can function as intelligent collaborators to bridge expertise gaps in cybersecurity decision-making. We find that human-AI collaboration improves task performance, reducing false positives in phishing detection and false negatives in intrusion detection.
- Score: 17.780795900414716
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This study investigates whether large language models (LLMs) can function as intelligent collaborators to bridge expertise gaps in cybersecurity decision-making. We examine two representative tasks (phishing email detection and intrusion detection) that differ in data modality, cognitive complexity, and user familiarity. Through a controlled mixed-methods user study (n = 58: phishing n = 34, intrusion n = 24), we find that human-AI collaboration improves task performance, reducing false positives in phishing detection and false negatives in intrusion detection. A learning effect is also observed when participants transition from collaboration to independent work, suggesting that LLMs can support long-term skill development. Our qualitative analysis shows that interaction dynamics (such as LLM definitiveness, explanation style, and tone) influence user trust, prompting strategies, and decision revision. Users engaged in more analytic questioning and showed greater reliance on LLM feedback in high-complexity settings. These results provide design guidance for building interpretable, adaptive, and trustworthy human-AI teaming systems, and demonstrate that LLMs can meaningfully support non-experts in reasoning through complex cybersecurity problems.
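To make the collaboration pattern concrete, here is a minimal sketch of an LLM-in-the-loop phishing triage flow of the kind the study examines. The prompt wording, the `Email` type, and the injected `ask_llm` helper are illustrative assumptions, not the study's actual interface or prompts.

```python
# Hypothetical sketch of the human-AI collaboration loop: the analyst forms
# an initial judgment, consults an LLM collaborator, and may revise the
# decision when the two disagree. All names here are illustrative.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Email:
    sender: str
    subject: str
    body: str

def triage_with_llm(email: Email, human_verdict: str,
                    ask_llm: Callable[[str], str]) -> str:
    """Return the analyst's final verdict ('phishing' or 'legitimate')."""
    prompt = (
        "You are assisting a non-expert analyst with phishing triage.\n"
        f"Sender: {email.sender}\nSubject: {email.subject}\nBody: {email.body}\n"
        "List phishing indicators (or their absence), explain your reasoning "
        "step by step, and end with VERDICT: phishing or VERDICT: legitimate."
    )
    explanation = ask_llm(prompt)
    llm_verdict = "phishing" if "VERDICT: phishing" in explanation else "legitimate"
    if llm_verdict != human_verdict:
        # Disagreement: surface the rationale so the analyst can revise
        # (or keep) the initial judgment; the human stays in control.
        print(f"LLM disagrees ({llm_verdict}). Rationale:\n{explanation}")
        return input("Final verdict [phishing/legitimate]: ").strip()
    return human_verdict
```

One design choice worth noting: the LLM's verdict is never applied automatically; it only triggers a review, matching the paper's framing of the LLM as a collaborator rather than an oracle.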
Related papers
- Teaching Language Models To Gather Information Proactively [53.85419549904644]
Large language models (LLMs) are increasingly expected to function as collaborative partners. In this work, we introduce a new task paradigm: proactive information gathering. We design a scalable framework that generates partially specified, real-world tasks, masking key information. Within this setup, our core innovation is a reinforcement finetuning strategy that rewards questions that elicit genuinely new, implicit user information.
arXiv Detail & Related papers (2025-07-28T23:50:09Z)
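The summary above does not give the paper's actual reward, but the idea of rewarding questions that elicit genuinely new information can be sketched with a simple novelty score. Everything below (the fact-set representation, the `novelty_reward` name) is a hypothetical illustration.

```python
# Hypothetical reward sketch for proactive information gathering: score a
# clarifying question by how much genuinely new information the user's
# answer contributes, relative to what the agent already knows.
def novelty_reward(answer_facts: set[str], known_facts: set[str]) -> float:
    """Fraction of facts in the answer that the agent did not already know."""
    if not answer_facts:
        return 0.0  # the question elicited nothing; no reward
    new_facts = answer_facts - known_facts
    return len(new_facts) / len(answer_facts)

# Example: the agent already knows the deadline, so re-asking earns less.
known = {"budget=10k", "deadline=friday"}
elicited = {"deadline=friday", "audience=executives"}
print(novelty_reward(elicited, known))  # 0.5: one of the two facts is new
```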
- Multimodal Behavioral Patterns Analysis with Eye-Tracking and LLM-Based Reasoning [12.054910727620154]
Eye-tracking data reveals valuable insights into users' cognitive states but is difficult to analyze due to its structured, non-linguistic nature. This paper presents a multimodal human-AI collaborative framework designed to enhance cognitive pattern extraction from eye-tracking signals.
arXiv Detail & Related papers (2025-07-24T09:49:53Z)
- Evaluating LLM-corrupted Crowdsourcing Data Without Ground Truth [21.672923905771576]
The use of large language models (LLMs) by crowdsourcing workers poses a challenge to datasets intended to reflect human input. We propose a training-free scoring mechanism with theoretical guarantees under a crowdsourcing model that accounts for LLM collusion.
arXiv Detail & Related papers (2025-06-08T04:38:39Z)
- A Survey of Frontiers in LLM Reasoning: Inference Scaling, Learning to Reason, and Agentic Systems [93.8285345915925]
Reasoning is a fundamental cognitive process that enables logical inference, problem-solving, and decision-making. With the rapid advancement of large language models (LLMs), reasoning has emerged as a key capability that distinguishes advanced AI systems. We categorize existing methods along two dimensions: (1) regimes, which define the stage at which reasoning is achieved; and (2) architectures, which determine the components involved in the reasoning process.
arXiv Detail & Related papers (2025-04-12T01:27:49Z)
- Understanding Learner-LLM Chatbot Interactions and the Impact of Prompting Guidelines [9.834055425277874]
This study investigates learner-AI interactions through an educational experiment in which participants receive structured guidance on effective prompting. To assess user behavior and prompting efficacy, we analyze a dataset of 642 interactions from 107 users. Our findings provide a deeper understanding of how users engage with Large Language Models and the role of structured prompting guidance in enhancing AI-assisted communication.
arXiv Detail & Related papers (2025-04-10T15:20:43Z)
- LLMs as Deceptive Agents: How Role-Based Prompting Induces Semantic Ambiguity in Puzzle Tasks [0.0]
This study is inspired by the popular puzzle game "Connections". We compare puzzles produced through zero-shot prompting, role-injected adversarial prompts, and human-crafted examples. We demonstrate that explicit adversarial agent behaviors significantly heighten semantic ambiguity.
arXiv Detail & Related papers (2025-04-03T03:45:58Z)
- Interactive Agents to Overcome Ambiguity in Software Engineering [61.40183840499932]
AI agents are increasingly being deployed to automate tasks, often based on ambiguous and underspecified user instructions. Making unwarranted assumptions and failing to ask clarifying questions can lead to suboptimal outcomes. We study the ability of LLM agents to handle ambiguous instructions in interactive code generation settings by evaluating the performance of proprietary and open-weight models.
arXiv Detail & Related papers (2025-02-18T17:12:26Z)
- Fine-Grained Appropriate Reliance: Human-AI Collaboration with a Multi-Step Transparent Decision Workflow for Complex Task Decomposition [14.413413322901409]
We propose to investigate the impact of a novel Multi-Step Transparent (MST) decision workflow on user reliance behaviors. Our findings demonstrate that human-AI collaboration with an MST decision workflow can outperform one-step collaboration in specific contexts. Our work highlights that there is no one-size-fits-all decision workflow for achieving optimal human-AI collaboration.
arXiv Detail & Related papers (2025-01-19T01:03:09Z)
- LLM-Consensus: Multi-Agent Debate for Visual Misinformation Detection [26.84072878231029]
LLM-Consensus is a novel multi-agent debate system for misinformation detection. Our framework enables explainable detection with state-of-the-art accuracy.
arXiv Detail & Related papers (2024-10-26T10:34:22Z)
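The summary gives only the high-level idea; a generic multi-agent debate loop can be sketched as below. The round structure, agent count, and `ask_llm` helper are illustrative assumptions, not LLM-Consensus's actual design.

```python
# Minimal multi-agent debate sketch: several LLM "debaters" exchange
# assessments of a claim over a few rounds, then a judge consolidates an
# explainable verdict. ask_llm is an injected LLM call; all names are
# illustrative.
from typing import Callable

def debate(claim: str, ask_llm: Callable[[str], str],
           n_agents: int = 3, n_rounds: int = 2) -> str:
    transcript: list[str] = []
    for rnd in range(n_rounds):
        for agent in range(n_agents):
            prompt = (
                f"You are debater {agent + 1}. Claim: {claim}\n"
                "Debate so far:\n" + "\n".join(transcript) + "\n"
                "Argue whether the claim is misinformation, citing evidence."
            )
            transcript.append(
                f"[round {rnd + 1}, agent {agent + 1}] " + ask_llm(prompt))
    # A judge reads the full debate and issues the final, explainable verdict.
    return ask_llm(
        "Summarize the debate and give a final verdict "
        "(misinformation / credible) with justification:\n"
        + "\n".join(transcript))
```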
- Critic-CoT: Boosting the reasoning abilities of large language model via Chain-of-thoughts Critic [48.94340387130627]
Critic-CoT is a framework that pushes LLMs toward System-2-like critic capability.
It relies on a step-wise CoT reasoning paradigm and the automatic construction of distant-supervision data without human annotation.
Experiments on GSM8K and MATH demonstrate that our enhanced model significantly boosts task-solving performance.
arXiv Detail & Related papers (2024-08-29T08:02:09Z)
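A critic loop of this kind can be illustrated generically: generate a step-by-step solution, have the model critique it, and refine while the critique finds a flaw. This sketch shows only the inference-time pattern, not Critic-CoT's training procedure (which finetunes on automatically constructed distant-supervision data); `ask_llm` is again an assumed helper.

```python
# Illustrative critique-and-refine loop in the spirit of a CoT critic:
# solve step by step, critique the solution, and revise until the critic
# accepts it or the iteration budget runs out. Not the paper's exact method.
from typing import Callable

def solve_with_critic(problem: str, ask_llm: Callable[[str], str],
                      max_iters: int = 3) -> str:
    solution = ask_llm(f"Solve step by step:\n{problem}")
    for _ in range(max_iters):
        critique = ask_llm(
            f"Problem: {problem}\nProposed solution:\n{solution}\n"
            "Check each step. Reply 'CORRECT' or describe the first error."
        )
        if critique.strip().startswith("CORRECT"):
            break  # the critic accepts the reasoning chain
        solution = ask_llm(
            f"Problem: {problem}\nFlawed solution:\n{solution}\n"
            f"Critique: {critique}\nProduce a corrected step-by-step solution."
        )
    return solution
```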
- Cognitive LLMs: Towards Integrating Cognitive Architectures and Large Language Models for Manufacturing Decision-making [51.737762570776006]
LLM-ACTR is a novel neuro-symbolic architecture that provides human-aligned and versatile decision-making.
Our framework extracts and embeds knowledge of ACT-R's internal decision-making process as latent neural representations.
Our experiments on novel Design for Manufacturing tasks show both improved task performance and improved grounded decision-making capability.
arXiv Detail & Related papers (2024-08-17T11:49:53Z)
- Predicting challenge moments from students' discourse: A comparison of GPT-4 to two traditional natural language processing approaches [0.3826704341650507]
This study investigates the potential of three distinct natural language processing models: an expert-knowledge rule-based model, a supervised machine learning (ML) model, and a large language model (LLM).
The results show that the supervised ML and LLM approaches performed considerably well in both tasks.
arXiv Detail & Related papers (2024-01-03T11:54:30Z)
- Re-Reading Improves Reasoning in Large Language Models [87.46256176508376]
We introduce a simple, yet general and effective prompting method, Re2, to enhance the reasoning capabilities of off-the-shelf Large Language Models (LLMs).
Unlike most thought-eliciting prompting methods, such as Chain-of-Thought (CoT), Re2 shifts the focus to the input by processing questions twice, thereby enhancing the understanding process.
We evaluate Re2 on extensive reasoning benchmarks across 14 datasets, spanning 112 experiments, to validate its effectiveness and generality.
arXiv Detail & Related papers (2023-09-12T14:36:23Z)
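Re2's mechanism is simple enough to show as a prompt template: present the question, then present it again before eliciting the answer. Because Re2 works on the input side, it can be combined with answer-side triggers such as CoT. The exact wording used in the paper may differ; treat this template as illustrative.

```python
# Minimal Re2-style prompt builder: the question appears twice so the model
# "re-reads" the input before reasoning. The paper's exact phrasing may
# differ; this is an illustration only.
def build_re2_prompt(question: str, use_cot: bool = True) -> str:
    prompt = (
        f"Q: {question}\n"
        f"Read the question again: {question}\n"
    )
    # Re2 targets the input side, so it composes with answer-side methods
    # such as Chain-of-Thought.
    prompt += "A: Let's think step by step." if use_cot else "A:"
    return prompt

print(build_re2_prompt(
    "If a train travels 60 km in 45 minutes, what is its average speed in km/h?"))
```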
- Improving Open Information Extraction with Large Language Models: A Study on Demonstration Uncertainty [52.72790059506241]
The Open Information Extraction (OIE) task aims at extracting structured facts from unstructured text.
Despite the potential of large language models (LLMs) like ChatGPT as a general task solver, they lag behind state-of-the-art (supervised) methods in OIE tasks.
arXiv Detail & Related papers (2023-09-07T01:35:24Z)