Automating Security Audit Using Large Language Model based Agent: An Exploration Experiment
- URL: http://arxiv.org/abs/2505.10732v1
- Date: Thu, 15 May 2025 22:22:52 GMT
- Title: Automating Security Audit Using Large Language Model based Agent: An Exploration Experiment
- Authors: Jia Hui Chin, Pu Zhang, Yu Xin Cheong, Jonathan Pan,
- Abstract summary: Security audits help to maintain a strong security posture by ensuring that policies are in place, controls are implemented, gaps are identified for cybersecurity risks mitigation.<n>This paper looks at the possibility of developing a framework to leverage Large Language Models (LLMs) as an autonomous agent to execute part of the security audit.
- Score: 7.039114328812774
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the current rapidly changing digital environment, businesses are under constant stress to ensure that their systems are secured. Security audits help to maintain a strong security posture by ensuring that policies are in place, controls are implemented, gaps are identified for cybersecurity risks mitigation. However, audits are usually manual, requiring much time and costs. This paper looks at the possibility of developing a framework to leverage Large Language Models (LLMs) as an autonomous agent to execute part of the security audit, namely with the field audit. password policy compliance for Windows operating system. Through the conduct of an exploration experiment of using GPT-4 with Langchain, the agent executed the audit tasks by accurately flagging password policy violations and appeared to be more efficient than traditional manual audits. Despite its potential limitations in operational consistency in complex and dynamic environment, the framework suggests possibilities to extend further to real-time threat monitoring and compliance checks.
Related papers
- Formal Verification of Neural Certificates Done Dynamically [7.146556437126553]
We propose a lightweight runtime monitoring framework that integrates real-time verification and does not require access to the underlying control policy.<n>Our approach enables timely detection of safety violations and incorrect certificates with minimal overhead.
arXiv Detail & Related papers (2025-07-16T07:37:23Z) - OpenAgentSafety: A Comprehensive Framework for Evaluating Real-World AI Agent Safety [58.201189860217724]
We introduce OpenAgentSafety, a comprehensive framework for evaluating agent behavior across eight critical risk categories.<n>Unlike prior work, our framework evaluates agents that interact with real tools, including web browsers, code execution environments, file systems, bash shells, and messaging platforms.<n>It combines rule-based analysis with LLM-as-judge assessments to detect both overt and subtle unsafe behaviors.
arXiv Detail & Related papers (2025-07-08T16:18:54Z) - SafeMobile: Chain-level Jailbreak Detection and Automated Evaluation for Multimodal Mobile Agents [58.21223208538351]
This work explores the security issues surrounding mobile multimodal agents.<n>It attempts to construct a risk discrimination mechanism by incorporating behavioral sequence information.<n>It also designs an automated assisted assessment scheme based on a large language model.
arXiv Detail & Related papers (2025-07-01T15:10:00Z) - Disclosure Audits for LLM Agents [44.27620230177312]
Large Language Model agents have begun to appear as personal assistants, customer service bots, and clinical aides.<n>This study proposes an auditing framework for conversational privacy that quantifies and audits these risks.
arXiv Detail & Related papers (2025-06-11T20:47:37Z) - LLM Agents Should Employ Security Principles [60.03651084139836]
This paper argues that the well-established design principles in information security should be employed when deploying Large Language Model (LLM) agents at scale.<n>We introduce AgentSandbox, a conceptual framework embedding these security principles to provide safeguards throughout an agent's life-cycle.
arXiv Detail & Related papers (2025-05-29T21:39:08Z) - Exposing the Ghost in the Transformer: Abnormal Detection for Large Language Models via Hidden State Forensics [5.384257830522198]
Large Language Models (LLMs) in critical applications have introduced severe reliability and security risks.<n>These vulnerabilities have been weaponized by malicious actors, leading to unauthorized access, widespread misinformation, and compromised system integrity.<n>We introduce a novel approach to detecting abnormal behaviors in LLMs via hidden state forensics.
arXiv Detail & Related papers (2025-04-01T05:58:14Z) - Towards Trustworthy GUI Agents: A Survey [64.6445117343499]
This survey examines the trustworthiness of GUI agents in five critical dimensions.<n>We identify major challenges such as vulnerability to adversarial attacks, cascading failure modes in sequential decision-making.<n>As GUI agents become more widespread, establishing robust safety standards and responsible development practices is essential.
arXiv Detail & Related papers (2025-03-30T13:26:00Z) - AgentOrca: A Dual-System Framework to Evaluate Language Agents on Operational Routine and Constraint Adherence [54.317522790545304]
We present AgentOrca, a dual-system framework for evaluating language agents' compliance with operational constraints and routines.<n>Our framework encodes action constraints and routines through both natural language prompts for agents and corresponding executable code serving as ground truth for automated verification.<n>Our findings reveal notable performance gaps among state-of-the-art models, with large reasoning models like o1 demonstrating superior compliance while others show significantly lower performance.
arXiv Detail & Related papers (2025-03-11T17:53:02Z) - AgentGuard: Repurposing Agentic Orchestrator for Safety Evaluation of Tool Orchestration [0.3222802562733787]
AgentGuard is a framework to autonomously discover and validate unsafe tool-use.<n>It generates safety constraints to confine the behaviors of agents, achieving the baseline of safety guarantee.<n>The framework operates through four phases: identifying unsafe, validating them in real-world execution, generating safety constraints, and validating constraint efficacy.
arXiv Detail & Related papers (2025-02-13T23:00:33Z) - Autonomous Identity-Based Threat Segmentation in Zero Trust Architectures [4.169915659794567]
Zero Trust Architectures (ZTA) fundamentally redefine network security by adopting a "trust nothing, verify everything" approach.<n>This research applies the proposed AI-driven, autonomous, identity-based threat segmentation in ZTA.
arXiv Detail & Related papers (2025-01-10T15:35:02Z) - A Formal Model of Security Controls' Capabilities and Its Applications to Policy Refinement and Incident Management [0.2621730497733947]
This paper presents the Security Capability Model (SCM), a formal model that abstracts the features that security controls offer for enforcing security policies.<n>By validating its effectiveness in real-world scenarios, we show that SCM enables the automation of different and complex security tasks.
arXiv Detail & Related papers (2024-05-06T15:06:56Z) - A Survey and Comparative Analysis of Security Properties of CAN Authentication Protocols [92.81385447582882]
The Controller Area Network (CAN) bus leaves in-vehicle communications inherently non-secure.
This paper reviews and compares the 15 most prominent authentication protocols for the CAN bus.
We evaluate protocols based on essential operational criteria that contribute to ease of implementation.
arXiv Detail & Related papers (2024-01-19T14:52:04Z) - Value Functions are Control Barrier Functions: Verification of Safe
Policies using Control Theory [46.85103495283037]
We propose a new approach to apply verification methods from control theory to learned value functions.
We formalize original theorems that establish links between value functions and control barrier functions.
Our work marks a significant step towards a formal framework for the general, scalable, and verifiable design of RL-based control systems.
arXiv Detail & Related papers (2023-06-06T21:41:31Z) - Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers.
We then present the pointwise feasibility conditions of the resulting safety controller.
We use these conditions to devise an event-triggered online data collection strategy.
arXiv Detail & Related papers (2022-08-23T05:02:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.