Related papers: FinVault: Benchmarking Financial Agent Safety in Execution-Grounded Environments

FinVault: Benchmarking Financial Agent Safety in Execution-Grounded Environments

URL: http://arxiv.org/abs/2601.07853v1
Date: Fri, 09 Jan 2026 03:25:45 GMT
Title: FinVault: Benchmarking Financial Agent Safety in Execution-Grounded Environments
Authors: Zhi Yang, Runguo Li, Qiqi Qiang, Jiashun Wang, Fangqi Lou, Mengping Li, Dongpo Cheng, Rui Xu, Heng Lian, Shuo Zhang, Xiaolong Liang, Xiaoming Huang, Zheng Wei, Zhaowei Liu, Xin Guo, Huacan Wang, Ronghao Chen, Liwen Zhang,
Abstract summary: FinVault is the first execution-grounded security benchmark for financial agents.<n>We show that existing defense mechanisms remain ineffective in realistic financial agent settings.
Score: 29.19862712902908
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Financial agents powered by large language models (LLMs) are increasingly deployed for investment analysis, risk assessment, and automated decision-making, where their abilities to plan, invoke tools, and manipulate mutable state introduce new security risks in high-stakes and highly regulated financial environments. However, existing safety evaluations largely focus on language-model-level content compliance or abstract agent settings, failing to capture execution-grounded risks arising from real operational workflows and state-changing actions. To bridge this gap, we propose FinVault, the first execution-grounded security benchmark for financial agents, comprising 31 regulatory case-driven sandbox scenarios with state-writable databases and explicit compliance constraints, together with 107 real-world vulnerabilities and 963 test cases that systematically cover prompt injection, jailbreaking, financially adapted attacks, as well as benign inputs for false-positive evaluation. Experimental results reveal that existing defense mechanisms remain ineffective in realistic financial agent settings, with average attack success rates (ASR) still reaching up to 50.0\% on state-of-the-art models and remaining non-negligible even for the most robust systems (ASR 6.7\%), highlighting the limited transferability of current safety designs and the need for stronger financial-specific defenses. Our code can be found at https://github.com/aifinlab/FinVault.

Related papers

VERAFI: Verified Agentic Financial Intelligence through Neurosymbolic Policy Generation [2.43679682660038]
VERAFI is an agentic framework with neurosymbolic policy generation for verified financial intelligence.<n> VERAFI combines state-of-the-art dense retrieval and cross-encoder reranking with financial tool-enabled agents and automated reasoning policies.
arXiv Detail & Related papers (2025-12-12T17:17:43Z)
PropensityBench: Evaluating Latent Safety Risks in Large Language Models via an Agentic Approach [49.14349403242654]
We present $textbfPropensityBench$, a novel benchmark framework that assesses the proclivity of models to engage in risky behaviors.<n>Our framework includes 5,874 scenarios with 6,648 tools spanning four high-risk domains: cybersecurity, self-proliferation, biosecurity, and chemical security.<n>Across open-source and proprietary frontier models, we uncover 9 alarming signs of propensity: models frequently choose high-risk tools when under pressure.
arXiv Detail & Related papers (2025-11-24T18:46:44Z)
Uncovering the Vulnerability of Large Language Models in the Financial Domain via Risk Concealment [29.36824550283463]
Large Language Models (LLMs) are increasingly integrated into financial applications, yet existing red-teaming research primarily targets harmful content.<n>We introduce Risk-Concealment Attacks (RCA), a novel multi-turn framework that iteratively conceals regulatory risks to provoke seemingly compliant yet regulatory-violating responses.<n>Experiments on FIN-Bench demonstrate that RCA effectively bypasses nine mainstream LLMs, achieving an average attack success rate (ASR) of 93.18%, including 98.28% on GPT-4.1 and 97.56% on OpenAI o1.
arXiv Detail & Related papers (2025-09-07T22:35:15Z)
Security Challenges in AI Agent Deployment: Insights from a Large Scale Public Competition [101.86739402748995]
We run the largest public red-teaming competition to date, targeting 22 frontier AI agents across 44 realistic deployment scenarios.<n>We build the Agent Red Teaming benchmark and evaluate it across 19 state-of-the-art models.<n>Our findings highlight critical and persistent vulnerabilities in today's AI agents.
arXiv Detail & Related papers (2025-07-28T05:13:04Z)
"Blockchain-Enabled Zero Trust Framework for Securing FinTech Ecosystems Against Insider Threats and Cyber Attacks" [0.6437284704257459]
Security mechanisms are insufficient against evolving threats like insider attacks, malware intrusions, and Advanced Persistent Threats (APTs)<n>This paper proposes a blockchain-integrated Zero Trust framework, adhering to the principle of "Never Trust, Always"<n>The framework uses smart contracts to enforce Multi Factor Authentication (MFA), Role-Based Access Control (RBAC), and Just-In-Time (JIT) access privileges.
arXiv Detail & Related papers (2025-07-26T15:21:04Z)
Agentar-Fin-R1: Enhancing Financial Intelligence through Domain Expertise, Training Efficiency, and Advanced Reasoning [12.548390779247987]
We introduce the Agentar-Fin-R1 series of financial large language models.<n>Our optimization approach integrates a high-quality, systematic financial task label system.<n>Our models undergo comprehensive evaluation on mainstream financial benchmarks.
arXiv Detail & Related papers (2025-07-22T17:52:16Z)
OpenAgentSafety: A Comprehensive Framework for Evaluating Real-World AI Agent Safety [58.201189860217724]
We introduce OpenAgentSafety, a comprehensive framework for evaluating agent behavior across eight critical risk categories.<n>Unlike prior work, our framework evaluates agents that interact with real tools, including web browsers, code execution environments, file systems, bash shells, and messaging platforms.<n>It combines rule-based analysis with LLM-as-judge assessments to detect both overt and subtle unsafe behaviors.
arXiv Detail & Related papers (2025-07-08T16:18:54Z)
Standard Benchmarks Fail - Auditing LLM Agents in Finance Must Prioritize Risk [31.43947127076459]
Standard benchmarks fixate on how well large language model (LLM) agents perform in finance, yet say little about whether they are safe to deploy.<n>We argue that accuracy metrics and return-based scores provide an illusion of reliability, overlooking vulnerabilities such as hallucinated facts, stale data, and adversarial prompt manipulation.
arXiv Detail & Related papers (2025-02-21T12:56:15Z)
Agent-SafetyBench: Evaluating the Safety of LLM Agents [72.92604341646691]
We introduce Agent-SafetyBench, a benchmark designed to evaluate the safety of large language models (LLMs)<n>Agent-SafetyBench encompasses 349 interaction environments and 2,000 test cases, evaluating 8 categories of safety risks and covering 10 common failure modes frequently encountered in unsafe interactions.<n>Our evaluation of 16 popular LLM agents reveals a concerning result: none of the agents achieves a safety score above 60%.
arXiv Detail & Related papers (2024-12-19T02:35:15Z)
Identifying the Risks of LM Agents with an LM-Emulated Sandbox [68.26587052548287]
Language Model (LM) agents and tools enable a rich set of capabilities but also amplify potential risks. High cost of testing these agents will make it increasingly difficult to find high-stakes, long-tailed risks. We introduce ToolEmu: a framework that uses an LM to emulate tool execution and enables the testing of LM agents against a diverse range of tools and scenarios.
arXiv Detail & Related papers (2023-09-25T17:08:02Z)
Safety Margins for Reinforcement Learning [53.10194953873209]
We show how to leverage proxy criticality metrics to generate safety margins. We evaluate our approach on learned policies from APE-X and A3C within an Atari environment.
arXiv Detail & Related papers (2023-07-25T16:49:54Z)

This list is automatically generated from the titles and abstracts of the papers in this site.