Evaluating AI cyber capabilities with crowdsourced elicitation
- URL: http://arxiv.org/abs/2505.19915v2
- Date: Tue, 27 May 2025 17:45:40 GMT
- Title: Evaluating AI cyber capabilities with crowdsourced elicitation
- Authors: Artem Petrov, Dmitrii Volkov
- Abstract summary: We propose elicitation bounties as a practical mechanism for maintaining timely, cost-effective situational awareness of emerging AI capabilities. Applying METR's methodology, we found that AI agents can reliably solve cyber challenges requiring one hour or less of effort from a median human CTF participant.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As AI systems become increasingly capable, understanding their offensive cyber potential is critical for informed governance and responsible deployment. However, it's hard to accurately bound their capabilities, and some prior evaluations dramatically underestimated them. The art of extracting maximum task-specific performance from AIs is called "AI elicitation", and today's safety organizations typically conduct it in-house. In this paper, we explore crowdsourcing elicitation efforts as an alternative to in-house elicitation work. We host open-access AI tracks at two Capture The Flag (CTF) competitions: AI vs. Humans (400 teams) and Cyber Apocalypse (8000 teams). The AI teams achieve outstanding performance at both events, ranking top-5% and top-10% respectively for a total of $7,500 in bounties. This impressive performance suggests that open-market elicitation may offer an effective complement to in-house elicitation. We propose elicitation bounties as a practical mechanism for maintaining timely, cost-effective situational awareness of emerging AI capabilities. Another advantage of open elicitations is the option to collect human performance data at scale. Applying METR's methodology, we found that AI agents can reliably solve cyber challenges requiring one hour or less of effort from a median human CTF participant.
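The METR-style analysis mentioned in the abstract relates AI success rates to how long tasks take a median human. A minimal sketch of that idea, with an entirely hypothetical `ai_time_horizon` helper and made-up toy data (the paper's actual methodology and numbers are not reproduced here):

```python
# Hypothetical sketch: for each CTF challenge we assume we know the
# median human solve time (minutes) and whether the AI agent solved it.
# We bucket challenges by human time and take the "horizon" to be the
# longest bucket in which the AI success rate stays at or above 50%.
from collections import defaultdict

def ai_time_horizon(results, bucket_minutes=30, threshold=0.5):
    """results: list of (median_human_minutes, ai_solved) tuples."""
    buckets = defaultdict(list)
    for minutes, solved in results:
        buckets[minutes // bucket_minutes].append(solved)
    horizon = 0
    for b in sorted(buckets):
        rate = sum(buckets[b]) / len(buckets[b])
        if rate >= threshold:
            horizon = (b + 1) * bucket_minutes  # upper edge of the bucket
        else:
            break  # stop at the first bucket where the AI falls below threshold
    return horizon

# Toy data (made up): the AI solves sub-hour tasks, fails longer ones,
# loosely mirroring the paper's headline one-hour finding.
toy = [(10, True), (20, True), (45, True), (50, True),
       (70, False), (80, False), (100, False), (130, True)]
print(ai_time_horizon(toy))  # -> 60
```

This is only an illustration of the shape of a time-horizon computation; bucket width, thresholds, and the statistical treatment in the actual METR methodology differ.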
Related papers
- Actionable AI: Enabling Non Experts to Understand and Configure AI Systems [5.534140394498714]
Actionable AI allows non-experts to configure black-box agents. In uncertain conditions, non-experts achieve good levels of performance. We propose Actionable AI as a way to open access to AI-based agents.
arXiv Detail & Related papers (2025-03-09T23:09:04Z) - Superintelligence Strategy: Expert Version [64.7113737051525]
Destabilizing AI developments could raise the odds of great-power conflict. Superintelligence -- AI vastly better than humans at nearly all cognitive tasks -- is now anticipated by AI researchers. We introduce the concept of Mutual Assured AI Malfunction.
arXiv Detail & Related papers (2025-03-07T17:53:24Z) - How Performance Pressure Influences AI-Assisted Decision Making [57.53469908423318]
We show how pressure and explainable AI (XAI) techniques interact with AI advice-taking behavior. Our results show complex interaction effects, with different combinations of pressure and XAI techniques either improving or worsening AI advice-taking behavior.
arXiv Detail & Related papers (2024-10-21T22:39:52Z) - Comparing Zealous and Restrained AI Recommendations in a Real-World Human-AI Collaboration Task [11.040918613968854]
We argue that careful exploitation of the tradeoff between precision and recall can significantly improve team performance.
We analyze the performance of 78 professional annotators working with a) no AI assistance, b) a high-precision "restrained" AI, and c) a high-recall "zealous" AI in over 3,466 person-hours of annotation work.
arXiv Detail & Related papers (2024-10-06T23:19:19Z) - Towards the Terminator Economy: Assessing Job Exposure to AI through LLMs [10.844598404826355]
One-third of U.S. employment is highly exposed to AI, primarily in high-skill jobs requiring a graduate or postgraduate level of education. Even in high-skill occupations, AI exhibits high variability in task substitution, suggesting that AI and humans complement each other within the same occupation. All results, models, and code are freely available online to allow the community to reproduce our results, compare outcomes, and use our work as a benchmark to monitor AI's progress over time.
arXiv Detail & Related papers (2024-07-27T08:14:18Z) - Work-in-Progress: Crash Course: Can (Under Attack) Autonomous Driving Beat Human Drivers? [60.51287814584477]
This paper evaluates the inherent risks in autonomous driving by examining the current landscape of AVs.
We develop specific claims highlighting the delicate balance between the advantages of AVs and potential security challenges in real-world scenarios.
arXiv Detail & Related papers (2024-05-14T09:42:21Z) - Towards an AI-Enhanced Cyber Threat Intelligence Processing Pipeline [0.0]
This paper explores the potential of integrating Artificial Intelligence (AI) into Cyber Threat Intelligence (CTI).
We provide a blueprint of an AI-enhanced CTI processing pipeline, and detail its components and functionalities.
We discuss ethical dilemmas, potential biases, and the imperative for transparency in AI-driven decisions.
arXiv Detail & Related papers (2024-03-05T19:03:56Z) - DanZero+: Dominating the GuanDan Game through Reinforcement Learning [95.90682269990705]
We develop an AI program for an exceptionally complex and popular card game called GuanDan.
We first put forward an AI program named DanZero for this game.
In order to further enhance the AI's capabilities, we apply policy-based reinforcement learning algorithm to GuanDan.
arXiv Detail & Related papers (2023-12-05T08:07:32Z) - Bending the Automation Bias Curve: A Study of Human and AI-based Decision Making in National Security Contexts [0.0]
We theorize about the relationship between background knowledge about AI, trust in AI, and how these interact with other factors to influence the probability of automation bias.
We test these in a preregistered task identification experiment across a representative sample of 9000 adults in 9 countries with varying levels of AI industries.
arXiv Detail & Related papers (2023-06-28T18:57:36Z) - Fairness in AI and Its Long-Term Implications on Society [68.8204255655161]
We take a closer look at AI fairness and analyze how lack of AI fairness can lead to deepening of biases over time.
We discuss how biased models can lead to more negative real-world outcomes for certain groups.
If the issues persist, they could be reinforced by interactions with other risks and have severe implications on society in the form of social unrest.
arXiv Detail & Related papers (2023-04-16T11:22:59Z) - On the Influence of Explainable AI on Automation Bias [0.0]
We aim to shed light on the potential of explainable AI (XAI) to influence automation bias.
We conduct an online experiment with regard to hotel review classifications and discuss first results.
arXiv Detail & Related papers (2022-04-19T12:54:23Z) - Cybertrust: From Explainable to Actionable and Interpretable AI (AI2) [58.981120701284816]
Actionable and Interpretable AI (AI2) will incorporate explicit quantifications and visualizations of user confidence in AI recommendations.
It will allow examining and testing of AI system predictions to establish a basis for trust in the systems' decision making.
arXiv Detail & Related papers (2022-01-26T18:53:09Z) - Trustworthy AI: A Computational Perspective [54.80482955088197]
We focus on six of the most crucial dimensions in achieving trustworthy AI: (i) Safety & Robustness, (ii) Non-discrimination & Fairness, (iii) Explainability, (iv) Privacy, (v) Accountability & Auditability, and (vi) Environmental Well-Being.
For each dimension, we review the recent related technologies according to a taxonomy and summarize their applications in real-world systems.
arXiv Detail & Related papers (2021-07-12T14:21:46Z) - Is the Most Accurate AI the Best Teammate? Optimizing AI for Teamwork [54.309495231017344]
We argue that AI systems should be trained in a human-centered manner, directly optimized for team performance.
We study this proposal for a specific type of human-AI teaming, where the human overseer chooses to either accept the AI recommendation or solve the task themselves.
Our experiments with linear and non-linear models on real-world, high-stakes datasets show that the most accurate AI may not lead to the highest team performance.
arXiv Detail & Related papers (2020-04-27T19:06:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all content) and is not responsible for any consequences.