Next-Gen CAPTCHAs: Leveraging the Cognitive Gap for Scalable and Diverse GUI-Agent Defense
- URL: http://arxiv.org/abs/2602.09012v1
- Date: Mon, 09 Feb 2026 18:55:33 GMT
- Title: Next-Gen CAPTCHAs: Leveraging the Cognitive Gap for Scalable and Diverse GUI-Agent Defense
- Authors: Jiacheng Liu, Yaxin Luo, Jiacheng Cui, Xinyi Shang, Xiaohan Zhao, Zhiqiang Shen
- Abstract summary: We introduce Next-Gen CAPTCHAs, a scalable defense framework to secure the next-generation web against advanced agents. Unlike static datasets, our benchmark is built upon a robust data generation pipeline. We exploit the persistent human-agent "Cognitive Gap" in interactive perception, memory, decision-making, and action.
- Score: 39.68941971572086
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rapid evolution of GUI-enabled agents has rendered traditional CAPTCHAs obsolete. While previous benchmarks like OpenCaptchaWorld established a baseline for evaluating multimodal agents, recent advancements in reasoning-heavy models, such as Gemini3-Pro-High and GPT-5.2-Xhigh, have effectively collapsed this security barrier, achieving pass rates as high as 90% on complex logic puzzles like "Bingo". In response, we introduce Next-Gen CAPTCHAs, a scalable defense framework designed to secure the next-generation web against advanced agents. Unlike static datasets, our benchmark is built upon a robust data generation pipeline, allowing for large-scale and easily scalable evaluations. Notably, for backend-supported types, our system is capable of generating effectively unbounded CAPTCHA instances. We exploit the persistent human-agent "Cognitive Gap" in interactive perception, memory, decision-making, and action. By engineering dynamic tasks that require adaptive intuition rather than granular planning, we re-establish a robust distinction between biological users and artificial agents, offering a scalable and diverse defense mechanism for the agentic era.
Related papers
- AI-NativeBench: An Open-Source White-Box Agentic Benchmark Suite for AI-Native Systems [52.65695508605237]
We introduce AI-NativeBench, the first application-centric and white-box AI-Native benchmark suite grounded in Model Context Protocol (MCP) and Agent-to-Agent (A2A) standards. By treating agentic spans as first-class citizens within distributed traces, our methodology enables granular analysis of engineering characteristics beyond simple capabilities. This work provides the first systematic evidence to guide the transition from measuring model capability to engineering reliable AI-Native systems.
arXiv Detail & Related papers (2026-01-14T11:32:07Z)
- Towards Efficient Agents: A Co-Design of Inference Architecture and System [66.59916327634639]
This paper presents AgentInfer, a unified framework for end-to-end agent acceleration. We decompose the problem into four synergistic components: AgentCollab, AgentSched, AgentSAM, and AgentCompress. Experiments on the BrowseComp-zh and DeepDiver benchmarks demonstrate that through the synergistic collaboration of these methods, AgentInfer reduces ineffective token consumption by over 50%.
arXiv Detail & Related papers (2025-12-20T12:06:13Z)
- SCOPE: Prompt Evolution for Enhancing Agent Effectiveness [53.75986399936395]
Large Language Model (LLM) agents are increasingly deployed in environments that generate massive, dynamic contexts. While agents have access to this context, their static prompts lack the mechanisms to manage it effectively. We introduce SCOPE (Self-evolving Context Optimization via Prompt Evolution). We propose a Dual-Stream mechanism that balances tactical specificity (resolving immediate errors) with strategic generality (evolving long-term principles).
arXiv Detail & Related papers (2025-12-17T12:25:05Z)
- The Evolution of Agentic AI in Cybersecurity: From Single LLM Reasoners to Multi-Agent Systems and Autonomous Pipelines [0.0]
Cybersecurity has become one of the earliest adopters of agentic AI. This survey presents a five-generation taxonomy of agentic AI in cybersecurity.
arXiv Detail & Related papers (2025-12-07T05:10:16Z)
- Chameleon: Adaptive Adversarial Agents for Scaling-Based Visual Prompt Injection in Multimodal AI Systems [0.0]
We propose a novel, adaptive adversarial framework designed to expose and exploit scaling vulnerabilities in production Vision-Language Models (VLMs). Our experiments demonstrate that Chameleon achieves an Attack Success Rate (ASR) of 84.5% across varying scaling factors. We show that these attacks effectively compromise agentic pipelines, reducing decision-making accuracy by over 45% in multi-step tasks.
arXiv Detail & Related papers (2025-12-04T15:22:28Z)
- RoBCtrl: Attacking GNN-Based Social Bot Detectors via Reinforced Manipulation of Bots Control Interaction [51.46634975923564]
This paper proposes the first adversarial multi-agent Reinforcement learning framework for social Bot control attacks (RoBCtrl). Specifically, we use a diffusion model to generate high-fidelity bot accounts by reconstructing existing account data with minor modifications. We then employ a Multi-Agent Reinforcement Learning (MARL) method to simulate bots' adversarial behavior.
arXiv Detail & Related papers (2025-10-16T02:41:49Z)
- Spatial CAPTCHA: Generatively Benchmarking Spatial Reasoning for Human-Machine Differentiation [15.668734718800065]
We present a novel human-verification framework that leverages fundamental differences in spatial reasoning between humans and MLLMs. Unlike existing CAPTCHAs, which rely on low-level perception tasks that are vulnerable to modern AI, Spatial CAPTCHA generates dynamic questions requiring geometric reasoning, perspective-taking, and mental rotation. Evaluation on a corresponding benchmark, Spatial-CAPTCHA-Bench, demonstrates that humans vastly outperform 10 state-of-the-art MLLMs, with the best model achieving only 31.0% Pass@1 accuracy.
arXiv Detail & Related papers (2025-10-04T16:19:21Z)
- A Hybrid CAPTCHA Combining Generative AI with Keystroke Dynamics for Enhanced Bot Detection [0.0]
This paper introduces a novel hybrid CAPTCHA system that synergizes the cognitive challenges posed by Large Language Models (LLMs) with the behavioral biometric analysis of keystroke dynamics. Our approach generates dynamic, unpredictable questions that are trivial for humans but non-trivial for automated agents, while simultaneously analyzing the user's typing rhythm to distinguish human patterns from robotic input.
arXiv Detail & Related papers (2025-09-29T17:56:13Z)
- WebCoT: Enhancing Web Agent Reasoning by Reconstructing Chain-of-Thought in Reflection, Branching, and Rollback [78.55946306325914]
We identify key reasoning skills essential for effective web agents. We reconstruct the agent's reasoning algorithms into chain-of-thought rationales. Our approach yields significant improvements across multiple benchmarks.
arXiv Detail & Related papers (2025-05-26T14:03:37Z)
- BounTCHA: A CAPTCHA Utilizing Boundary Identification in Guided Generative AI-extended Videos [4.873950690073118]
Bots have increasingly been able to bypass most existing CAPTCHA systems, posing significant security threats to web applications. We design and implement BounTCHA, a CAPTCHA mechanism that leverages human perception of boundaries in video transitions and disruptions. We develop a prototype and conduct experiments to collect data on humans' time biases in boundary identification.
arXiv Detail & Related papers (2025-01-30T18:38:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.