Related papers: Next-Gen CAPTCHAs: Leveraging the Cognitive Gap for Scalable and Diverse GUI-Agent Defense

Next-Gen CAPTCHAs: Leveraging the Cognitive Gap for Scalable and Diverse GUI-Agent Defense

URL: http://arxiv.org/abs/2602.09012v1
Date: Mon, 09 Feb 2026 18:55:33 GMT
Title: Next-Gen CAPTCHAs: Leveraging the Cognitive Gap for Scalable and Diverse GUI-Agent Defense
Authors: Jiacheng Liu, Yaxin Luo, Jiacheng Cui, Xinyi Shang, Xiaohan Zhao, Zhiqiang Shen,
Abstract summary: We introduce Next-Gen CAPTCHAs, a scalable defense framework to secure the next-generation web against advanced agents.<n>Unlike static datasets, our benchmark is built upon a robust data generation pipeline.<n>We exploit the persistent human-agent "Cognitive Gap" in interactive perception, memory, decision-making, and action.
Score: 39.68941971572086
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The rapid evolution of GUI-enabled agents has rendered traditional CAPTCHAs obsolete. While previous benchmarks like OpenCaptchaWorld established a baseline for evaluating multimodal agents, recent advancements in reasoning-heavy models, such as Gemini3-Pro-High and GPT-5.2-Xhigh have effectively collapsed this security barrier, achieving pass rates as high as 90% on complex logic puzzles like "Bingo". In response, we introduce Next-Gen CAPTCHAs, a scalable defense framework designed to secure the next-generation web against the advanced agents. Unlike static datasets, our benchmark is built upon a robust data generation pipeline, allowing for large-scale and easily scalable evaluations, notably, for backend-supported types, our system is capable of generating effectively unbounded CAPTCHA instances. We exploit the persistent human-agent "Cognitive Gap" in interactive perception, memory, decision-making, and action. By engineering dynamic tasks that require adaptive intuition rather than granular planning, we re-establish a robust distinction between biological users and artificial agents, offering a scalable and diverse defense mechanism for the agentic era.

Related papers

AI-NativeBench: An Open-Source White-Box Agentic Benchmark Suite for AI-Native Systems [52.65695508605237]
We introduce AI-NativeBench, the first application-centric and white-box AI-Native benchmark suite grounded in Model Context Protocol (MCP) and Agent-to-Agent (A2A) standards.<n>By treating agentic spans as first-class citizens within distributed traces, our methodology enables granular analysis of engineering characteristics beyond simple capabilities.<n>This work provides the first systematic evidence to guide the transition from measuring model capability to engineering reliable AI-Native systems.
arXiv Detail & Related papers (2026-01-14T11:32:07Z)
Towards Efficient Agents: A Co-Design of Inference Architecture and System [66.59916327634639]
This paper presents AgentInfer, a unified framework for end-to-end agent acceleration.<n>We decompose the problem into four synergistic components: AgentCollab, AgentSched, AgentSAM, and AgentCompress.<n>Experiments on the BrowseComp-zh and DeepDiver benchmarks demonstrate that through the synergistic collaboration of these methods, AgentInfer reduces ineffective token consumption by over 50%.
arXiv Detail & Related papers (2025-12-20T12:06:13Z)
SCOPE: Prompt Evolution for Enhancing Agent Effectiveness [53.75986399936395]
Large Language Model (LLM) agents are increasingly deployed in environments that generate massive, dynamic contexts.<n>While agents have access to this context, their static prompts lack the mechanisms to manage it effectively.<n>We introduce textbfSCOPE (Self-evolving Context Optimization via Prompt Evolution)<n>We propose a Dual-Stream mechanism that balances tactical specificity (resolving immediate errors) with strategic generality (evolving long-term principles)
arXiv Detail & Related papers (2025-12-17T12:25:05Z)
The Evolution of Agentic AI in Cybersecurity: From Single LLM Reasoners to Multi-Agent Systems and Autonomous Pipelines [0.0]
Cybersecurity has become one of the earliest adopters of agentic AI.<n>This survey presents a five-generation taxonomy of agentic AI in cybersecurity.
arXiv Detail & Related papers (2025-12-07T05:10:16Z)
Chameleon: Adaptive Adversarial Agents for Scaling-Based Visual Prompt Injection in Multimodal AI Systems [0.0]
We propose a novel, adaptive adversarial framework designed to expose and exploit scaling vulnerabilities in production Vision-Language Models (VLMs)<n>Our experiments demonstrate that Chameleon achieves an Attack Success Rate (ASR) of 84.5% across varying scaling factors.<n>We show that these attacks effectively compromise agentic pipelines, reducing decision-making accuracy by over 45% in multi-step tasks.
arXiv Detail & Related papers (2025-12-04T15:22:28Z)
RoBCtrl: Attacking GNN-Based Social Bot Detectors via Reinforced Manipulation of Bots Control Interaction [51.46634975923564]
This paper proposes the first adversarial multi-agent Reinforcement learning framework for social Bot control attacks (RoBCtrl)<n> Specifically, we use a diffusion model to generate high-fidelity bot accounts by reconstructing existing account data with minor modifications.<n>We then employ a Multi-Agent Reinforcement Learning (MARL) method to simulate bots adversarial behavior.
arXiv Detail & Related papers (2025-10-16T02:41:49Z)
Spatial CAPTCHA: Generatively Benchmarking Spatial Reasoning for Human-Machine Differentiation [15.668734718800065]
We present a novel human-verification framework that leverages fundamental differences in spatial reasoning between humans and MLLMs.<n>Unlike existing CAPTCHAs which rely on low-level perception tasks that are vulnerable to modern AI, Spatial CAPTCHA generates dynamic questions requiring geometric reasoning, perspective-taking, and mental rotation.<n> Evaluation on a corresponding benchmark, Spatial-CAPTCHA-Bench, demonstrates that humans vastly outperform 10 state-of-the-art MLLMs, with the best model achieving only 31.0% Pass@1 accuracy.
arXiv Detail & Related papers (2025-10-04T16:19:21Z)
A Hybrid CAPTCHA Combining Generative AI with Keystroke Dynamics for Enhanced Bot Detection [0.0]
This paper introduces a novel hybrid CAPTCHA system that synergizes the cognitive challenges posed by Large Language Models (LLMs) with the behavioral biometric analysis of keystroke dynamics.<n>Our approach generates dynamic, unpredictable questions that are trivial for humans but non-trivial for automated agents, while simultaneously analyzing the user's typing rhythm to distinguish human patterns from robotic input.
arXiv Detail & Related papers (2025-09-29T17:56:13Z)
WebCoT: Enhancing Web Agent Reasoning by Reconstructing Chain-of-Thought in Reflection, Branching, and Rollback [78.55946306325914]
We identify key reasoning skills essential for effective web agents.<n>We reconstruct the agent's reasoning algorithms into chain-of-thought rationales.<n>Our approach yields significant improvements across multiple benchmarks.
arXiv Detail & Related papers (2025-05-26T14:03:37Z)
BounTCHA: A CAPTCHA Utilizing Boundary Identification in Guided Generative AI-extended Videos [4.873950690073118]
Bots have increasingly been able to bypass most existing CAPTCHA systems, posing significant security threats to web applications.<n>We design and implement BounTCHA, a CAPTCHA mechanism that leverages human perception of boundaries in video transitions and disruptions.<n>We develop a prototype and conduct experiments to collect data on humans' time biases in boundary identification.
arXiv Detail & Related papers (2025-01-30T18:38:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.