Spatial CAPTCHA: Generatively Benchmarking Spatial Reasoning for Human-Machine Differentiation
- URL: http://arxiv.org/abs/2510.03863v1
- Date: Sat, 04 Oct 2025 16:19:21 GMT
- Title: Spatial CAPTCHA: Generatively Benchmarking Spatial Reasoning for Human-Machine Differentiation
- Authors: Arina Kharlamova, Bowei He, Chen Ma, Xue Liu
- Abstract summary: We present a novel human-verification framework that leverages fundamental differences in spatial reasoning between humans and MLLMs. Unlike existing CAPTCHAs, which rely on low-level perception tasks that are vulnerable to modern AI, Spatial CAPTCHA generates dynamic questions requiring geometric reasoning, perspective-taking, and mental rotation. Evaluation on a corresponding benchmark, Spatial-CAPTCHA-Bench, demonstrates that humans vastly outperform 10 state-of-the-art MLLMs, with the best model achieving only 31.0% Pass@1 accuracy.
- Score: 15.668734718800065
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Online services rely on CAPTCHAs as a first line of defense against automated abuse, yet recent advances in multi-modal large language models (MLLMs) have eroded the effectiveness of conventional designs that focus on text recognition or 2D image understanding. To address this challenge, we present Spatial CAPTCHA, a novel human-verification framework that leverages fundamental differences in spatial reasoning between humans and MLLMs. Unlike existing CAPTCHAs which rely on low-level perception tasks that are vulnerable to modern AI, Spatial CAPTCHA generates dynamic questions requiring geometric reasoning, perspective-taking, occlusion handling, and mental rotation. These skills are intuitive for humans but difficult for state-of-the-art (SOTA) AI systems. The system employs a procedural generation pipeline with constraint-based difficulty control, automated correctness verification, and human-in-the-loop validation to ensure scalability, robustness, and adaptability. Evaluation on a corresponding benchmark, Spatial-CAPTCHA-Bench, demonstrates that humans vastly outperform 10 state-of-the-art MLLMs, with the best model achieving only 31.0% Pass@1 accuracy. Furthermore, we compare Spatial CAPTCHA with Google reCAPTCHA, which confirms its effectiveness as both a security mechanism and a diagnostic tool for spatial reasoning in AI.
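The abstract mentions procedural generation, difficulty control, and automated correctness verification without giving details. The following is a minimal sketch of what such a loop could look like for a mental-rotation question on 2D grid shapes. It is not the authors' pipeline: the shape representation, the function names (`random_shape`, `generate_question`, `verify`), and the use of cell count as the difficulty knob are all assumptions made for illustration.

```python
import random

def normalize(cells):
    """Translate a set of (x, y) cells so the minimum x and y are both 0."""
    min_x = min(x for x, _ in cells)
    min_y = min(y for _, y in cells)
    return frozenset((x - min_x, y - min_y) for x, y in cells)

def rotate90(cells):
    """Rotate by 90 degrees counter-clockwise, then re-normalize."""
    return normalize({(-y, x) for x, y in cells})

def mirror(cells):
    """Reflect across the vertical axis, then re-normalize."""
    return normalize({(-x, y) for x, y in cells})

def rotations(cells):
    """Return the distinct rotations of a shape (between 1 and 4 of them)."""
    out, cur = [], normalize(cells)
    for _ in range(4):
        if cur not in out:
            out.append(cur)
        cur = rotate90(cur)
    return out

def random_shape(n_cells, rng):
    """Grow a connected grid shape; n_cells is the (assumed) difficulty knob."""
    cells = {(0, 0)}
    while len(cells) < n_cells:
        x, y = rng.choice(sorted(cells))
        dx, dy = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        cells.add((x + dx, y + dy))
    return normalize(cells)

def generate_question(n_cells, rng, n_options=4, max_tries=1000):
    """Build one question: which option is a rotation of the reference shape?"""
    for _ in range(max_tries):
        shape = random_shape(n_cells, rng)
        distractors = rotations(mirror(shape))
        # Constraint: the shape must be chiral, so no mirrored copy is also a
        # valid rotation, and there must be enough distinct distractors.
        if mirror(shape) in rotations(shape) or len(distractors) < n_options - 1:
            continue
        correct = rng.choice(rotations(shape))
        options = rng.sample(distractors, n_options - 1) + [correct]
        rng.shuffle(options)
        return {"shape": shape, "options": options, "answer": options.index(correct)}
    raise ValueError("no valid question found for this difficulty")

def verify(question):
    """Automated check: exactly one option is a rotation, and it is the answer."""
    valid = rotations(question["shape"])
    hits = [i for i, opt in enumerate(question["options"]) if opt in valid]
    return hits == [question["answer"]]

if __name__ == "__main__":
    q = generate_question(n_cells=6, rng=random.Random(0))
    print("answer index:", q["answer"], "verified:", verify(q))
```

Rejecting non-chiral shapes is the constraint that keeps each question unambiguous: mirrored copies can only serve as distractors when no mirror image coincides with a rotation of the reference shape.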
Related papers
- COGNITION: From Evaluation to Defense against Multimodal LLM CAPTCHA Solvers [17.70082722524941]
Multimodal large language models (MLLMs) undermine the security guarantees of visual CAPTCHA. We evaluate 7 leading commercial and open-source MLLMs across 18 real-world CAPTCHA task types. We reveal that MLLMs can reliably solve recognition-oriented and low-interaction CAPTCHA tasks at human-like cost and latency.
arXiv Detail & Related papers (2025-12-02T01:23:10Z) - A Hybrid CAPTCHA Combining Generative AI with Keystroke Dynamics for Enhanced Bot Detection [0.0]
This paper introduces a novel hybrid CAPTCHA system that synergizes the cognitive challenges posed by Large Language Models (LLMs) with the behavioral biometric analysis of keystroke dynamics. Our approach generates dynamic, unpredictable questions that are trivial for humans but non-trivial for automated agents, while simultaneously analyzing the user's typing rhythm to distinguish human patterns from robotic input.
arXiv Detail & Related papers (2025-09-29T17:56:13Z) - Explainable AI for Collaborative Assessment of 2D/3D Registration Quality [50.65650507103078]
We propose the first artificial intelligence framework trained specifically for 2D/3D registration quality verification. Our explainable AI (XAI) approach aims to enhance informed decision-making for human operators.
arXiv Detail & Related papers (2025-07-23T15:28:57Z) - Defensive Adversarial CAPTCHA: A Semantics-Driven Framework for Natural Adversarial Example Generation [48.60492738839292]
Traditional CAPTCHA schemes are increasingly vulnerable to automated attacks powered by deep neural networks (DNNs). We propose the Unsourced Adversarial CAPTCHA (DAC), a novel framework that generates high-specified adversarial examples. In untargeted attacks, especially for black-box scenarios, we introduce bi-path unsourced adversarial CAPTCHA (BP-DAC).
arXiv Detail & Related papers (2025-06-12T13:30:01Z) - Perceptual Quality Assessment for Embodied AI [66.96928199019129]
Embodied AI has developed rapidly in recent years, but it is still mainly deployed in laboratories. There is no IQA method to assess the usability of an image in embodied tasks, namely, the perceptual quality for robots.
arXiv Detail & Related papers (2025-05-22T15:51:07Z) - IllusionCAPTCHA: A CAPTCHA based on Visual Illusion [14.043017273813227]
We present IllusionCAPTCHA, a novel security mechanism employing the "Human-Easy but AI-Hard" paradigm. Results from our user study indicate that 86.95% of participants successfully passed the CAPTCHA on their first attempt, outperforming other CAPTCHA systems.
arXiv Detail & Related papers (2025-02-08T06:03:03Z) - Oedipus: LLM-enchanced Reasoning CAPTCHA Solver [17.074422329618212]
Oedipus is an innovative end-to-end framework for automated reasoning CAPTCHA solving.
Central to this framework is a novel strategy that dissects the complex and human-easy-AI-hard tasks into a sequence of simpler and AI-easy steps.
Our evaluation shows that Oedipus effectively resolves the studied CAPTCHAs, achieving an average success rate of 63.5%.
arXiv Detail & Related papers (2024-05-13T06:32:57Z) - A Survey of Adversarial CAPTCHAs on its History, Classification and Generation [69.36242543069123]
We extend the definition of adversarial CAPTCHAs and propose a classification method for adversarial CAPTCHAs.
Also, we analyze some defense methods that can be used to defend adversarial CAPTCHAs, indicating potential threats to adversarial CAPTCHAs.
arXiv Detail & Related papers (2023-11-22T08:44:58Z) - Task-Specific Normalization for Continual Learning of Blind Image Quality Models [105.03239956378465]
We present a simple yet effective continual learning method for blind image quality assessment (BIQA).
The key step in our approach is to freeze all convolution filters of a pre-trained deep neural network (DNN) for an explicit promise of stability.
We assign each new IQA dataset (i.e., task) a prediction head, and load the corresponding normalization parameters to produce a quality score.
The final quality estimate is computed by a weighted summation of predictions from all heads with a lightweight $K$-means gating mechanism.
arXiv Detail & Related papers (2021-07-28T15:21:01Z) - Robust Text CAPTCHAs Using Adversarial Examples [129.29523847765952]
We propose a user-friendly text-based CAPTCHA generation method named Robust Text CAPTCHA (RTC).
At the first stage, the foregrounds and backgrounds are constructed with randomly sampled font and background images.
At the second stage, we apply a highly transferable adversarial attack for text CAPTCHAs to better obstruct CAPTCHA solvers.
arXiv Detail & Related papers (2021-01-07T11:03:07Z) - Deep-CAPTCHA: a deep learning based CAPTCHA solver for vulnerability assessment [1.027974860479791]
This research investigates the weaknesses and vulnerabilities of the CAPTCHA generator systems.
We develop a Convolutional Neural Network called Deep-CAPTCHA to achieve this goal.
Our network achieves cracking accuracies of 98.94% and 98.31% on the numerical and the alpha-numerical test datasets, respectively.
arXiv Detail & Related papers (2020-06-15T11:44:43Z) - Adversarial vs behavioural-based defensive AI with joint, continual and active learning: automated evaluation of robustness to deception, poisoning and concept drift [62.997667081978825]
Recent advancements in Artificial Intelligence (AI) have brought new capabilities to behavioural analysis (UEBA) for cyber-security.
In this paper, we present a solution to effectively mitigate this attack by improving the detection process and efficiently leveraging human expertise.
arXiv Detail & Related papers (2020-01-13T13:54:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.