Aura-CAPTCHA: A Reinforcement Learning and GAN-Enhanced Multi-Modal CAPTCHA System
- URL: http://arxiv.org/abs/2508.14976v1
- Date: Wed, 20 Aug 2025 18:00:08 GMT
- Title: Aura-CAPTCHA: A Reinforcement Learning and GAN-Enhanced Multi-Modal CAPTCHA System
- Authors: Joydeep Chandra, Prabal Manhas, Ramanjot Kaur, Rashi Sahay,
- Abstract summary: Aura-CAPTCHA was developed as a multi-modal CAPTCHA system to address vulnerabilities in traditional methods.<n>The design integrated Generative Adrial Networks (GANs) for generating dynamic image challenges, Reinforcement Learning (RL) for adaptive difficulty tuning, and Large Language Models (LLMs) for creating text and audio prompts.
- Score: 1.4305544869388402
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Aura-CAPTCHA was developed as a multi-modal CAPTCHA system to address vulnerabilities in traditional methods that are increasingly bypassed by AI technologies, such as Optical Character Recognition (OCR) and adversarial image processing. The design integrated Generative Adversarial Networks (GANs) for generating dynamic image challenges, Reinforcement Learning (RL) for adaptive difficulty tuning, and Large Language Models (LLMs) for creating text and audio prompts. Visual challenges included 3x3 grid selections with at least three correct images, while audio challenges combined randomized numbers and words into a single task. RL adjusted difficulty based on incorrect attempts, response time, and suspicious user behavior. Evaluations on real-world traffic demonstrated a 92% human success rate and a 10% bot bypass rate, significantly outperforming existing CAPTCHA systems. The system provided a robust and scalable approach for securing online applications while remaining accessible to users, addressing gaps highlighted in previous research.
Related papers
- Robust CAPTCHA Using Audio Illusions in the Era of Large Language Models: from Evaluation to Advances [21.1525767544373]
We introduce AI-CAPTCHA, a unified framework that offers an evaluation framework, ACEval, and a novel audio CAPTCHA approach, IllusionAudio.<n>We show that most existing methods can be solved with high success rates by advanced LALMs and ASR models, exposing critical security weaknesses.<n>To address these vulnerabilities, we design a new audio CAPTCHA approach, IllusionAudio, which exploits perceptual illusion cues rooted in human auditory mechanisms.
arXiv Detail & Related papers (2026-01-13T13:00:06Z) - NGCaptcha: A CAPTCHA Bridging the Past and the Future [12.897322340526769]
We introduce NGCAPTCHA, a Next Generation CAPTCHA framework that integrates a lightweight client side proof of work (PoW) mechanism with an AI resistant visual recognition challenge.<n>In NGCAPTCHA, a browser must first complete a small hash based PoW before any challenge is displayed.<n>Once the PoW is solved, the user is presented with a human friendly yet model resistant image selection task.
arXiv Detail & Related papers (2025-12-18T06:14:54Z) - Bridging the Gap Between Ideal and Real-world Evaluation: Benchmarking AI-Generated Image Detection in Challenging Scenarios [54.07895223545793]
This paper introduces the Real-World Robustness dataset (RRDataset) for comprehensive evaluation of detection models across three dimensions.<n>RRDataset includes high-quality images from seven major scenarios.<n>We benchmarked 17 detectors and 10 vision-language models (VLMs) on RRDataset and conducted a large-scale human study.
arXiv Detail & Related papers (2025-09-11T06:15:52Z) - Defensive Adversarial CAPTCHA: A Semantics-Driven Framework for Natural Adversarial Example Generation [48.60492738839292]
Traditional CAPTCHA schemes are increasingly vulnerable to automated attacks powered by deep neural networks (DNNs)<n>We propose the Unsourced Adversarial CAPTCHA (DAC), a novel framework that generates high-specified adversarial examples.<n>In untargeted attacks, especially for black-box scenarios, we introduce bi-path unsourced adversarial CAPTCHA (BP-DAC)
arXiv Detail & Related papers (2025-06-12T13:30:01Z) - Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and Benchmarking Multimodal LLM Agents [23.715342148854006]
Open CaptchaWorld is the first web-based benchmark and platform specifically designed to evaluate the visual reasoning and interaction capabilities of MLLM-powered agents.<n>Results show that humans consistently achieve near-perfect scores, state-of-the-art MLLM agents struggle significantly, with success rates at most 40.0% by Browser-Use Openai-o3.<n>This highlights Open CaptchaWorld as a vital benchmark for diagnosing the limits of current multimodal agents and guiding the development of more robust multimodal reasoning systems.
arXiv Detail & Related papers (2025-05-30T17:59:55Z) - Oedipus: LLM-enchanced Reasoning CAPTCHA Solver [17.074422329618212]
Oedipus is an innovative end-to-end framework for automated reasoning CAPTCHA solving.
Central to this framework is a novel strategy that dissects the complex and human-easy-AI-hard tasks into a sequence of simpler and AI-easy steps.
Our evaluation shows that Oedipus effectively resolves the studied CAPTCHAs, achieving an average success rate of 63.5%.
arXiv Detail & Related papers (2024-05-13T06:32:57Z) - A Survey of Adversarial CAPTCHAs on its History, Classification and
Generation [69.36242543069123]
We extend the definition of adversarial CAPTCHAs and propose a classification method for adversarial CAPTCHAs.
Also, we analyze some defense methods that can be used to defend adversarial CAPTCHAs, indicating potential threats to adversarial CAPTCHAs.
arXiv Detail & Related papers (2023-11-22T08:44:58Z) - Diff-CAPTCHA: An Image-based CAPTCHA with Security Enhanced by Denoising
Diffusion Model [2.1551899143698328]
Diff-CAPTCHA is an image-click CAPTCHA scheme based on diffusion models.
This paper develops several attack methods, including end-to-end attacks based on Faster R-CNN and two-stage attacks.
Results show that diffusion models can effectively enhance CAPTCHA security while maintaining good usability in human testing.
arXiv Detail & Related papers (2023-08-16T13:41:29Z) - Multimodal Emotion Recognition using Transfer Learning from Speaker
Recognition and BERT-based models [53.31917090073727]
We propose a neural network-based emotion recognition framework that uses a late fusion of transfer-learned and fine-tuned models from speech and text modalities.
We evaluate the effectiveness of our proposed multimodal approach on the interactive emotional dyadic motion capture dataset.
arXiv Detail & Related papers (2022-02-16T00:23:42Z) - Improved Speech Emotion Recognition using Transfer Learning and
Spectrogram Augmentation [56.264157127549446]
Speech emotion recognition (SER) is a challenging task that plays a crucial role in natural human-computer interaction.
One of the main challenges in SER is data scarcity.
We propose a transfer learning strategy combined with spectrogram augmentation.
arXiv Detail & Related papers (2021-08-05T10:39:39Z) - Feeling of Presence Maximization: mmWave-Enabled Virtual Reality Meets
Deep Reinforcement Learning [76.46530937296066]
This paper investigates the problem of providing ultra-reliable and energy-efficient virtual reality (VR) experiences for wireless mobile users.
To ensure reliable ultra-high-definition (UHD) video frame delivery to mobile users, a coordinated multipoint (CoMP) transmission technique and millimeter wave (mmWave) communications are exploited.
arXiv Detail & Related papers (2021-06-03T08:35:10Z) - Robust Text CAPTCHAs Using Adversarial Examples [129.29523847765952]
We propose a user-friendly text-based CAPTCHA generation method named Robust Text CAPTCHA (RTC)
At the first stage, the foregrounds and backgrounds are constructed with randomly sampled font and background images.
At the second stage, we apply a highly transferable adversarial attack for text CAPTCHAs to better obstruct CAPTCHA solvers.
arXiv Detail & Related papers (2021-01-07T11:03:07Z) - Deep-CAPTCHA: a deep learning based CAPTCHA solver for vulnerability
assessment [1.027974860479791]
This research investigates the weaknesses and vulnerabilities of the CAPTCHA generator systems.
We develop a Convolutional Neural Network called Deep-CAPTCHA to achieve this goal.
Our network's cracking accuracy leads to a high rate of 98.94% and 98.31% for the numerical and the alpha-numerical test datasets.
arXiv Detail & Related papers (2020-06-15T11:44:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.