Segmentation-free Connectionist Temporal Classification loss based OCR
Model for Text Captcha Classification
- URL: http://arxiv.org/abs/2402.05417v1
- Date: Thu, 8 Feb 2024 05:18:11 GMT
- Title: Segmentation-free Connectionist Temporal Classification loss based OCR
Model for Text Captcha Classification
- Authors: Vaibhav Khatavkar, Makarand Velankar and Sneha Petkar
- Abstract summary: We propose a segmentation-free OCR model for text captcha classification based on the connectionist temporal classification loss technique.
The accuracy of the proposed model is compared with the state-of-the-art models and proves to be effective.
- Score: 7.37329190948762
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Captchas are widely used to secure systems from automated responses
by distinguishing computer responses from human responses. Text-, audio-,
video-, and picture-based Optical Character Recognition (OCR) are used for
creating captchas. Text-based OCR captchas are the most widely used, and they
face issues, namely complex and distorted content. There have been attempts to
build captcha detection and classification systems using machine learning and
neural networks, which need to be tuned for accuracy. The existing systems face
challenges in recognizing distorted characters, handling variable-length
captchas, and finding sequential dependencies in captchas. In this work, we
propose a segmentation-free OCR model for text captcha classification based on
the connectionist temporal classification loss technique. The proposed model is
trained and tested on a publicly available captcha dataset. The proposed model
gives 99.80% character-level accuracy and 95% word-level accuracy. The
accuracy of the proposed model is compared with the state-of-the-art models and
proves to be effective. Variable-length complex captchas with dependencies can
thus be processed with the segmentation-free connectionist temporal
classification loss technique, which will be widely used in securing software
systems.
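To make the abstract's two central ideas concrete, the following is a minimal sketch, not the authors' implementation. It shows how a segmentation-free model's per-frame outputs could be turned into a variable-length string with greedy (best-path) CTC decoding, and how character-level and word-level accuracy might be computed. The blank index of 0, the function names, and the position-wise definition of character accuracy are all assumptions made for illustration; the paper may define its metrics differently.

```python
from typing import List, Sequence, Tuple

BLANK = 0  # assumed index of the CTC blank symbol (hypothetical convention)


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # Pick the highest-scoring class index for each time step.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]
    decoded: List[str] = []
    previous = BLANK
    for label in best_path:
        # Emit a character only when the label changes and is not the blank.
        if label != previous and label != BLANK:
            decoded.append(alphabet[label - 1])  # class k maps to alphabet[k - 1]
        previous = label
    return "".join(decoded)


def char_and_word_accuracy(
    predictions: Sequence[str], targets: Sequence[str]
) -> Tuple[float, float]:
    """Position-wise character accuracy and exact-match word accuracy (assumed definitions)."""
    correct_chars = total_chars = correct_words = 0
    for pred, target in zip(predictions, targets):
        total_chars += len(target)
        # Count characters that match at the same position.
        correct_chars += sum(p == t for p, t in zip(pred, target))
        # A word counts as correct only if the whole string matches.
        correct_words += int(pred == target)
    return correct_chars / total_chars, correct_words / len(targets)
```

Because a repeated label is merged unless a blank separates the repeats, the decoder can emit strings of any length up to the number of frames without segmenting the image into characters first. Word-level accuracy is the stricter metric, since a single wrong character makes the whole captcha count as incorrect.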
Related papers
- Breaking reCAPTCHAv2 [20.706469085872516]
We evaluate the effectiveness of automated systems in solving captchas by utilizing advanced YOLO models for image segmentation and classification.
Our findings suggest that there is no significant difference in the number of challenges humans and bots must solve to pass the captchas in reCAPTCHAv2.
arXiv Detail & Related papers (2024-09-13T13:47:12Z)
- Chinese Text Recognition with A Pre-Trained CLIP-Like Model Through Image-IDS Aligning [61.34060587461462]
We propose a two-stage framework for Chinese Text Recognition (CTR).
We pre-train a CLIP-like model through aligning printed character images and Ideographic Description Sequences (IDS).
This pre-training stage simulates humans recognizing Chinese characters and obtains the canonical representation of each character.
The learned representations are employed to supervise the CTR model, such that traditional single-character recognition can be improved to text-line recognition.
arXiv Detail & Related papers (2023-09-03T05:33:16Z)
- Diff-CAPTCHA: An Image-based CAPTCHA with Security Enhanced by Denoising Diffusion Model [2.1551899143698328]
Diff-CAPTCHA is an image-click CAPTCHA scheme based on diffusion models.
This paper develops several attack methods, including end-to-end attacks based on Faster R-CNN and two-stage attacks.
Results show that diffusion models can effectively enhance CAPTCHA security while maintaining good usability in human testing.
arXiv Detail & Related papers (2023-08-16T13:41:29Z)
- Context Perception Parallel Decoder for Scene Text Recognition [52.620841341333524]
Scene text recognition methods have struggled to attain high accuracy and fast inference speed.
We present an empirical study of AR decoding in STR, and discover that the AR decoder not only models linguistic context, but also provides guidance on visual context perception.
We construct a series of CPPD models and also plug the proposed modules into existing STR decoders. Experiments on both English and Chinese benchmarks demonstrate that the CPPD models achieve highly competitive accuracy while running approximately 8x faster than their AR-based counterparts.
arXiv Detail & Related papers (2023-07-23T09:04:13Z)
- DPIC: Decoupling Prompt and Intrinsic Characteristics for LLM Generated Text Detection [56.513637720967566]
Large language models (LLMs) can generate texts that pose risks of misuse, such as plagiarism, planting fake reviews on e-commerce platforms, or creating inflammatory false tweets.
Existing high-quality detection methods usually require access to the interior of the model to extract the intrinsic characteristics.
We propose to extract deep intrinsic characteristics of the black-box model generated texts.
arXiv Detail & Related papers (2023-05-21T17:26:16Z)
- Vulnerability analysis of captcha using Deep learning [0.0]
This research investigates the flaws and vulnerabilities in the CAPTCHA generating systems.
To achieve this, we created CapNet, a Convolutional Neural Network.
The proposed platform can evaluate both numerical and alphanumerical CAPTCHAs.
arXiv Detail & Related papers (2023-02-18T17:45:11Z)
- Fine-grained Image Captioning with CLIP Reward [104.71533106301598]
We propose using CLIP, a multimodal encoder trained on huge image-text pairs from the web, to calculate multimodal similarity and use it as a reward function.
We also propose a simple finetuning strategy of the CLIP text encoder to improve grammar that does not require extra text annotation.
In experiments on text-to-image retrieval and FineCapEval, the proposed CLIP-guided model generates more distinctive captions than the CIDEr-optimized model.
arXiv Detail & Related papers (2022-05-26T02:46:09Z)
- Lexically Aware Semi-Supervised Learning for OCR Post-Correction [90.54336622024299]
Much of the existing linguistic data in many languages of the world is locked away in non-digitized books and documents.
Previous work has demonstrated the utility of neural post-correction methods on recognition of less-well-resourced languages.
We present a semi-supervised learning method that makes it possible to utilize raw images to improve performance.
arXiv Detail & Related papers (2021-11-04T04:39:02Z)
- Robust Text CAPTCHAs Using Adversarial Examples [129.29523847765952]
We propose a user-friendly text-based CAPTCHA generation method named Robust Text CAPTCHA (RTC).
At the first stage, the foregrounds and backgrounds are constructed with randomly sampled font and background images.
At the second stage, we apply a highly transferable adversarial attack for text CAPTCHAs to better obstruct CAPTCHA solvers.
arXiv Detail & Related papers (2021-01-07T11:03:07Z)
- Deep-CAPTCHA: a deep learning based CAPTCHA solver for vulnerability assessment [1.027974860479791]
This research investigates the weaknesses and vulnerabilities of the CAPTCHA generator systems.
We develop a Convolutional Neural Network called Deep-CAPTCHA to achieve this goal.
Our network achieves cracking accuracies of 98.94% and 98.31% on the numerical and the alpha-numerical test datasets, respectively.
arXiv Detail & Related papers (2020-06-15T11:44:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.