DocXPand-25k: a large and diverse benchmark dataset for identity documents analysis
- URL: http://arxiv.org/abs/2407.20662v1
- Date: Tue, 30 Jul 2024 08:55:27 GMT
- Title: DocXPand-25k: a large and diverse benchmark dataset for identity documents analysis
- Authors: Julien Lerouge, Guillaume Betmont, Thomas Bres, Evgeny Stepankevich, Alexis Bergès,
- Abstract summary: Identity document (ID) image analysis has become essential for many online services, like bank account opening or insurance subscription.
There are only a few available to benchmark ID analysis methods, mainly because of privacy restrictions, security requirements and legal reasons.
We present the DocXPand-25k dataset, which consists of 24,994 richly labeled IDs images.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Identity document (ID) image analysis has become essential for many online services, like bank account opening or insurance subscription. In recent years, much research has been conducted on subjects like document localization, text recognition and fraud detection, to achieve a level of accuracy reliable enough to automatize identity verification. However, there are only a few available datasets to benchmark ID analysis methods, mainly because of privacy restrictions, security requirements and legal reasons. In this paper, we present the DocXPand-25k dataset, which consists of 24,994 richly labeled IDs images, generated using custom-made vectorial templates representing nine fictitious ID designs, including four identity cards, two residence permits and three passports designs. These synthetic IDs feature artificially generated personal information (names, dates, identifiers, faces, barcodes, ...), and present a rich diversity in the visual layouts and textual contents. We collected about 5.8k diverse backgrounds coming from real-world photos, scans and screenshots of IDs to guarantee the variety of the backgrounds. The software we wrote to generate these images has been published (https://github.com/QuickSign/docxpand/) under the terms of the MIT license, and our dataset has been published (https://github.com/QuickSign/docxpand/releases/tag/v1.0.0) under the terms of the CC-BY-NC-SA 4.0 License.
Related papers
- IDNet: A Novel Dataset for Identity Document Analysis and Fraud Detection [25.980165854663145]
IDNet is a benchmark dataset designed to advance privacy-preserving fraud detection efforts.
It comprises 837,060 images of synthetically generated identity documents, totaling approximately 490 gigabytes.
We evaluate the utility and present use cases of the dataset, illustrating how it can aid in training privacy-preserving fraud detection methods.
arXiv Detail & Related papers (2024-08-03T07:05:40Z) - Synthesizing Efficient Data with Diffusion Models for Person Re-Identification Pre-Training [51.87027943520492]
We present a novel paradigm Diffusion-ReID to efficiently augment and generate diverse images based on known identities.
Benefiting from our proposed paradigm, we first create a new large-scale person Re-ID dataset Diff-Person, which consists of over 777K images from 5,183 identities.
arXiv Detail & Related papers (2024-06-10T06:26:03Z) - ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning [57.91881829308395]
Identity-preserving text-to-image generation (ID-T2I) has received significant attention due to its wide range of application scenarios like AI portrait and advertising.
We present textbfID-Aligner, a general feedback learning framework to enhance ID-T2I performance.
arXiv Detail & Related papers (2024-04-23T18:41:56Z) - Magic-Me: Identity-Specific Video Customized Diffusion [72.05925155000165]
We propose a controllable subject identity controllable video generation framework, termed Video Custom Diffusion (VCD)
With a specified identity defined by a few images, VCD reinforces the identity characteristics and injects frame-wise correlation for stable video outputs.
We conducted extensive experiments to verify that VCD is able to generate stable videos with better ID over the baselines.
arXiv Detail & Related papers (2024-02-14T18:13:51Z) - Synthetic dataset of ID and Travel Document [1.9296797946506603]
This paper presents a new synthetic dataset of ID and travel documents, called SIDTD.
The SIDTD dataset is created to help training and evaluating forged ID documents detection systems.
arXiv Detail & Related papers (2024-01-03T18:06:28Z) - PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding [102.07914175196817]
PhotoMaker is an efficient personalized text-to-image generation method.
It encodes an arbitrary number of input ID images into a stack ID embedding for preserving ID information.
arXiv Detail & Related papers (2023-12-07T17:32:29Z) - Multiview Identifiers Enhanced Generative Retrieval [78.38443356800848]
generative retrieval generates identifier strings of passages as the retrieval target.
We propose a new type of identifier, synthetic identifiers, that are generated based on the content of a passage.
Our proposed approach performs the best in generative retrieval, demonstrating its effectiveness and robustness.
arXiv Detail & Related papers (2023-05-26T06:50:21Z) - MIDV-2020: A Comprehensive Benchmark Dataset for Identity Document
Analysis [48.35030471041193]
MIDV-2020 consists of 1000 video clips, 2000 scanned images, and 1000 photos of 1000 unique mock identity documents.
With 72409 annotated images in total, to the date of publication the proposed dataset is the largest publicly available identity documents dataset.
arXiv Detail & Related papers (2021-07-01T12:14:17Z) - A Fast Fully Octave Convolutional Neural Network for Document Image
Segmentation [1.8426817621478804]
We investigate a method based on U-Net to detect the document edges and text regions in ID images.
We propose a model optimization based on Octave Convolutions to qualify the method to situations where storage, processing, and time resources are limited.
Our results showed that the proposed models are efficient to document segmentation tasks and portable.
arXiv Detail & Related papers (2020-04-03T00:57:33Z) - Source Printer Identification from Document Images Acquired using
Smartphone [14.889347839830092]
We propose to learn a single CNN model from the fusion of letter images and their printer-specific noise residuals.
The proposed method achieves 98.42% document classification accuracy using images of letter 'e' under a 5x2 cross-validation approach.
arXiv Detail & Related papers (2020-03-27T18:59:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.