Related papers: Source Printer Identification from Document Images Acquired using Smartphone

URL: http://arxiv.org/abs/2003.12602v1
Date: Fri, 27 Mar 2020 18:59:32 GMT
Title: Source Printer Identification from Document Images Acquired using Smartphone
Authors: Sharad Joshi, Suraj Saxena, Nitin Khanna
Abstract summary: We propose to learn a single CNN model from the fusion of letter images and their printer-specific noise residuals. The proposed method achieves 98.42% document classification accuracy using images of letter 'e' under a 5x2 cross-validation approach.
Score: 14.889347839830092
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Vast volumes of printed documents continue to be used for various important as well as trivial applications. Such applications often rely on the information provided in the form of printed text documents whose integrity verification poses a challenge due to time constraints and lack of resources. Source printer identification provides essential information about the origin and integrity of a printed document in a fast and cost-effective manner. Even when fraudulent documents are identified, information about their origin can help stop future frauds. If a smartphone camera replaces the scanner in the document acquisition process, document forensics would be more economical, user-friendly, and even faster in many applications where remote and distributed analysis is beneficial. Building on existing methods, we propose to learn a single CNN model from the fusion of letter images and their printer-specific noise residuals. In the absence of any publicly available dataset, we created a new dataset consisting of 2250 document images of text documents printed by eighteen printers and acquired by a smartphone camera at five acquisition settings. The proposed method achieves 98.42% document classification accuracy using images of letter 'e' under a 5x2 cross-validation approach. Further, when tested using about half a million letters of all types, it achieves 90.33% and 98.01% letter and document classification accuracies, respectively, thus highlighting the ability to learn a discriminative model without dependence on a single letter type. Also, classification accuracies are encouraging under various acquisition settings, including low illumination and change in angle between the document and camera planes.
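
The abstract reports results under a 5x2 cross-validation approach. As a rough illustration only, the sketch below shows how such splits are commonly generated: five repetitions of a random two-way split, with each half used once for training and once for testing. The function name `five_by_two_splits`, the seed, and the choice to split at the level of the 2250 document images are assumptions made for this example; the abstract does not describe the authors' actual splitting procedure.

```python
import random


def five_by_two_splits(n_items, repeats=5, seed=0):
    """Yield (train_idx, test_idx) pairs for a 5x2 cross-validation.

    Hypothetical helper: each repetition shuffles the indices, cuts them
    into two halves, and uses each half once for training and once for
    testing, giving 2 * repeats train/test pairs in total.
    """
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n_items))
        rng.shuffle(idx)
        half = n_items // 2
        fold_a, fold_b = idx[:half], idx[half:]
        yield fold_a, fold_b  # train on A, test on B
        yield fold_b, fold_a  # train on B, test on A


# Assumption: one index per document image in the 2250-image dataset.
splits = list(five_by_two_splits(2250))
print(len(splits))
```

A reported accuracy would then typically be the mean over the ten train/test pairs. A real evaluation might also stratify the halves by printer, which this sketch does not attempt.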

Related papers

Unified Multi-Modal Interleaved Document Representation for Information Retrieval [57.65409208879344]
We produce more comprehensive and nuanced document representations by holistically embedding documents interleaved with different modalities. Specifically, we achieve this by leveraging the capability of recent vision-language models that enable the processing and integration of text, images, and tables into a unified format and representation.
arXiv Detail & Related papers (2024-10-03T17:49:09Z)
IDNet: A Novel Dataset for Identity Document Analysis and Fraud Detection [25.980165854663145]
IDNet is a benchmark dataset designed to advance privacy-preserving fraud detection efforts. It comprises 837,060 images of synthetically generated identity documents, totaling approximately 490 gigabytes. We evaluate the utility and present use cases of the dataset, illustrating how it can aid in training privacy-preserving fraud detection methods.
arXiv Detail & Related papers (2024-08-03T07:05:40Z)
DocXPand-25k: a large and diverse benchmark dataset for identity documents analysis [0.0]
Identity document (ID) image analysis has become essential for many online services, like bank account opening or insurance subscription. Only a few datasets are available to benchmark ID analysis methods, mainly because of privacy restrictions, security requirements and legal reasons. We present the DocXPand-25k dataset, which consists of 24,994 richly labeled ID images.
arXiv Detail & Related papers (2024-07-30T08:55:27Z)
Unifying Multimodal Retrieval via Document Screenshot Embedding [92.03571344075607]
Document Screenshot Embedding (DSE) is a novel retrieval paradigm that regards document screenshots as a unified input format. We first build Wiki-SS, a corpus of 1.3M Wikipedia web page screenshots, to answer the questions from the Natural Questions dataset. In such a text-intensive document retrieval setting, DSE shows competitive effectiveness compared to other text retrieval methods relying on parsing.
arXiv Detail & Related papers (2024-06-17T06:27:35Z)
Watermark Text Pattern Spotting in Document Images [3.6298655794854464]
In the wild, writing can come in various fonts, sizes and forms, making generic recognition a very difficult problem. We propose a novel benchmark (K-Watermark) containing 65,447 data samples generated using Wrender. A validity study using human raters yields an authenticity score of 0.51 against pre-generated watermarked documents.
arXiv Detail & Related papers (2024-01-10T14:02:45Z)
DocMAE: Document Image Rectification via Self-supervised Representation Learning [144.44748607192147]
We present DocMAE, a novel self-supervised framework for document image rectification. We first mask random patches of the background-excluded document images and then reconstruct the missing pixels. With such a self-supervised learning approach, the network is encouraged to learn the intrinsic structure of deformed documents.
arXiv Detail & Related papers (2023-04-20T14:27:15Z)
Deep Unrestricted Document Image Rectification [110.61517455253308]
We present DocTr++, a novel unified framework for document image rectification. We upgrade the original architecture by adopting a hierarchical encoder-decoder structure for multi-scale representation extraction and parsing. We contribute a real-world test set and metrics applicable for evaluating the rectification quality.
arXiv Detail & Related papers (2023-04-18T08:00:54Z)
Open Set Classification of Untranscribed Handwritten Documents [56.0167902098419]
Huge amounts of digital page images of important manuscripts are preserved in archives worldwide. The class or "typology" of a document is perhaps the most important tag to be included in the metadata. The technical problem is one of automatic classification of documents, each consisting of a set of untranscribed handwritten text images.
arXiv Detail & Related papers (2022-06-20T20:43:50Z)
DocScanner: Robust Document Image Rectification with Progressive Learning [162.03694280524084]
This work presents DocScanner, a new deep network architecture for document image rectification. DocScanner maintains a single estimate of the rectified image, which is progressively corrected with a recurrent architecture. The iterative refinements make DocScanner converge to a robust and superior performance, and the lightweight recurrent architecture ensures the running efficiency.
arXiv Detail & Related papers (2021-10-28T09:15:02Z)
MIDV-2020: A Comprehensive Benchmark Dataset for Identity Document Analysis [48.35030471041193]
MIDV-2020 consists of 1000 video clips, 2000 scanned images, and 1000 photos of 1000 unique mock identity documents. With 72409 annotated images in total, to the date of publication the proposed dataset is the largest publicly available identity documents dataset.
arXiv Detail & Related papers (2021-07-01T12:14:17Z)
An Automatic Reader of Identity Documents [0.0]
This paper presents the prototype of a novel system for automatically reading identity documents. The system is designed to extract data from acceptable-quality photographs of the main Italian identity documents. The document is first localized inside the photo, and then classified; finally, text recognition is executed.
arXiv Detail & Related papers (2020-06-26T08:22:40Z)
