OCR Graph Features for Manipulation Detection in Documents
- URL: http://arxiv.org/abs/2009.05158v2
- Date: Mon, 14 Sep 2020 15:52:09 GMT
- Title: OCR Graph Features for Manipulation Detection in Documents
- Authors: Hailey James, Otkrist Gupta, Dan Raviv
- Abstract summary: We propose a model that leverages graph features using OCR (Optical Character Recognition)
Our model relies on a data-driven approach to detect alterations by training a random forest classifier on the graph-based OCR features.
We evaluate our algorithm's forgery detection performance on dataset constructed from real business documents with slight forgery imperfections.
- Score: 11.193867567895353
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Detecting manipulations in digital documents is becoming increasingly
important for information verification purposes. Due to the proliferation of
image editing software, altering key information in documents has become widely
accessible. Nearly all approaches in this domain rely on a procedural approach,
using carefully generated features and a hand-tuned scoring system, rather than
a data-driven and generalizable approach. We frame this issue as a graph
comparison problem using the character bounding boxes, and propose a model that
leverages graph features using OCR (Optical Character Recognition). Our model
relies on a data-driven approach to detect alterations by training a random
forest classifier on the graph-based OCR features. We evaluate our algorithm's
forgery detection performance on dataset constructed from real business
documents with slight forgery imperfections. Our proposed model dramatically
outperforms the most closely-related document manipulation detection model on
this task.
Related papers
- Optimization of Image Processing Algorithms for Character Recognition in
Cultural Typewritten Documents [0.8158530638728501]
This paper evaluates the impact of image processing methods and parameter tuning in Optical Character Recognition (OCR)
The approach uses a multi-objective problem formulation to minimize Levenshtein edit distance and maximize the number of words correctly identified with a non-dominated sorting genetic algorithm (NSGA-II)
Our findings suggest that employing image pre-processing algorithms in OCR might be more suitable for typologies where the text recognition task without pre-processing does not produce good results.
arXiv Detail & Related papers (2023-11-27T11:44:46Z) - Similar Document Template Matching Algorithm [0.0]
This study outlines a comprehensive methodology for verifying medical documents.
It integrates advanced techniques in template extraction, comparison, and fraud detection.
This methodology provides a robust approach to medical document verification.
arXiv Detail & Related papers (2023-11-21T15:13:18Z) - DECDM: Document Enhancement using Cycle-Consistent Diffusion Models [3.3813766129849845]
We propose DECDM, an end-to-end document-level image translation method inspired by recent advances in diffusion models.
Our method overcomes the limitations of paired training by independently training the source (noisy input) and target (clean output) models.
We also introduce simple data augmentation strategies to improve character-glyph conservation during translation.
arXiv Detail & Related papers (2023-11-16T07:16:02Z) - Towards General Visual-Linguistic Face Forgery Detection [95.73987327101143]
Deepfakes are realistic face manipulations that can pose serious threats to security, privacy, and trust.
Existing methods mostly treat this task as binary classification, which uses digital labels or mask signals to train the detection model.
We propose a novel paradigm named Visual-Linguistic Face Forgery Detection(VLFFD), which uses fine-grained sentence-level prompts as the annotation.
arXiv Detail & Related papers (2023-07-31T10:22:33Z) - SelfDocSeg: A Self-Supervised vision-based Approach towards Document
Segmentation [15.953725529361874]
Document layout analysis is a known problem to the documents research community.
With growing internet connectivity to personal life, an enormous amount of documents had been available in the public domain.
We address this challenge using self-supervision and unlike, the few existing self-supervised document segmentation approaches.
arXiv Detail & Related papers (2023-05-01T12:47:55Z) - ClipCrop: Conditioned Cropping Driven by Vision-Language Model [90.95403416150724]
We take advantage of vision-language models as a foundation for creating robust and user-intentional cropping algorithms.
We develop a method to perform cropping with a text or image query that reflects the user's intention as guidance.
Our pipeline design allows the model to learn text-conditioned aesthetic cropping with a small dataset.
arXiv Detail & Related papers (2022-11-21T14:27:07Z) - Augraphy: A Data Augmentation Library for Document Images [59.457999432618614]
Augraphy is a Python library for constructing data augmentation pipelines.
It provides strategies to produce augmented versions of clean document images that appear to have been altered by standard office operations.
arXiv Detail & Related papers (2022-08-30T22:36:19Z) - ObjectFormer for Image Manipulation Detection and Localization [118.89882740099137]
We propose ObjectFormer to detect and localize image manipulations.
We extract high-frequency features of the images and combine them with RGB features as multimodal patch embeddings.
We conduct extensive experiments on various datasets and the results verify the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-03-28T12:27:34Z) - DocScanner: Robust Document Image Rectification with Progressive
Learning [162.03694280524084]
This work presents DocScanner, a new deep network architecture for document image rectification.
DocScanner maintains a single estimate of the rectified image, which is progressively corrected with a recurrent architecture.
The iterative refinements make DocScanner converge to a robust and superior performance, and the lightweight recurrent architecture ensures the running efficiency.
arXiv Detail & Related papers (2021-10-28T09:15:02Z) - One-shot Key Information Extraction from Document with Deep Partial
Graph Matching [60.48651298832829]
Key Information Extraction (KIE) from documents improves efficiency, productivity, and security in many industrial scenarios.
Existing supervised learning methods for the KIE task need to feed a large number of labeled samples and learn separate models for different types of documents.
We propose a deep end-to-end trainable network for one-shot KIE using partial graph matching.
arXiv Detail & Related papers (2021-09-26T07:45:53Z) - Key Information Extraction From Documents: Evaluation And Generator [3.878105750489656]
This research project compares state-of-the-art models for information extraction from documents.
The results have shown that NLP based pre-processing is beneficial for model performance.
The use of a bounding box regression decoder increases the model performance only for fields that do not follow a rectangular shape.
arXiv Detail & Related papers (2021-06-09T16:12:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.