Related papers: An Intelligent Hybrid Model for Identity Document Classification

An Intelligent Hybrid Model for Identity Document Classification

URL: http://arxiv.org/abs/2106.04345v1
Date: Mon, 7 Jun 2021 13:08:00 GMT
Title: An Intelligent Hybrid Model for Identity Document Classification
Authors: Nouna Khandan
Abstract summary: Digitization may provide opportunities (e.g., increase in productivity, disaster recovery, and environmentally friendly solutions) and challenges for businesses. One of the main challenges would be to accurately classify numerous scanned documents uploaded every day by customers. There are not many studies available to address the challenge as an application of image classification. The proposed approach has been implemented using Python and experimentally validated on synthetic and real-world datasets.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Digitization, i.e., the process of converting information into a digital format, may provide various opportunities (e.g., increase in productivity, disaster recovery, and environmentally friendly solutions) and challenges for businesses. In this context, one of the main challenges would be to accurately classify numerous scanned documents uploaded every day by customers as usual business processes. For example, processes in banking (e.g., applying for loans) or the Government Registry of BDM (Births, Deaths, and Marriages) applications may involve uploading several documents such as a driver's license and passport. There are not many studies available to address the challenge as an application of image classification. Although some studies are available which used various methods, a more accurate model is still required. The current study has proposed a robust fusion model to define the type of identity documents accurately. The proposed approach is based on two different methods in which images are classified based on their visual features and text features. A novel model based on statistics and regression has been proposed to calculate the confidence level for the feature-based classifier. A fuzzy-mean fusion model has been proposed to combine the classifier results based on their confidence score. The proposed approach has been implemented using Python and experimentally validated on synthetic and real-world datasets. The performance of the proposed model is evaluated using the Receiver Operating Characteristic (ROC) curve analysis.

Related papers

DocVCE: Diffusion-based Visual Counterfactual Explanations for Document Image Classification [5.247930659596986]
We introduce generative document counterfactuals that provide meaningful insights into the model's decision-making through actionable explanations.<n>To the best of the authors' knowledge, this is the first work to explore generative counterfactual explanations in document image analysis.
arXiv Detail & Related papers (2025-08-06T09:15:32Z)
Queries, Representation & Detection: The Next 100 Model Fingerprinting Schemes [4.993542259120313]
This paper introduces a systematic approach to the creation of model fingerprinting schemes and their evaluation benchmarks. We identify $sim100$ previously unexplored QuRD combinations and gain insights into their performance. Our approach reveals the need for more challenging benchmarks and a sound comparison with baselines.
arXiv Detail & Related papers (2024-12-17T15:41:36Z)
Evaluation Agent: Efficient and Promptable Evaluation Framework for Visual Generative Models [51.067146460271466]
Evaluation of visual generative models can be time-consuming and computationally expensive. We propose the Evaluation Agent framework, which employs human-like strategies for efficient, dynamic, multi-round evaluations. It offers four key advantages: 1) efficiency, 2) promptable evaluation tailored to diverse user needs, 3) explainability beyond single numerical scores, and 4) scalability across various models and tools.
arXiv Detail & Related papers (2024-12-10T18:52:39Z)
LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content [62.816876067499415]
We propose LiveXiv: a scalable evolving live benchmark based on scientific ArXiv papers. LiveXiv accesses domain-specific manuscripts at any given timestamp and proposes to automatically generate visual question-answer pairs. We benchmark multiple open and proprietary Large Multi-modal Models (LMMs) on the first version of our benchmark, showing its challenging nature and exposing the models true abilities.
arXiv Detail & Related papers (2024-10-14T17:51:23Z)
SKADA-Bench: Benchmarking Unsupervised Domain Adaptation Methods with Realistic Validation [55.87169702896249]
Unsupervised Domain Adaptation (DA) consists of adapting a model trained on a labeled source domain to perform well on an unlabeled target domain with some data distribution shift. We propose a framework to evaluate DA methods and present a fair evaluation of existing shallow algorithms, including reweighting, mapping, and subspace alignment. Our benchmark highlights the importance of realistic validation and provides practical guidance for real-life applications.
arXiv Detail & Related papers (2024-07-16T12:52:29Z)
Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images. We identify model weaknesses by testing the model using the counterfactual image dataset. We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z)
Towards General Visual-Linguistic Face Forgery Detection [95.73987327101143]
Deepfakes are realistic face manipulations that can pose serious threats to security, privacy, and trust. Existing methods mostly treat this task as binary classification, which uses digital labels or mask signals to train the detection model. We propose a novel paradigm named Visual-Linguistic Face Forgery Detection(VLFFD), which uses fine-grained sentence-level prompts as the annotation.
arXiv Detail & Related papers (2023-07-31T10:22:33Z)
Resources for Brewing BEIR: Reproducible Reference Models and an Official Leaderboard [47.73060223236792]
BEIR is a benchmark dataset for evaluation of information retrieval models across 18 different domain/task combinations. Our work addresses two shortcomings that prevent the benchmark from achieving its full potential.
arXiv Detail & Related papers (2023-06-13T00:26:18Z)
GVdoc: Graph-based Visual Document Classification [17.350393956461783]
We propose GVdoc, a graph-based document classification model. Our approach generates a document graph based on its layout, and then trains a graph neural network to learn node and graph embeddings. We show that our model, even with fewer parameters, outperforms state-of-the-art models on out-of-distribution data.
arXiv Detail & Related papers (2023-05-26T19:23:20Z)
A Graph-Enhanced Click Model for Web Search [67.27218481132185]
We propose a novel graph-enhanced click model (GraphCM) for web search. We exploit both intra-session and inter-session information for the sparsity and cold-start problems.
arXiv Detail & Related papers (2022-06-17T08:32:43Z)
Logically at the Factify 2022: Multimodal Fact Verification [2.8914815569249823]
This paper describes our participant system for the multi-modal fact verification (Factify) challenge at AAAI 2022. Two baseline approaches are proposed and explored including an ensemble model and a multi-modal attention network. Our best model is ranked first in leaderboard which obtains a weighted average F-measure of 0.77 on both validation and test set.
arXiv Detail & Related papers (2021-12-16T23:34:07Z)
Incorporating Vision Bias into Click Models for Image-oriented Search Engine [51.192784793764176]
In this paper, we assume that vision bias exists in an image-oriented search engine as another crucial factor affecting the examination probability aside from position. We use regression-based EM algorithm to predict the vision bias given the visual features extracted from candidate documents.
arXiv Detail & Related papers (2021-01-07T10:01:31Z)
DGSAC: Density Guided Sampling and Consensus [4.808421423598809]
Kernel Residual Density is a key differentiator between inliers and outliers. We propose two model selection algorithms, an optimal quadratic program based, and a greedy. We evaluate our method on a wide variety of tasks like planar segmentation, motion segmentation, vanishing point estimation, plane fitting to 3D point cloud, line, and circle fitting.
arXiv Detail & Related papers (2020-06-03T17:42:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.