An Intelligent Hybrid Model for Identity Document Classification
- URL: http://arxiv.org/abs/2106.04345v1
- Date: Mon, 7 Jun 2021 13:08:00 GMT
- Title: An Intelligent Hybrid Model for Identity Document Classification
- Authors: Nouna Khandan
- Abstract summary: Digitization may provide opportunities (e.g., increase in productivity, disaster recovery, and environmentally friendly solutions) and challenges for businesses.
One of the main challenges would be to accurately classify numerous scanned documents uploaded every day by customers.
There are not many studies available to address the challenge as an application of image classification.
The proposed approach has been implemented using Python and experimentally validated on synthetic and real-world datasets.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Digitization, i.e., the process of converting information into a digital
format, may provide various opportunities (e.g., increase in productivity,
disaster recovery, and environmentally friendly solutions) and challenges for
businesses. In this context, one of the main challenges is to accurately
classify the numerous scanned documents that customers upload every day as part
of routine business processes. For example, processes in banking (e.g., applying
for loans) or the Government Registry of BDM (Births, Deaths, and Marriages)
applications may involve uploading several documents such as a driver's license
and passport. Few studies address this challenge as an application of image
classification, and although some have used various methods, a more accurate
model is still required. This study proposes a robust fusion model to determine
the type of an identity document accurately. The proposed approach combines two
methods, one classifying images by their visual features and the other by their
text features. A novel model based on statistics and regression is proposed to
calculate the confidence level for the feature-based classifier. A fuzzy-mean
fusion model is proposed to combine the classifier results based on their
confidence scores. The proposed approach has been implemented using Python and
experimentally validated on synthetic and real-world datasets. The performance
of the proposed model is evaluated using Receiver Operating Characteristic
(ROC) curve analysis.
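The abstract does not spell out the fuzzy-mean fusion rule, so the following is only a minimal sketch of the general idea of combining two classifiers according to their confidence scores. It assumes each classifier returns per-class probabilities plus a single non-negative confidence value, and it uses a plain confidence-weighted mean as a stand-in for the paper's fuzzy-mean model. The function names `fuse_predictions` and `predict_label`, the class labels, and all numbers are hypothetical and not taken from the paper.

```python
from typing import Dict


def fuse_predictions(
    visual_probs: Dict[str, float],
    text_probs: Dict[str, float],
    visual_conf: float,
    text_conf: float,
) -> Dict[str, float]:
    """Combine two classifiers' class probabilities with a confidence-weighted mean.

    Assumption: this weighted mean is an illustrative stand-in, not the
    fuzzy-mean fusion model described in the paper.
    """
    total = visual_conf + text_conf
    if total <= 0:
        # No usable confidence information: fall back to equal weights.
        w_visual = w_text = 0.5
    else:
        w_visual = visual_conf / total
        w_text = text_conf / total

    # A class missing from one classifier's output is treated as probability 0.
    classes = set(visual_probs) | set(text_probs)
    fused = {
        c: w_visual * visual_probs.get(c, 0.0) + w_text * text_probs.get(c, 0.0)
        for c in classes
    }

    # Renormalise so the fused scores sum to 1 when possible.
    norm = sum(fused.values())
    if norm > 0:
        fused = {c: p / norm for c, p in fused.items()}
    return fused


def predict_label(fused: Dict[str, float]) -> str:
    """Return the class with the highest fused score."""
    return max(fused, key=fused.get)


# Hypothetical example: the visual classifier leans towards "passport" with low
# confidence, while the text classifier favours "driver_license" with high confidence.
fused = fuse_predictions(
    visual_probs={"passport": 0.6, "driver_license": 0.4},
    text_probs={"passport": 0.2, "driver_license": 0.8},
    visual_conf=0.3,
    text_conf=0.9,
)
print(predict_label(fused))
```

In this sketch the more confident classifier dominates the fused score, which is the behaviour a confidence-based fusion is meant to provide; the paper's own confidence model (statistics and regression) and fusion rule may differ substantially.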