Recurrent Few-Shot model for Document Verification
- URL: http://arxiv.org/abs/2410.02456v1
- Date: Thu, 3 Oct 2024 13:05:27 GMT
- Title: Recurrent Few-Shot model for Document Verification
- Authors: Maxime Talarmain, Carlos Boned, Sanket Biswas, Oriol Ramos,
- Abstract summary: General-purpose ID, or travel, document image- and video-based verification systems have yet to achieve good enough performance to be considered a solved problem.
We propose a recurrent-based model able to detect forged documents in a few-shot scenario.
Preliminary results on the SIDTD and Findit datasets show good performance of this model for this task.
- Score: 1.9686770963118383
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: General-purpose ID, or travel, document image- and video-based verification systems have yet to achieve good enough performance to be considered a solved problem. There are several factors that negatively impact their performance, including low-resolution images and videos and a lack of sufficient data to train the models. This task is particularly challenging when dealing with unseen class of ID, or travel, documents. In this paper we address this task by proposing a recurrent-based model able to detect forged documents in a few-shot scenario. The recurrent architecture makes the model robust to document resolution variability. Moreover, the few-shot approach allow the model to perform well even for unseen class of documents. Preliminary results on the SIDTD and Findit datasets show good performance of this model for this task.
Related papers
- DocMIA: Document-Level Membership Inference Attacks against DocVQA Models [52.13818827581981]
We introduce two novel membership inference attacks tailored specifically to DocVQA models.
Our methods outperform existing state-of-the-art membership inference attacks across a variety of DocVQA models and datasets.
arXiv Detail & Related papers (2025-02-06T00:58:21Z) - Zero-Shot Prompting and Few-Shot Fine-Tuning: Revisiting Document Image Classification Using Large Language Models [0.2517406173566782]
Classifying scanned documents is a challenging problem that involves image, layout, and text analysis for document understanding.
For certain benchmark datasets, notably RVL-CDIP, the state of the art is closing in to near-perfect performance.
arXiv Detail & Related papers (2024-12-18T13:53:16Z) - You Only Submit One Image to Find the Most Suitable Generative Model [48.67303250592189]
We propose a novel setting called Generative Model Identification (GMI)
GMI aims to enable the user to identify the most appropriate generative model(s) for the user's requirements efficiently.
arXiv Detail & Related papers (2024-12-16T14:46:57Z) - Binarizing Documents by Leveraging both Space and Frequency [33.334956022229846]
Document Image Binarization is a well-known problem in Document Analysis and Computer Vision.
We propose an alternative solution based on the recently introduced Fast Fourier Convolutions.
arXiv Detail & Related papers (2024-04-26T08:31:10Z) - Identifying and Mitigating Model Failures through Few-shot CLIP-aided
Diffusion Generation [65.268245109828]
We propose an end-to-end framework to generate text descriptions of failure modes associated with spurious correlations.
These descriptions can be used to generate synthetic data using generative models, such as diffusion models.
Our experiments have shown remarkable textbfimprovements in accuracy ($sim textbf21%$) on hard sub-populations.
arXiv Detail & Related papers (2023-12-09T04:43:49Z) - On Task-personalized Multimodal Few-shot Learning for Visually-rich
Document Entity Retrieval [59.25292920967197]
Few-shot document entity retrieval (VDER) is an important topic in industrial NLP applications.
FewVEX is a new dataset to boost future research in the field of entity-level few-shot VDER.
We present a task-aware meta-learning based framework, with a central focus on achieving effective task personalization.
arXiv Detail & Related papers (2023-11-01T17:51:43Z) - IncDSI: Incrementally Updatable Document Retrieval [35.5697863674097]
IncDSI is a method to add documents in real time without retraining the model on the entire dataset.
We formulate the addition of documents as a constrained optimization problem that makes minimal changes to the network parameters.
Our approach is competitive with re-training the model on the whole dataset.
arXiv Detail & Related papers (2023-07-19T07:20:30Z) - Earning Extra Performance from Restrictive Feedbacks [41.05874087063763]
We set up a challenge named emphEarning eXtra PerformancE from restriCTive feEDdbacks (EXPECTED) to describe this form of model tuning problems.
The goal of the model provider is to eventually deliver a satisfactory model to the local user(s) by utilizing the feedbacks.
We propose to characterize the geometry of the model performance with regard to model parameters through exploring the parameters' distribution.
arXiv Detail & Related papers (2023-04-28T13:16:54Z) - Augmenting Document Representations for Dense Retrieval with
Interpolation and Perturbation [49.940525611640346]
Document Augmentation for dense Retrieval (DAR) framework augments the representations of documents with their Dense Augmentation and perturbations.
We validate the performance of DAR on retrieval tasks with two benchmark datasets, showing that the proposed DAR significantly outperforms relevant baselines on the dense retrieval of both the labeled and unlabeled documents.
arXiv Detail & Related papers (2022-03-15T09:07:38Z) - DocSynth: A Layout Guided Approach for Controllable Document Image
Synthesis [16.284895792639137]
This paper presents a novel approach, called Doc Synth, to automatically synthesize document images based on a given layout.
In this work, given a spatial layout (bounding boxes with object categories) as a reference by the user, our proposed Doc Synth model learns to generate a set of realistic document images.
The results highlight that our model can successfully generate realistic and diverse document images with multiple objects.
arXiv Detail & Related papers (2021-07-06T14:24:30Z) - Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets.
This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.
We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.