Related papers: Recurrent Few-Shot model for Document Verification

Related papers

Model Editing for New Document Integration in Generative Information Retrieval [110.90609826290968]
Generative retrieval (GR) reformulates the Information Retrieval (IR) task as the generation of document identifiers (docIDs). Existing GR models exhibit poor generalization to newly added documents, often failing to generate the correct docIDs. We propose DOME, a novel method that effectively and efficiently adapts GR models to unseen documents.
arXiv Detail & Related papers (2026-03-03T09:13:38Z)
Resolving Evidence Sparsity: Agentic Context Engineering for Long-Document Understanding [49.26132236798123]
Vision Language Models (VLMs) have gradually become a primary approach in document understanding. We propose SLEUTH, a multi-agent framework that orchestrates a retriever and four collaborative agents in a coarse-to-fine process. The framework identifies key textual and visual clues within the retrieved pages, filters for salient visual evidence such as tables and charts, and analyzes the query to devise a reasoning strategy.
arXiv Detail & Related papers (2025-11-28T03:09:40Z)
Improving Document Retrieval Coherence for Semantically Equivalent Queries [63.97649988164166]
We propose a variation of the Multi-Negative Ranking loss for training DR that improves the coherence of models in retrieving the same documents. The loss penalizes discrepancies between the top-k ranked documents retrieved for diverse but semantically equivalent queries.
arXiv Detail & Related papers (2025-08-11T13:34:59Z)
DREAM: Document Reconstruction via End-to-end Autoregressive Model [53.51754520966657]
We present an innovative autoregressive model specifically designed for document reconstruction, referred to as Document Reconstruction via End-to-end Autoregressive Model (DREAM). We establish a standardized definition of the document reconstruction task, and introduce a novel Document Similarity Metric (DSM) and DocRec1K dataset for assessing the performance of the task.
arXiv Detail & Related papers (2025-07-08T09:24:07Z)
DocMEdit: Towards Document-Level Model Editing [38.97953188421146]
We introduce DocMEdit, a dataset focused on document-level model editing. Results show that the difficulties in document-level model editing pose challenges for existing model editing methods.
arXiv Detail & Related papers (2025-05-26T06:37:24Z)
DocMIA: Document-Level Membership Inference Attacks against DocVQA Models [52.13818827581981]
We introduce two novel membership inference attacks tailored specifically to DocVQA models. Our methods outperform existing state-of-the-art membership inference attacks across a variety of DocVQA models and datasets.
arXiv Detail & Related papers (2025-02-06T00:58:21Z)
Zero-Shot Prompting and Few-Shot Fine-Tuning: Revisiting Document Image Classification Using Large Language Models [0.2517406173566782]
Classifying scanned documents is a challenging problem that involves image, layout, and text analysis for document understanding. For certain benchmark datasets, notably RVL-CDIP, the state of the art is closing in on near-perfect performance.
arXiv Detail & Related papers (2024-12-18T13:53:16Z)
You Only Submit One Image to Find the Most Suitable Generative Model [48.67303250592189]
We propose a novel setting called Generative Model Identification (GMI). GMI aims to enable the user to identify the most appropriate generative model(s) for the user's requirements efficiently.
arXiv Detail & Related papers (2024-12-16T14:46:57Z)
Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images. We identify model weaknesses by testing the model using the counterfactual image dataset. We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z)
Binarizing Documents by Leveraging both Space and Frequency [33.334956022229846]
Document Image Binarization is a well-known problem in Document Analysis and Computer Vision. We propose an alternative solution based on the recently introduced Fast Fourier Convolutions.
arXiv Detail & Related papers (2024-04-26T08:31:10Z)
Identifying and Mitigating Model Failures through Few-shot CLIP-aided Diffusion Generation [65.268245109828]
We propose an end-to-end framework to generate text descriptions of failure modes associated with spurious correlations. These descriptions can be used to generate synthetic data using generative models, such as diffusion models. Our experiments have shown remarkable improvements in accuracy (~21%) on hard sub-populations.
arXiv Detail & Related papers (2023-12-09T04:43:49Z)
On Task-personalized Multimodal Few-shot Learning for Visually-rich Document Entity Retrieval [59.25292920967197]
Visually-rich document entity retrieval (VDER) is an important topic in industrial NLP applications. FewVEX is a new dataset to boost future research in the field of entity-level few-shot VDER. We present a task-aware meta-learning based framework, with a central focus on achieving effective task personalization.
arXiv Detail & Related papers (2023-11-01T17:51:43Z)
IncDSI: Incrementally Updatable Document Retrieval [35.5697863674097]
IncDSI is a method to add documents in real time without retraining the model on the entire dataset. We formulate the addition of documents as a constrained optimization problem that makes minimal changes to the network parameters. Our approach is competitive with re-training the model on the whole dataset.
arXiv Detail & Related papers (2023-07-19T07:20:30Z)
Earning Extra Performance from Restrictive Feedbacks [41.05874087063763]
We set up a challenge named Earning eXtra PerformancE from restriCTive feEDdbacks (EXPECTED) to describe this form of model tuning problems. The goal of the model provider is to eventually deliver a satisfactory model to the local user(s) by utilizing the feedbacks. We propose to characterize the geometry of the model performance with regard to model parameters through exploring the parameters' distribution.
arXiv Detail & Related papers (2023-04-28T13:16:54Z)
Augmenting Document Representations for Dense Retrieval with Interpolation and Perturbation [49.940525611640346]
The Document Augmentation for dense Retrieval (DAR) framework augments the representations of documents with their interpolation and perturbation. We validate the performance of DAR on retrieval tasks with two benchmark datasets, showing that the proposed DAR significantly outperforms relevant baselines on the dense retrieval of both the labeled and unlabeled documents.
arXiv Detail & Related papers (2022-03-15T09:07:38Z)
One-shot Key Information Extraction from Document with Deep Partial Graph Matching [60.48651298832829]
Key Information Extraction (KIE) from documents improves efficiency, productivity, and security in many industrial scenarios. Existing supervised learning methods for the KIE task require a large number of labeled samples and learn separate models for different types of documents. We propose a deep end-to-end trainable network for one-shot KIE using partial graph matching.
arXiv Detail & Related papers (2021-09-26T07:45:53Z)
DocSynth: A Layout Guided Approach for Controllable Document Image Synthesis [16.284895792639137]
This paper presents a novel approach, called DocSynth, to automatically synthesize document images based on a given layout. In this work, given a spatial layout (bounding boxes with object categories) as a reference by the user, our proposed DocSynth model learns to generate a set of realistic document images. The results highlight that our model can successfully generate realistic and diverse document images with multiple objects.
arXiv Detail & Related papers (2021-07-06T14:24:30Z)
An Intelligent Hybrid Model for Identity Document Classification [0.0]
Digitization may provide opportunities (e.g., increase in productivity, disaster recovery, and environmentally friendly solutions) and challenges for businesses. One of the main challenges would be to accurately classify numerous scanned documents uploaded every day by customers. There are not many studies available to address the challenge as an application of image classification. The proposed approach has been implemented using Python and experimentally validated on synthetic and real-world datasets.
arXiv Detail & Related papers (2021-06-07T13:08:00Z)
Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets. This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets. We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z)

This list is automatically generated from the titles and abstracts of the papers on this site.