Document Understanding Dataset and Evaluation (DUDE)
- URL: http://arxiv.org/abs/2305.08455v3
- Date: Mon, 11 Sep 2023 10:36:41 GMT
- Title: Document Understanding Dataset and Evaluation (DUDE)
- Authors: Jordy Van Landeghem, Rubén Tito, Łukasz Borchmann, Michał Pietruszka, Paweł Józiak, Rafał Powalski, Dawid Jurkiewicz, Mickaël Coustaty, Bertrand Ackaert, Ernest Valveny, Matthew Blaschko, Sien Moens, Tomasz Stanisławek
- Abstract summary: Document Understanding Dataset and Evaluation (DUDE) seeks to remediate the halted research progress in understanding visually-rich documents (VRDs).
We present a new dataset with novelties related to types of questions, answers, and document layouts, based on multi-industry, multi-domain, and multi-page VRDs of various origins and dates.
- Score: 29.78902147806488
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We call on the Document AI (DocAI) community to reevaluate current
methodologies and embrace the challenge of creating more practically-oriented
benchmarks. Document Understanding Dataset and Evaluation (DUDE) seeks to
remediate the halted research progress in understanding visually-rich documents
(VRDs). We present a new dataset with novelties related to types of questions,
answers, and document layouts based on multi-industry, multi-domain, and
multi-page VRDs of various origins and dates. Moreover, we are pushing the
boundaries of current methods by creating multi-task and multi-domain
evaluation setups that more accurately simulate real-world situations where
powerful generalization and adaptation under low-resource settings are desired.
DUDE aims to set a new standard as a more practical, long-standing benchmark
for the community, and we hope that it will lead to future extensions and
contributions that address real-world challenges. Finally, our work illustrates
the importance of finding more efficient ways to model language, images, and
layout in DocAI.
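Document VQA benchmarks of this kind are commonly scored with Average Normalized Levenshtein Similarity (ANLS), which credits near-exact string answers and zeroes out low-similarity ones. The sketch below is a minimal, illustrative ANLS implementation, not the official DUDE evaluation script; the predictions/references layout is an assumption.

```python
# Minimal ANLS (Average Normalized Levenshtein Similarity) sketch for
# document VQA scoring. Illustrative only, not the official DUDE evaluation
# script; the predictions/references structure below is hypothetical.

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def anls(predictions: list[str], references: list[list[str]], tau: float = 0.5) -> float:
    """Average, over questions, of the best normalized similarity against any
    accepted ground-truth answer; similarities below tau count as 0."""
    scores = []
    for pred, gold_answers in zip(predictions, references):
        best = 0.0
        for gold in gold_answers:
            p, g = pred.strip().lower(), gold.strip().lower()
            denom = max(len(p), len(g)) or 1
            best = max(best, 1.0 - levenshtein(p, g) / denom)
        scores.append(best if best >= tau else 0.0)
    return sum(scores) / len(scores) if scores else 0.0

# Example: two questions, the second with two accepted answers.
print(anls(["2021", "acme corp"], [["2021"], ["ACME Corp.", "ACME Corporation"]]))
```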
Related papers
- GraphRevisedIE: Multimodal Information Extraction with Graph-Revised Network [3.9472311338123287]
Key information extraction from visually rich documents (VRD) has been a challenging task in document intelligence.
We propose a lightweight model named GraphRevisedIE that effectively embeds multimodal features such as textual, visual, and layout features from VRDs.
Extensive experiments on multiple real-world datasets show that GraphRevisedIE generalizes to documents of varied layouts and achieves comparable or better performance than previous KIE methods.
arXiv Detail & Related papers (2024-10-02T01:29:49Z) - MetaSumPerceiver: Multimodal Multi-Document Evidence Summarization for Fact-Checking [0.283600654802951]
We present a summarization model designed to generate claim-specific summaries useful for fact-checking from multimodal datasets.
We introduce a dynamic perceiver-based model that can handle inputs from multiple modalities of arbitrary lengths.
Our approach outperforms the SOTA approach by 4.6% in the claim verification task on the MOCHEG dataset.
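The "dynamic perceiver-based model" refers to the general perceiver idea: a small, fixed-size latent array cross-attends to an arbitrarily long sequence of multimodal input tokens, so the summary it produces has constant size regardless of input length. The sketch below illustrates only that mechanism; the dimensions and single-head attention are assumptions, not the paper's architecture.

```python
# Perceiver-style bottleneck sketch: a fixed set of latent vectors
# cross-attends to an arbitrarily long multimodal token sequence.
# Sizes and single-head attention are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d = 64          # shared embedding width
n_latents = 32  # fixed-size latent array (independent of input length)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(latents, inputs, wq, wk, wv):
    """latents: (n_latents, d); inputs: (n_tokens, d) of any length."""
    q = latents @ wq                      # (n_latents, d)
    k = inputs @ wk                       # (n_tokens, d)
    v = inputs @ wv                       # (n_tokens, d)
    attn = softmax(q @ k.T / np.sqrt(d))  # (n_latents, n_tokens)
    return latents + attn @ v             # residual update of the latents

# Fake "multimodal" sequence: text, image-region, and layout tokens already
# projected into the same d-dimensional space, then concatenated.
text_tokens   = rng.normal(size=(1200, d))
image_tokens  = rng.normal(size=(300, d))
layout_tokens = rng.normal(size=(150, d))
inputs = np.concatenate([text_tokens, image_tokens, layout_tokens], axis=0)

latents = rng.normal(size=(n_latents, d))
wq, wk, wv = (rng.normal(scale=d ** -0.5, size=(d, d)) for _ in range(3))
latents = cross_attend(latents, inputs, wq, wk, wv)
print(latents.shape)  # (32, 64): summary size is fixed regardless of input length
```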
arXiv Detail & Related papers (2024-07-18T01:33:20Z) - CoIR: A Comprehensive Benchmark for Code Information Retrieval Models [56.691926887209895]
We present CoIR (Code Information Retrieval Benchmark), a robust and comprehensive benchmark specifically designed to assess code retrieval capabilities.
CoIR comprises ten meticulously curated code datasets, spanning eight distinctive retrieval tasks across seven diverse domains.
We evaluate nine widely used retrieval models using CoIR, uncovering significant difficulties in performing code retrieval tasks even with state-of-the-art systems.
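Benchmarks of this kind typically score a retriever by ranking candidate code snippets for each query and reporting rank-based metrics such as MRR or nDCG. The snippet below is a generic sketch of MRR over cosine-similarity rankings with hypothetical query/corpus embeddings; it is not CoIR's actual evaluation code.

```python
# Generic retrieval-evaluation sketch (not CoIR's official code): rank a code
# corpus for each query by cosine similarity and report Mean Reciprocal Rank.
import numpy as np

def mean_reciprocal_rank(query_emb, corpus_emb, relevant_idx):
    """query_emb: (n_q, d); corpus_emb: (n_docs, d);
    relevant_idx[i]: index of the gold snippet for query i (single-gold case)."""
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    c = corpus_emb / np.linalg.norm(corpus_emb, axis=1, keepdims=True)
    sims = q @ c.T                   # (n_q, n_docs) cosine scores
    ranks = (-sims).argsort(axis=1)  # best-first document ordering per query
    rr = []
    for i, gold in enumerate(relevant_idx):
        rank = int(np.where(ranks[i] == gold)[0][0]) + 1  # 1-based rank of gold
        rr.append(1.0 / rank)
    return float(np.mean(rr))

# Toy example with random embeddings standing in for a real retrieval model.
rng = np.random.default_rng(0)
queries, corpus = rng.normal(size=(5, 128)), rng.normal(size=(100, 128))
print(mean_reciprocal_rank(queries, corpus, relevant_idx=[3, 17, 42, 8, 99]))
```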
arXiv Detail & Related papers (2024-07-03T07:58:20Z) - R-Eval: A Unified Toolkit for Evaluating Domain Knowledge of Retrieval Augmented Large Language Models [51.468732121824125]
Large language models have achieved remarkable success on general NLP tasks, but they may fall short for domain-specific problems.
Existing evaluation tools only provide a few baselines and evaluate them on various domains without mining the depth of domain knowledge.
In this paper, we address the challenges of evaluating RALLMs by introducing the R-Eval toolkit, a Python toolkit designed to streamline the evaluation of different RAG workflows.
arXiv Detail & Related papers (2024-06-17T15:59:49Z) - On Task-personalized Multimodal Few-shot Learning for Visually-rich
Document Entity Retrieval [59.25292920967197]
Few-shot visually-rich document entity retrieval (VDER) is an important topic in industrial NLP applications.
FewVEX is a new dataset to boost future research in the field of entity-level few-shot VDER.
We present a task-aware meta-learning based framework, with a central focus on achieving effective task personalization.
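In an entity-level few-shot setup like this, each "task" is an episode built from a handful of labeled support documents plus query documents, and the learner personalizes to that episode's entity types. The snippet below is a generic sketch of episode sampling with hypothetical field names; it is not the FewVEX format or the paper's framework.

```python
# Generic episode-sampling sketch for entity-level few-shot learning
# (hypothetical data structures; not the FewVEX format or the paper's code).
import random
from collections import defaultdict

def sample_episode(documents, n_way=5, k_shot=2, n_query=4, seed=0):
    """documents: list of dicts like {"tokens": [...], "entities": [{"type": str}]}.
    Returns an N-way K-shot support set plus a query set restricted to those types."""
    rng = random.Random(seed)
    by_type = defaultdict(list)
    for doc in documents:
        for ent_type in {e["type"] for e in doc["entities"]}:
            by_type[ent_type].append(doc)
    episode_types = rng.sample(sorted(by_type), k=n_way)  # this task's entity types
    support, query = [], []
    for t in episode_types:
        docs = rng.sample(by_type[t], k=min(k_shot + n_query, len(by_type[t])))
        support.extend(docs[:k_shot])
        query.extend(docs[k_shot:k_shot + n_query])
    return episode_types, support, query

# Toy usage: 60 fake single-entity documents covering six entity types.
docs = [{"tokens": ["..."], "entities": [{"type": t}]} for t in
        ["date", "total", "vendor", "invoice_no", "address", "tax"] * 10]
types, support, query = sample_episode(docs)
print(types, len(support), len(query))
```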
arXiv Detail & Related papers (2023-11-01T17:51:43Z) - Peek Across: Improving Multi-Document Modeling via Cross-Document
Question-Answering [49.85790367128085]
We pre-train a generic multi-document model with a novel cross-document question answering pre-training objective.
This novel multi-document QA formulation directs the model to better recover cross-text informational relations.
Unlike prior multi-document models that focus on either classification or summarization tasks, our pre-training objective formulation enables the model to perform tasks that involve both short text generation and long text generation.
arXiv Detail & Related papers (2023-05-24T17:48:40Z) - PDSum: Prototype-driven Continuous Summarization of Evolving
Multi-document Sets Stream [33.68263291948121]
We propose a new summarization problem, Evolving Multi-Document Sets Stream Summarization (EMDS).
We introduce a novel unsupervised algorithm PDSum with the idea of prototype-driven continuous summarization.
PDSum builds a lightweight prototype of each multi-document set and exploits it to adapt to new documents.
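The prototype idea can be pictured as keeping one lightweight centroid embedding per multi-document set, updating it incrementally as new documents stream in, and ranking sentences by closeness to it. The sketch below illustrates that general idea under assumed sentence embeddings; it is not PDSum's actual algorithm.

```python
# Generic prototype-driven summarization sketch (not PDSum's algorithm):
# keep a running-mean "prototype" per document set and extract the sentences
# closest to it. Sentence embeddings are assumed to come from any encoder.
import numpy as np

class SetPrototype:
    def __init__(self, dim):
        self.centroid = np.zeros(dim)
        self.count = 0

    def update(self, sentence_embs):
        """Incrementally fold a new document's sentence embeddings into the prototype."""
        for e in sentence_embs:
            self.count += 1
            self.centroid += (e - self.centroid) / self.count  # running mean

    def summarize(self, sentences, sentence_embs, k=3):
        """Pick the k sentences whose embeddings are most similar to the prototype."""
        c = self.centroid / (np.linalg.norm(self.centroid) + 1e-12)
        e = sentence_embs / (np.linalg.norm(sentence_embs, axis=1, keepdims=True) + 1e-12)
        order = np.argsort(-(e @ c))
        return [sentences[i] for i in order[:k]]

# Toy usage with random embeddings standing in for a real sentence encoder.
rng = np.random.default_rng(0)
proto = SetPrototype(dim=16)
sents = [f"sentence {i}" for i in range(10)]
embs = rng.normal(size=(10, 16))
proto.update(embs)                      # a new document arrives in the stream
print(proto.summarize(sents, embs, k=2))
```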
arXiv Detail & Related papers (2023-02-10T23:43:46Z) - WSL-DS: Weakly Supervised Learning with Distant Supervision for Query
Focused Multi-Document Abstractive Summarization [16.048329028104643]
In the Query Focused Multi-Document Summarization (QF-MDS) task, a set of documents and a query are given where the goal is to generate a summary from these documents.
One major challenge for this task is the lack of availability of labeled training datasets.
We propose a novel weakly supervised learning approach via utilizing distant supervision.
arXiv Detail & Related papers (2020-11-03T02:02:55Z) - SupMMD: A Sentence Importance Model for Extractive Summarization using
Maximum Mean Discrepancy [92.5683788430012]
SupMMD is a novel technique for generic and update summarization based on the maximum mean discrepancy (MMD) from kernel two-sample testing.
We show the efficacy of SupMMD in both generic and update summarization tasks by meeting or exceeding the current state-of-the-art on the DUC-2004 and TAC-2009 datasets.
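The maximum mean discrepancy from kernel two-sample testing measures how far apart two samples' kernel mean embeddings are: MMD^2(X, Y) = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)]. The snippet below is a minimal RBF-kernel estimate of that quantity, shown only to make the metric concrete; it is not the SupMMD training objective.

```python
# Minimal biased MMD^2 estimate with an RBF kernel (kernel two-sample test).
# Illustrates the quantity SupMMD builds on; not the SupMMD objective itself.
import numpy as np

def rbf_kernel(a, b, gamma=0.5):
    """k(x, y) = exp(-gamma * ||x - y||^2) evaluated for all pairs."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def mmd2(x, y, gamma=0.5):
    """Biased estimator: mean k(x,x') + mean k(y,y') - 2 * mean k(x,y)."""
    return (rbf_kernel(x, x, gamma).mean()
            + rbf_kernel(y, y, gamma).mean()
            - 2.0 * rbf_kernel(x, y, gamma).mean())

rng = np.random.default_rng(0)
same = rng.normal(0, 1, size=(200, 3))
shifted = rng.normal(1, 1, size=(200, 3))
print(mmd2(same, rng.normal(0, 1, size=(200, 3))))  # near 0: same distribution
print(mmd2(same, shifted))                          # clearly larger: distributions differ
```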
arXiv Detail & Related papers (2020-10-06T09:26:55Z) - SciREX: A Challenge Dataset for Document-Level Information Extraction [56.83748634747753]
It is challenging to create a large-scale information extraction dataset at the document level.
We introduce SciREX, a document level IE dataset that encompasses multiple IE tasks.
We develop a neural model as a strong baseline that extends previous state-of-the-art IE models to document-level IE.
arXiv Detail & Related papers (2020-05-01T17:30:10Z)