ICDAR 2021 Competition on Components Segmentation Task of Document
Photos
- URL: http://arxiv.org/abs/2106.08499v1
- Date: Wed, 16 Jun 2021 00:49:58 GMT
- Title: ICDAR 2021 Competition on Components Segmentation Task of Document
Photos
- Authors: Celso A. M. Lopes Junior, Ricardo B. das Neves Junior, Byron L. D.
Bezerra, Alejandro H. Toselli, Donato Impedovo
- Abstract summary: Three challenge tasks were proposed entailing different segmentation assignments to be performed on a provided dataset.
The collected data are from several types of Brazilian ID documents, whose personal information was conveniently replaced.
Different Deep Learning models were applied by the entrants with diverse strategies to achieve the best results in each of the tasks.
- Score: 63.289361617237944
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper describes the short-term competition on Components Segmentation
Task of Document Photos that was prepared in the context of the 16th
International Conference on Document Analysis and Recognition (ICDAR 2021).
This competition aims to bring together researchers working in the field of
identification document image processing and provide them with a suitable benchmark
to compare their techniques on the component segmentation task of document
images. Three challenge tasks were proposed entailing different segmentation
assignments to be performed on a provided dataset. The collected data are from
several types of Brazilian ID documents, whose personal information was
conveniently replaced. There were 16 participants whose results obtained for
some or all the three tasks show different rates for the adopted metrics, like
Dice Similarity Coefficient ranging from 0.06 to 0.99. Different Deep Learning
models were applied by the entrants with diverse strategies to achieve the best
results in each of the tasks. The results obtained show that the methods currently
applied to one of the proposed tasks (document boundary detection) are
already well established. However, for the other two challenge tasks (text zone
and handwritten sign detection) research and development of more robust
approaches are still required to achieve acceptable results.
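The abstract reports the Dice Similarity Coefficient (DSC) as one of the adopted metrics. As a rough illustration of what that number measures, the sketch below computes DSC = 2|A ∩ B| / (|A| + |B|) for two binary masks. It is not the competition's evaluation code: the function name `dice_coefficient`, the toy masks, and the convention of returning 1.0 when both masks are empty are all assumptions made for this example.

```python
def dice_coefficient(pred, target):
    """Dice Similarity Coefficient between two binary masks.

    Masks are nested lists (rows of 0/1 values) with the same number of
    pixels. Returning 1.0 for two empty masks is an assumed convention.
    """
    pred_flat = [bool(v) for row in pred for v in row]
    target_flat = [bool(v) for row in target for v in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as foreground in both masks.
    intersection = sum(p and t for p, t in zip(pred_flat, target_flat))
    total = sum(pred_flat) + sum(target_flat)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy 3x4 masks (made-up values, not competition data).
ground_truth = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
prediction = [
    [0, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
score = dice_coefficient(prediction, ground_truth)
print(f"DSC = {score:.3f}")  # prints "DSC = 0.750"
```

Here three foreground pixels overlap and each mask has four foreground pixels, so DSC = 2 * 3 / (4 + 4) = 0.75. A score near 0.99 therefore means nearly perfect overlap, while a score near 0.06 means the predicted and reference regions barely intersect.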
Related papers
- Unified Multi-Modal Interleaved Document Representation for Information Retrieval [57.65409208879344]
We produce more comprehensive and nuanced document representations by holistically embedding documents interleaved with different modalities.
Specifically, we achieve this by leveraging the capability of recent vision-language models that enable the processing and integration of text, images, and tables into a unified format and representation.
arXiv Detail & Related papers (2024-10-03T17:49:09Z)
- On Task-personalized Multimodal Few-shot Learning for Visually-rich Document Entity Retrieval [59.25292920967197]
Visually-rich document entity retrieval (VDER) is an important topic in industrial NLP applications.
FewVEX is a new dataset to boost future research in the field of entity-level few-shot VDER.
We present a task-aware meta-learning based framework, with a central focus on achieving effective task personalization.
arXiv Detail & Related papers (2023-11-01T17:51:43Z)
- Enhancing Document Information Analysis with Multi-Task Pre-training: A Robust Approach for Information Extraction in Visually-Rich Documents [8.49076413640561]
The model is pre-trained and subsequently fine-tuned for various document image analysis tasks.
The proposed model achieved impressive results across all tasks, with an accuracy of 95.87% on the RVL-CDIP dataset for document classification.
arXiv Detail & Related papers (2023-10-25T10:22:30Z)
- EFaR 2023: Efficient Face Recognition Competition [51.77649060180531]
The paper presents the summary of the Efficient Face Recognition Competition (EFaR) held at the 2023 International Joint Conference on Biometrics (IJCB 2023).
The competition received 17 submissions from 6 different teams.
The submitted solutions are ranked based on a weighted score of the achieved verification accuracies on a diverse set of benchmarks, as well as the deployability given by the number of floating-point operations and model size.
arXiv Detail & Related papers (2023-08-08T09:58:22Z)
- ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images [198.35937007558078]
The competition opened on 30th December, 2022 and closed on 24th March, 2023.
There are 35 participants and 91 valid submissions received for Track 1, and 15 participants and 26 valid submissions received for Track 2.
According to the performance of the submissions, we believe there is still a large gap on the expected information extraction performance for complex and zero-shot scenarios.
arXiv Detail & Related papers (2023-06-05T22:20:52Z)
- AIMS: All-Inclusive Multi-Level Segmentation [93.5041381700744]
We propose a new task, All-Inclusive Multi-Level Segmentation (AIMS), which segments visual regions into three levels: part, entity, and relation.
We also build a unified AIMS model through multi-dataset multi-task training to address the two major challenges of annotation inconsistency and task correlation.
arXiv Detail & Related papers (2023-05-28T16:28:49Z)
- ICDAR 2023 Competition on Robust Layout Segmentation in Corporate Documents [3.6700088931938835]
ICDAR has a long tradition in hosting competitions to benchmark the state-of-the-art.
To raise the bar over previous competitions, we engineered a hard competition dataset and proposed the recent DocLayNet dataset for training.
We recognize interesting combinations of recent computer vision models, data augmentation strategies and ensemble methods to achieve remarkable accuracy in the task we posed.
arXiv Detail & Related papers (2023-05-24T09:56:47Z)
- A Survey of Historical Document Image Datasets [2.8707038627097226]
This paper presents a systematic literature review of image datasets for document image analysis.
It focuses on historical documents, such as handwritten manuscripts and early prints.
Finding appropriate datasets for historical document analysis is a crucial prerequisite to facilitate research using different machine learning algorithms.
arXiv Detail & Related papers (2022-03-16T09:56:48Z)
- A Fast Fully Octave Convolutional Neural Network for Document Image Segmentation [1.8426817621478804]
We investigate a method based on U-Net to detect the document edges and text regions in ID images.
We propose a model optimization based on Octave Convolutions to adapt the method to situations where storage, processing, and time resources are limited.
Our results showed that the proposed models are efficient for document segmentation tasks and portable.
arXiv Detail & Related papers (2020-04-03T00:57:33Z)