OPAD: An Optimized Policy-based Active Learning Framework for Document
Content Analysis
- URL: http://arxiv.org/abs/2110.02069v1
- Date: Fri, 1 Oct 2021 07:40:56 GMT
- Title: OPAD: An Optimized Policy-based Active Learning Framework for Document
Content Analysis
- Authors: Sumit Shekhar, Bhanu Prakash Reddy Guda, Ashutosh Chaubey, Ishan
Jindal, Avanish Jain
- Abstract summary: We propose OPAD, a novel framework using reinforcement policy for active learning in content detection tasks for documents.
The framework learns the acquisition function to decide the samples to be selected while optimizing performance metrics.
We show superior performance of the proposed OPAD framework for active learning on various tasks related to document understanding.
- Score: 6.159771892460152
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Documents are central to many business systems, and include forms, reports,
contracts, invoices or purchase orders. The information in documents is
typically in natural language, but can be organized in various layouts and
formats. There has been a recent spurt of interest in understanding document
content with novel deep learning architectures. However, document understanding
tasks need dense information annotations, which are costly to scale and
generalize. Several active learning techniques have been proposed to reduce the
overall budget of annotation while maintaining the performance of the
underlying deep learning model. However, most of these techniques work only for
classification problems; content detection is a more complex task and has
been scarcely explored in the active learning literature. In this paper, we propose
OPAD, a novel framework using reinforcement policy for active learning
in content detection tasks for documents. The proposed framework learns the
acquisition function to decide the samples to be selected while optimizing
performance metrics that the tasks typically have. Furthermore, we extend the
framework to weak labelling scenarios to further reduce the cost of annotation
significantly. We propose novel rewards to account for class imbalance and user
feedback in the annotation interface, to improve the active learning method. We
show superior performance of the proposed OPAD framework for active
learning for various tasks related to document understanding like layout
parsing, object detection and named entity recognition. Ablation studies for
human feedback and class imbalance rewards are presented, along with a
comparison of annotation times for different approaches.
Related papers
- On Task-personalized Multimodal Few-shot Learning for Visually-rich
Document Entity Retrieval [59.25292920967197]
Few-shot visually-rich document entity retrieval (VDER) is an important topic in industrial NLP applications.
FewVEX is a new dataset to boost future research in the field of entity-level few-shot VDER.
We present a task-aware meta-learning based framework, with a central focus on achieving effective task personalization.
arXiv Detail & Related papers (2023-11-01T17:51:43Z)
- Information Extraction from Documents: Question Answering vs Token
Classification in real-world setups [0.0]
We compare the Question Answering approach with the classical token classification approach for document key information extraction.
Our research showed that when dealing with clean and relatively short entities, it is still best to use a token classification-based approach.
arXiv Detail & Related papers (2023-04-21T14:43:42Z)
- Document Provenance and Authentication through Authorship Classification [5.2545206693029884]
We propose an ensemble-based text-processing framework for the classification of single and multi-authored documents.
The proposed framework incorporates several state-of-the-art text classification algorithms.
The framework is evaluated on a large-scale benchmark dataset.
arXiv Detail & Related papers (2023-03-02T12:26:03Z)
- Active Learning for Abstractive Text Summarization [50.79416783266641]
We propose the first effective query strategy for Active Learning in abstractive text summarization.
We show that using our strategy in AL annotation helps to improve the model performance in terms of ROUGE and consistency scores.
arXiv Detail & Related papers (2023-01-09T10:33:14Z)
- TRIE++: Towards End-to-End Information Extraction from Visually Rich
Documents [51.744527199305445]
This paper proposes a unified end-to-end information extraction framework from visually rich documents.
Text reading and information extraction can reinforce each other via a well-designed multi-modal context block.
The framework can be trained end-to-end, achieving global optimization.
arXiv Detail & Related papers (2022-07-14T08:52:07Z)
- Unified Pretraining Framework for Document Understanding [52.224359498792836]
We present UDoc, a new unified pretraining framework for document understanding.
UDoc is designed to support most document understanding tasks, extending the Transformer to take multimodal embeddings as input.
An important feature of UDoc is that it learns a generic representation by making use of three self-supervised losses.
arXiv Detail & Related papers (2022-04-22T21:47:04Z)
- Assisted Text Annotation Using Active Learning to Achieve High Quality
with Little Effort [9.379650501033465]
We propose a tool that enables researchers to create large, high-quality, annotated datasets with only a few manual annotations.
We combine an active learning (AL) approach with a pre-trained language model to semi-automatically identify annotation categories.
Our preliminary results show that employing AL strongly reduces the number of annotations for correct classification of even complex and subtle frames.
arXiv Detail & Related papers (2021-12-15T13:14:58Z)
- Knowledge-Aware Meta-learning for Low-Resource Text Classification [87.89624590579903]
This paper studies a low-resource text classification problem and bridges the gap between meta-training and meta-testing tasks.
We propose KGML to introduce additional representation for each sentence learned from the extracted sentence-specific knowledge graph.
arXiv Detail & Related papers (2021-09-10T07:20:43Z)
- Dynamic Semantic Matching and Aggregation Network for Few-shot Intent
Detection [69.2370349274216]
Few-shot Intent Detection is challenging due to the scarcity of available annotated utterances.
Semantic components are distilled from utterances via multi-head self-attention.
Our method provides a comprehensive matching measure to enhance representations of both labeled and unlabeled instances.
arXiv Detail & Related papers (2020-10-06T05:16:38Z)
- OLALA: Object-Level Active Learning for Efficient Document Layout
Annotation [24.453873808984415]
We propose an Object-Level Active Learning framework for efficient document layout annotation.
In this framework, only regions with the most ambiguous object predictions within an image are selected for annotators to label.
For unselected predictions, a semi-automatic correction algorithm is proposed to identify certain errors based on prior knowledge of layout structures.
arXiv Detail & Related papers (2020-10-05T03:48:07Z)