Towards Few-shot Entity Recognition in Document Images: A Label-aware
Sequence-to-Sequence Framework
- URL: http://arxiv.org/abs/2204.05819v1
- Date: Wed, 30 Mar 2022 18:30:42 GMT
- Authors: Zilong Wang, Jingbo Shang
- Abstract summary: We build an entity recognition model requiring only a few shots of annotated document images.
We develop a novel label-aware seq2seq framework, LASER.
Experiments on two benchmark datasets demonstrate the superiority of LASER under the few-shot setting.
- Score: 28.898240725099782
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Entity recognition is a fundamental task in understanding document images.
Traditional sequence labeling frameworks treat the entity types as class IDs
and rely on extensive data and high-quality annotations to learn semantics,
which are typically expensive in practice. In this paper, we aim to build an
entity recognition model requiring only a few shots of annotated document
images. To overcome the data limitation, we propose to leverage the label
surface names to better inform the model of the target entity type semantics
and also embed the labels into the spatial embedding space to capture the
spatial correspondence between regions and labels. Specifically, we go beyond
sequence labeling and develop a novel label-aware seq2seq framework, LASER. The
proposed model follows a new labeling scheme that generates the label surface
names word-by-word explicitly after generating the entities. During training,
LASER refines the label semantics by updating the label surface name
representations and also strengthens the label-region correlation. In this way,
LASER recognizes the entities from document images through both semantic and
layout correspondence. Extensive experiments on two benchmark datasets
demonstrate the superiority of LASER under the few-shot setting.
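The labeling scheme described in the abstract, where the label surface name is generated word by word after each entity, can be illustrated with a minimal sketch. The marker tokens `[B]`, `[T]`, and `[E]`, the function name `build_target`, and the example labels are hypothetical choices for illustration; the abstract does not specify the actual delimiters or target format used by LASER.

```python
from typing import List, Tuple

# Hypothetical marker tokens (assumptions, not taken from the paper).
ENTITY_START = "[B]"  # opens an entity span
LABEL_START = "[T]"   # separates the entity words from the label surface name
LABEL_END = "[E]"     # closes the label surface name


def build_target(words: List[str], spans: List[Tuple[int, int, str]]) -> List[str]:
    """Build a seq2seq target in which each entity is followed by its label name.

    `spans` holds (start, end, label_name) triples with inclusive word indices.
    The label surface name is split on whitespace and emitted word by word.
    """
    starts = {start for start, _, _ in spans}
    ends = {end: label for _, end, label in spans}
    target: List[str] = []
    for i, word in enumerate(words):
        if i in starts:
            target.append(ENTITY_START)
        target.append(word)
        if i in ends:
            target.append(LABEL_START)
            target.extend(ends[i].split())
            target.append(LABEL_END)
    return target


words = ["Invoice", "Date", ":", "30", "March", "2022"]
spans = [(0, 1, "question"), (3, 5, "answer")]
print(" ".join(build_target(words, spans)))
```

Because the label appears in the target as ordinary words rather than as a class ID, a decoder trained on such sequences can draw on the meaning of the label name itself, which is the intuition the abstract gives for the few-shot setting.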
Related papers
- Exploring Structured Semantic Prior for Multi Label Recognition with
Incomplete Labels [60.675714333081466]
Multi-label recognition (MLR) with incomplete labels is very challenging.
Recent works strive to explore the image-to-label correspondence in the vision-language model, i.e., CLIP, to compensate for insufficient annotations.
We advocate remedying the deficiency of label supervision for the MLR with incomplete labels by deriving a structured semantic prior.
arXiv Detail & Related papers (2023-03-23T12:39:20Z)
- Label Semantics for Few Shot Named Entity Recognition [68.01364012546402]
We study the problem of few shot learning for named entity recognition.
We leverage the semantic information in the names of the labels as a way of giving the model additional signal and enriched priors.
Our model learns to match the representations of named entities computed by the first encoder with label representations computed by the second encoder.
arXiv Detail & Related papers (2022-03-16T23:21:05Z)
- Semantic-Aware Representation Blending for Multi-Label Image Recognition
with Partial Labels [86.17081952197788]
We propose to blend category-specific representation across different images to transfer information of known labels to complement unknown labels.
Experiments on the MS-COCO, Visual Genome, Pascal VOC 2007 datasets show that the proposed SARB framework obtains superior performance over current leading competitors.
arXiv Detail & Related papers (2022-03-04T07:56:16Z)
- A Label Dependence-aware Sequence Generation Model for Multi-level
Implicit Discourse Relation Recognition [31.179555215952306]
Implicit discourse relation recognition is a challenging but crucial task in discourse analysis.
We propose a Label Dependence-aware Sequence Generation Model (LDSGM) for it.
We develop a mutual learning enhanced training method to exploit the label dependence in a bottom-up direction.
arXiv Detail & Related papers (2021-12-22T09:14:03Z)
- Structured Semantic Transfer for Multi-Label Recognition with Partial
Labels [85.6967666661044]
We propose a structured semantic transfer (SST) framework that enables training multi-label recognition models with partial labels.
The framework consists of two complementary transfer modules that explore within-image and cross-image semantic correlations.
Experiments on the Microsoft COCO, Visual Genome and Pascal VOC datasets show that the proposed SST framework obtains superior performance over current state-of-the-art algorithms.
arXiv Detail & Related papers (2021-12-21T02:15:01Z)
- Few-shot Slot Tagging with Collapsed Dependency Transfer and
Label-enhanced Task-adaptive Projection Network [61.94394163309688]
We propose a Label-enhanced Task-Adaptive Projection Network (L-TapNet) based on the state-of-the-art few-shot classification model -- TapNet.
Experimental results show that our model significantly outperforms the strongest few-shot learning baseline by 14.64 F1 scores in the one-shot setting.
arXiv Detail & Related papers (2020-06-10T07:50:44Z)
- Hierarchical Image Classification using Entailment Cone Embeddings [68.82490011036263]
We first inject label-hierarchy knowledge into an arbitrary CNN-based classifier.
We empirically show that availability of such external semantic information in conjunction with the visual semantics from images boosts overall performance.
arXiv Detail & Related papers (2020-04-02T10:22:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.