TOE: A Grid-Tagging Discontinuous NER Model Enhanced by Embedding
Tag/Word Relations and More Fine-Grained Tags
- URL: http://arxiv.org/abs/2211.00684v1
- Date: Tue, 1 Nov 2022 18:17:49 GMT
- Title: TOE: A Grid-Tagging Discontinuous NER Model Enhanced by Embedding
Tag/Word Relations and More Fine-Grained Tags
- Authors: Jiang Liu, Donghong Ji, Jingye Li, Dongdong Xie, Chong Teng, Liang
Zhao and Fei Li
- Abstract summary: We propose a competitive grid-tagging model for discontinuous named entity recognition.
We incorporate two kinds of Tag-Oriented Enhancement mechanisms into a state-of-the-art (SOTA) grid-tagging model.
Our model improves the SOTA F1 results by about 0.83%, 0.05% and 0.66%, demonstrating its effectiveness.
- Score: 32.95446649391798
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: So far, discontinuous named entity recognition (NER) has received increasing
research attention, and many related methods have emerged, such as
hypergraph-based, span-based, and sequence-to-sequence (Seq2Seq) methods.
However, these methods suffer to varying degrees from problems such as
decoding ambiguity and inefficiency, which limit their performance. Recently,
grid-tagging methods, which benefit from the flexible design of tagging
systems and model architectures, have shown superior adaptability to various
information extraction tasks. In this paper, we follow this line of methods
and propose a competitive grid-tagging model for
discontinuous NER. We call our model TOE because we incorporate two kinds of
Tag-Oriented Enhancement mechanisms into a state-of-the-art (SOTA) grid-tagging
model that casts the NER problem into word-word relationship prediction. First,
we design a Tag Representation Embedding Module (TREM) to force our model to
consider not only word-word relationships but also word-tag and tag-tag
relationships. Concretely, we construct tag representations and embed them into
TREM, so that TREM can treat tag and word representations as
queries/keys/values and utilize self-attention to model their relationships. On
the other hand, motivated by the Next-Neighboring-Word (NNW) and Tail-Head-Word
(THW) tags in the SOTA model, we add two new symmetric tags, namely
Previous-Neighboring-Word (PNW) and Head-Tail-Word (HTW), to model more
fine-grained word-word relationships and alleviate error propagation from tag
prediction. In experiments on three benchmark datasets, namely CADEC,
ShARe13 and ShARe14, our TOE model improves the SOTA F1 results by about
0.83%, 0.05% and 0.66%, respectively, demonstrating its effectiveness.
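For intuition, the following is a minimal PyTorch sketch of the tag-oriented self-attention idea the abstract describes: tag representations are embedded, concatenated with the word representations, and passed through self-attention so that word-word, word-tag and tag-tag relationships are modeled jointly. The class and parameter names (TagWordSelfAttention, num_tags, hidden_dim) are illustrative assumptions, not the authors' actual TREM implementation.

```python
# Hedged sketch: tag embeddings are concatenated with word representations and a
# standard self-attention layer lets words attend to tags (and vice versa), so
# word-word, word-tag and tag-tag relations are modeled in one pass.
# Names (TagWordSelfAttention, num_tags, hidden_dim) are illustrative only.
import torch
import torch.nn as nn


class TagWordSelfAttention(nn.Module):
    def __init__(self, hidden_dim, num_tags, num_heads=4):
        super().__init__()
        # One learnable representation per grid tag (e.g. NONE, NNW, PNW, THW, HTW).
        self.tag_embeddings = nn.Embedding(num_tags, hidden_dim)
        self.attention = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)

    def forward(self, word_reprs):
        # word_reprs: [batch, seq_len, hidden_dim], e.g. contextual encoder outputs.
        batch_size = word_reprs.size(0)
        tag_reprs = self.tag_embeddings.weight.unsqueeze(0).expand(batch_size, -1, -1)
        # Concatenate tags and words into one sequence so self-attention sees both.
        joint = torch.cat([tag_reprs, word_reprs], dim=1)
        enhanced, _ = self.attention(joint, joint, joint)
        num_tags = tag_reprs.size(1)
        # Split back into tag-enhanced and word-enhanced representations.
        return enhanced[:, :num_tags], enhanced[:, num_tags:]


# Usage sketch: 5 grid tags, 128-dim hidden states, a batch of 2 ten-word sentences.
module = TagWordSelfAttention(hidden_dim=128, num_tags=5)
tags_out, words_out = module(torch.randn(2, 10, 128))
print(tags_out.shape, words_out.shape)  # torch.Size([2, 5, 128]) torch.Size([2, 10, 128])
```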
Related papers
- FLIP: Fine-grained Alignment between ID-based Models and Pretrained Language Models for CTR Prediction [49.510163437116645]
Click-through rate (CTR) prediction serves as a core functional module in personalized online services.
Traditional ID-based models for CTR prediction take as inputs the one-hot encoded ID features of tabular modality.
Pretrained Language Models (PLMs) have given rise to another paradigm, which takes as inputs sentences of the textual modality.
We propose to conduct Fine-grained feature-level ALignment between ID-based Models and Pretrained Language Models (FLIP) for CTR prediction.
arXiv Detail & Related papers (2023-10-30T11:25:03Z)
- 3D-STMN: Dependency-Driven Superpoint-Text Matching Network for
End-to-End 3D Referring Expression Segmentation [33.20461146674787]
In 3D Referring Expression (3D-RES), the earlier approach adopts a two-stage paradigm, extracting segmentation proposals and then matching them with referring expressions.
We introduce an innovative end-to-end Superpoint-Text Matching Network (3D-STMN) that is enriched by dependency-driven insights.
Our model not only sets new performance standards, registering an mIoU gain of 11.7 points, but also achieves a striking improvement in inference speed, surpassing traditional methods by 95.7 times.
arXiv Detail & Related papers (2023-08-31T11:00:03Z)
- Contextual Dictionary Lookup for Knowledge Graph Completion [32.493168863565465]
Knowledge graph completion (KGC) aims to solve the incompleteness of knowledge graphs (KGs) by predicting missing links from known triples.
Most existing embedding models map each relation into a unique vector, overlooking its fine-grained semantics under different entities.
We present a novel method utilizing contextual dictionary lookup, enabling conventional embedding models to learn fine-grained semantics of relations in an end-to-end manner.
arXiv Detail & Related papers (2023-06-13T12:13:41Z)
- Prototype-based Embedding Network for Scene Graph Generation [105.97836135784794]
Current Scene Graph Generation (SGG) methods explore contextual information to predict relationships among entity pairs.
Due to the diverse visual appearance of numerous possible subject-object combinations, there is a large intra-class variation within each predicate category.
Prototype-based Embedding Network (PE-Net) models entities/predicates with prototype-aligned compact and distinctive representations.
Prototype-guided Learning (PL) is introduced to help PE-Net efficiently learn such entity-predicate matching, and Prototype Regularization (PR) is devised to relieve ambiguous entity-predicate matching.
arXiv Detail & Related papers (2023-03-13T13:30:59Z)
- Joint Multimodal Entity-Relation Extraction Based on Edge-enhanced Graph
Alignment Network and Word-pair Relation Tagging [19.872199943795195]
This paper is the first to propose performing multimodal named entity recognition (MNER) and multimodal relation extraction (MRE) as a joint multimodal entity-relation extraction task.
The proposed method can leverage edge information to assist the alignment between objects and entities.
arXiv Detail & Related papers (2022-11-28T03:23:54Z)
- Ground Truth Inference for Weakly Supervised Entity Matching [76.6732856489872]
We propose a simple but powerful labeling model for weak supervision tasks.
We then tailor the labeling model specifically to the task of entity matching.
We show that our labeling model results in a 9% higher F1 score on average than the best existing method.
arXiv Detail & Related papers (2022-11-13T17:57:07Z)
- SpanProto: A Two-stage Span-based Prototypical Network for Few-shot
Named Entity Recognition [45.012327072558975]
Few-shot Named Entity Recognition (NER) aims to identify named entities with very little annotated data.
We propose a seminal span-based prototypical network (SpanProto) that tackles few-shot NER via a two-stage approach.
In the span extraction stage, we transform the sequential tags into a global boundary matrix, enabling the model to focus on the explicit boundary information.
For mention classification, we leverage prototypical learning to capture the semantic representations for each labeled span and make the model better adapt to novel-class entities.
arXiv Detail & Related papers (2022-10-17T12:59:33Z)
- Unified Named Entity Recognition as Word-Word Relation Classification [25.801945832005504]
We present a novel alternative by modeling the unified NER as word-word relation classification, namely W2NER.
The architecture resolves the kernel bottleneck of unified NER by effectively modeling the neighboring relations between entity words.
Based on the W2NER scheme, we develop a neural framework in which the unified NER is modeled as a 2D grid of word pairs.
arXiv Detail & Related papers (2021-12-19T06:11:07Z)
- Pack Together: Entity and Relation Extraction with Levitated Marker [61.232174424421025]
We propose a novel span representation approach, named Packed Levitated Markers, to consider the dependencies between the spans (pairs) by strategically packing the markers in the encoder.
Our experiments show that our model with packed levitated markers outperforms the sequence labeling model by 0.4%-1.9% F1 on three flat NER tasks, and beats the token concat model on six NER benchmarks.
arXiv Detail & Related papers (2021-09-13T15:38:13Z)
- SChME at SemEval-2020 Task 1: A Model Ensemble for Detecting Lexical
Semantic Change [58.87961226278285]
This paper describes SChME, a method used in SemEval-2020 Task 1 on unsupervised detection of lexical semantic change.
SChME uses a model ensemble combining signals from distributional models (word embeddings) and word-frequency models, where each model casts a vote indicating the probability that a word suffered semantic change according to that feature.
arXiv Detail & Related papers (2020-12-02T23:56:34Z)
- Keyphrase Extraction with Dynamic Graph Convolutional Networks and
Diversified Inference [50.768682650658384]
Keyphrase extraction (KE) aims to summarize a set of phrases that accurately express a concept or a topic covered in a given document.
The recent Sequence-to-Sequence (Seq2Seq)-based generative framework is widely used in the KE task and has obtained competitive performance on various benchmarks.
In this paper, we propose to adopt the Dynamic Graph Convolutional Networks (DGCN) to solve the above two problems simultaneously.
arXiv Detail & Related papers (2020-10-24T08:11:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.