Minimize Exposure Bias of Seq2Seq Models in Joint Entity and Relation
Extraction
- URL: http://arxiv.org/abs/2009.07503v2
- Date: Tue, 6 Oct 2020 08:56:20 GMT
- Title: Minimize Exposure Bias of Seq2Seq Models in Joint Entity and Relation
Extraction
- Authors: Ranran Haoran Zhang, Qianying Liu, Aysa Xuemo Fan, Heng Ji, Daojian
Zeng, Fei Cheng, Daisuke Kawahara and Sadao Kurohashi
- Abstract summary: Joint entity and relation extraction aims to extract relation triplets from plain text directly.
We propose a novel Sequence-to-Unordered-Multi-Tree (Seq2UMTree) model to minimize the effects of exposure bias.
- Score: 57.22929457171352
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Joint entity and relation extraction aims to extract relation triplets from
plain text directly. Prior work leverages Sequence-to-Sequence (Seq2Seq) models
for triplet sequence generation. However, Seq2Seq enforces an unnecessary order
on the unordered triplets and involves a large decoding length associated with
error accumulation. These introduce exposure bias, which may cause the models
to overfit to frequent label combinations, thus degrading
generalization. We propose a novel Sequence-to-Unordered-Multi-Tree
(Seq2UMTree) model to minimize the effects of exposure bias by limiting the
decoding length to three within a triplet and removing the order among
triplets. We evaluate our model on two datasets, DuIE and NYT, and
systematically study how exposure bias alters the performance of Seq2Seq
models. Experiments show that the state-of-the-art Seq2Seq model overfits to
both datasets while Seq2UMTree shows significantly better generalization. Our
code is available at https://github.com/WindChimeRan/OpenJERE .
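The core mechanism is simple to illustrate: each triplet is decoded as its own depth-3 tree (subject → relation → object), and the triplets are collected as an unordered set, so no decoding order over triplets can be memorized. The sketch below is a minimal illustration of that idea, not the released OpenJERE code; the encoder and the three prediction heads are hypothetical placeholders.

```python
# Minimal sketch of unordered multi-tree decoding: each triplet is a
# depth-3 tree (subject -> relation -> object), so the decoding length
# within a triplet is fixed at three, and triplets form an unordered
# set. All components are placeholders, not the OpenJERE implementation.

def decode_triplets(sentence, encoder, subject_head, relation_head, object_head):
    """Decode a *set* of (subject, relation, object) triplets."""
    h = encoder(sentence)                          # contextual encoding
    triplets = set()
    for subj in subject_head(h):                   # tree level 1: subjects
        for rel in relation_head(h, subj):         # tree level 2: relations
            for obj in object_head(h, subj, rel):  # tree level 3: objects
                triplets.add((subj, rel, obj))
    return triplets

# Toy usage with dummy heads (a trained model would supply real ones):
enc = lambda s: s.split()
subjects = lambda h: ["Obama"]
relations = lambda h, s: ["born_in"]
objects_ = lambda h, s, r: ["Hawaii"]
print(decode_triplets("Obama was born in Hawaii", enc, subjects, relations, objects_))
# {('Obama', 'born_in', 'Hawaii')}
```

Because supervision attaches to each tree independently, a frequent triplet ordering in the training data has nothing to bind to.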
Related papers
- S2F-NER: Exploring Sequence-to-Forest Generation for Complex Entity
Recognition [47.714230389689064]
We propose a novel Sequence-to-Forest generation paradigm, S2F-NER, which can directly extract entities in a sentence via a Forest decoder.
Specifically, our model generates each path of each tree in the forest autoregressively, where the maximum depth of each tree is three.
Based on this novel paradigm, our model elegantly mitigates the exposure bias problem while keeping the simplicity of Seq2Seq.
arXiv Detail & Related papers (2023-10-29T09:09:10Z)
- Joint Entity and Relation Extraction with Span Pruning and Hypergraph
Neural Networks [58.43972540643903]
We propose a HyperGraph neural network for ERE ($hgnn$), which is built upon PL-Marker (a state-of-the-art marker-based pipeline model).
To alleviate error propagation, we use a high-recall pruner mechanism to transfer the burden of entity identification and labeling from the NER module to the joint module of our model.
Experiments on three widely used benchmarks for the ERE task show significant improvements over the previous state-of-the-art PL-Marker.
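The pruning step is easy to picture in code: score every candidate span and keep anything above a deliberately low threshold, so recall stays high and the final entity decisions move from the NER module to the joint module. This is a generic sketch of such a high-recall pruner under assumed interfaces, not the paper's $hgnn$ implementation; the scorer, `max_width`, and `threshold` are illustrative.

```python
# Generic sketch of a high-recall span pruner: enumerate spans up to a
# maximum width, score them, and keep everything above a low threshold.
# Entity labeling is deferred to a downstream joint module. The scoring
# function and hyperparameters are illustrative assumptions.

def prune_spans(tokens, span_score, max_width=8, threshold=0.05):
    """Return candidate (start, end) spans, favoring recall over precision."""
    candidates = []
    for start in range(len(tokens)):
        for end in range(start, min(start + max_width, len(tokens))):
            if span_score(tokens, start, end) >= threshold:  # low bar
                candidates.append((start, end))
    return candidates

# Toy usage: a dummy scorer that favors capitalized single tokens.
score = lambda toks, i, j: 1.0 if i == j and toks[i][:1].isupper() else 0.04
print(prune_spans("Barack Obama visited Paris".split(), score))
# [(0, 0), (1, 1), (3, 3)]
```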
arXiv Detail & Related papers (2023-10-26T08:36:39Z)
- Mutual Exclusivity Training and Primitive Augmentation to Induce
Compositionality [84.94877848357896]
Recent datasets expose the lack of systematic generalization ability in standard sequence-to-sequence models.
We analyze this behavior of seq2seq models and identify two contributing factors: a lack of mutual exclusivity bias and the tendency to memorize whole examples.
We show substantial empirical improvements using standard sequence-to-sequence models on two widely-used compositionality datasets.
arXiv Detail & Related papers (2022-11-28T17:36:41Z)
- Hierarchical Phrase-based Sequence-to-Sequence Learning [94.10257313923478]
We describe a neural transducer that maintains the flexibility of standard sequence-to-sequence (seq2seq) models while incorporating hierarchical phrases as a source of inductive bias during training and as explicit constraints during inference.
Our approach trains two models: a discriminative parser based on a bracketing grammar whose derivation tree hierarchically aligns source and target phrases, and a neural seq2seq model that learns to translate the aligned phrases one by one.
arXiv Detail & Related papers (2022-11-15T05:22:40Z)
- Conditional set generation using Seq2seq models [52.516563721766445]
Conditional set generation learns a mapping from an input sequence of tokens to a set.
Sequence-to-sequence (Seq2seq) models are a popular choice to model set generation.
We propose a novel algorithm for effectively sampling informative orders over the space of label orders.
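One concrete way to read "sampling orders over the space of label orders" is as data augmentation: each (input, label-set) pair is serialized under several sampled orderings, so the model is not tied to one arbitrary canonical order. The sketch below uses uniform sampling as a simplified stand-in; the paper's algorithm samples informative orders rather than uniform ones.

```python
import math
import random

# Sketch: serialize a label set under several sampled orderings so a
# Seq2seq model is not tied to one arbitrary order. Uniform sampling is
# a simplified stand-in for the paper's informative-order sampling.

def set_to_sequences(labels, num_orders=3, seed=0):
    """Return up to num_orders distinct serializations of a label set."""
    rng = random.Random(seed)
    labels = list(labels)
    target = min(num_orders, math.factorial(len(labels)))
    orders = set()
    while len(orders) < target:
        perm = labels[:]
        rng.shuffle(perm)
        orders.add(tuple(perm))
    return [" ; ".join(order) for order in orders]

print(set_to_sequences(["PER", "LOC", "ORG"]))
# e.g. ['LOC ; ORG ; PER', 'ORG ; PER ; LOC', 'PER ; LOC ; ORG']
```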
arXiv Detail & Related papers (2022-05-25T04:17:50Z)
- Joint Entity and Relation Extraction with Set Prediction Networks [24.01964730210045]
We treat joint entity and relation extraction as a direct set prediction problem.
Unlike autoregressive approaches that generate triples one by one in a certain order, the proposed networks directly output the final set of triples in one shot.
Experiments on two benchmark datasets show that our proposed model significantly outperforms current state-of-the-art methods.
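Emitting the whole set in one shot requires an order-invariant training signal; set prediction of this kind is typically trained with a bipartite-matching loss between predicted and gold triples. The sketch below shows that matching step with scipy's Hungarian solver, using a toy 0/1 cost in place of the model's real per-pair loss.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Sketch of an order-invariant set loss: optimally match predicted
# triples to gold triples (Hungarian algorithm), then sum the matched
# costs. The 0/1 cost is a toy stand-in for per-pair model losses.

def set_loss(predicted, gold):
    cost = np.array([[0.0 if p == g else 1.0 for g in gold] for p in predicted])
    rows, cols = linear_sum_assignment(cost)  # optimal bipartite matching
    return cost[rows, cols].sum()

pred = [("Obama", "born_in", "Hawaii"), ("Obama", "president_of", "USA")]
gold = [("Obama", "president_of", "USA"), ("Obama", "born_in", "Hawaii")]
print(set_loss(pred, gold))  # 0.0 -- the loss ignores prediction order
```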
arXiv Detail & Related papers (2020-11-03T13:04:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.