Joint Entity and Relation Extraction with Set Prediction Networks
- URL: http://arxiv.org/abs/2011.01675v2
- Date: Thu, 5 Nov 2020 12:47:21 GMT
- Title: Joint Entity and Relation Extraction with Set Prediction Networks
- Authors: Dianbo Sui, Yubo Chen, Kang Liu, Jun Zhao, Xiangrong Zeng, Shengping
Liu
- Abstract summary: We treat joint entity and relation extraction as a direct set prediction problem.
Unlike autoregressive approaches that generate triples one by one in a certain order, the proposed networks directly output the final set of triples in one shot.
Experiments on two benchmark datasets show that our proposed model significantly outperforms current state-of-the-art methods.
- Score: 24.01964730210045
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The joint entity and relation extraction task aims to extract all relational
triples from a sentence. In essence, the relational triples contained in a
sentence are unordered. However, previous seq2seq-based models require
converting the set of triples into a sequence in the training phase. To break this
bottleneck, we treat joint entity and relation extraction as a direct set
prediction problem, so that the extraction model can get rid of the burden of
predicting the order of multiple triples. To solve this set prediction problem,
we propose networks featured by transformers with non-autoregressive parallel
decoding. Unlike autoregressive approaches that generate triples one by one in
a certain order, the proposed networks directly output the final set of triples
in one shot. Furthermore, we also design a set-based loss that forces unique
predictions via bipartite matching. Compared with cross-entropy loss that
highly penalizes small shifts in triple order, the proposed bipartite matching
loss is invariant to any permutation of predictions; thus, it can provide the
proposed networks with a more accurate training signal by ignoring triple order
and focusing on relation types and entities. Experiments on two benchmark
datasets show that our proposed model significantly outperforms current
state-of-the-art methods. Training code and trained models will be available at
http://github.com/DianboWork/SPN4RE.
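The permutation-invariant, set-based loss described in the abstract can be sketched with a Hungarian (bipartite) matching step. The sketch below is illustrative only, not the authors' implementation: the cost values are made up, and `bipartite_matching_loss` is a hypothetical name. It shows the key property the abstract claims, that the loss is unchanged by any permutation of the predictions:

```python
# Minimal sketch of a set-based loss via bipartite matching.
# Assumes a precomputed cost matrix; in the paper the cost would combine
# relation-type and entity-span terms per (prediction, gold-triple) pair.
import numpy as np
from scipy.optimize import linear_sum_assignment


def bipartite_matching_loss(cost_matrix):
    """cost_matrix[i][j] = cost of matching prediction i to gold triple j.

    Returns the total cost of the minimum-cost one-to-one assignment,
    which forces each gold triple to be claimed by a unique prediction.
    """
    rows, cols = linear_sum_assignment(cost_matrix)
    return cost_matrix[rows, cols].sum()


# Two predictions, two gold triples; illustrative costs.
costs = np.array([[0.1, 0.9],
                  [0.8, 0.2]])

# Reversing the prediction order permutes the rows but leaves the
# matched loss unchanged -- unlike a cross-entropy loss over a sequence.
permuted = costs[::-1]
assert np.isclose(bipartite_matching_loss(costs),
                  bipartite_matching_loss(permuted))
```

By contrast, a sequence loss would compare prediction `i` to gold triple `i` position by position, so swapping two correct triples would be heavily penalized; the matching step removes that order dependence.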
Related papers
- Structured Radial Basis Function Network: Modelling Diversity for
Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important when forecasting nonstationary processes or data drawn from a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate this tessellation and approximate the multiple hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z)
- JANA: Jointly Amortized Neural Approximation of Complex Bayesian Models [0.5872014229110214]
We propose "jointly amortized neural approximation" (JANA) of intractable likelihood functions and posterior densities.
We benchmark the fidelity of JANA on a variety of simulation models against state-of-the-art Bayesian methods.
arXiv Detail & Related papers (2023-02-17T20:17:21Z)
- Mutual Exclusivity Training and Primitive Augmentation to Induce Compositionality [84.94877848357896]
Recent datasets expose the lack of systematic generalization ability in standard sequence-to-sequence models.
We analyze this behavior of seq2seq models and identify two contributing factors: a lack of mutual exclusivity bias and the tendency to memorize whole examples.
We show substantial empirical improvements using standard sequence-to-sequence models on two widely-used compositionality datasets.
arXiv Detail & Related papers (2022-11-28T17:36:41Z)
- DenseHybrid: Hybrid Anomaly Detection for Dense Open-set Recognition [1.278093617645299]
Anomaly detection can be conceived either through generative modelling of regular training data or by discriminating with respect to negative training data.
This paper presents a novel hybrid anomaly score which allows dense open-set recognition on large natural images.
Experiments evaluate our contributions on standard dense anomaly detection benchmarks as well as in terms of open-mIoU - a novel metric for dense open-set performance.
arXiv Detail & Related papers (2022-07-06T11:48:50Z)
- Deep Probabilistic Graph Matching [72.6690550634166]
We propose a deep learning-based graph matching framework that works for the original QAP without compromising on the matching constraints.
The proposed method is evaluated on three popular benchmarks (Pascal VOC, Willow Object and SPair-71k) and outperforms all previous state-of-the-art methods on all of them.
arXiv Detail & Related papers (2022-01-05T13:37:27Z)
- TDRE: A Tensor Decomposition Based Approach for Relation Extraction [6.726803950083593]
Extracting entity pairs along with relation types from unstructured texts is a fundamental subtask of information extraction.
In this paper, we first model the final triplet extraction result as a three-order tensor of word-to-word pairs enriched with each relation type.
The proposed method outperforms existing strong baselines.
arXiv Detail & Related papers (2020-10-15T05:29:34Z)
- Minimize Exposure Bias of Seq2Seq Models in Joint Entity and Relation Extraction [57.22929457171352]
Joint entity and relation extraction aims to extract relation triplets from plain text directly.
We propose a novel Sequence-to-Unordered-Multi-Tree (Seq2UMTree) model to minimize the effects of exposure bias.
arXiv Detail & Related papers (2020-09-16T06:53:34Z)
- Contrastive Triple Extraction with Generative Transformer [72.21467482853232]
We introduce a novel model, contrastive triple extraction with a generative transformer.
Specifically, we introduce a single shared transformer module for encoder-decoder-based generation.
To generate faithful results, we propose a novel triplet contrastive training objective.
arXiv Detail & Related papers (2020-09-14T05:29:24Z)
- Calibrated Adversarial Refinement for Stochastic Semantic Segmentation [5.849736173068868]
We present a strategy for learning a calibrated predictive distribution over semantic maps, where the probability associated with each prediction reflects its ground truth correctness likelihood.
We demonstrate the versatility and robustness of the approach by achieving state-of-the-art results on the multigrader LIDC dataset and on a modified Cityscapes dataset with injected ambiguities.
We show that the core design can be adapted to other tasks requiring learning a calibrated predictive distribution by experimenting on a toy regression dataset.
arXiv Detail & Related papers (2020-06-23T16:39:59Z)
- Learn to Predict Sets Using Feed-Forward Neural Networks [63.91494644881925]
This paper addresses the task of set prediction using deep feed-forward neural networks.
We present a novel approach for learning to predict sets with unknown permutation and cardinality.
We demonstrate the validity of our set formulations on relevant vision problems.
arXiv Detail & Related papers (2020-01-30T01:52:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.