Hybrid Attention-Based Transformer Block Model for Distant Supervision Relation Extraction
- URL: http://arxiv.org/abs/2003.11518v2
- Date: Thu, 26 Mar 2020 09:04:45 GMT
- Title: Hybrid Attention-Based Transformer Block Model for Distant Supervision Relation Extraction
- Authors: Yan Xiao, Yaochu Jin, Ran Cheng, Kuangrong Hao
- Abstract summary: We propose a new framework using hybrid attention-based Transformer block with multi-instance learning to perform the DSRE task.
The proposed approach can outperform the state-of-the-art algorithms on the evaluation dataset.
- Score: 20.644215991166902
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the explosive growth of digital text information, it is
challenging to efficiently obtain specific knowledge from massive amounts of
unstructured text. As a basic task in natural language processing (NLP),
relation extraction aims to extract the semantic relation between an entity
pair based on the given text. To avoid manually labeling datasets, distant
supervision relation extraction (DSRE) has been widely used; it leverages a
knowledge base to annotate datasets automatically. Unfortunately, this method
suffers heavily from wrong labeling because of its underlying strong
assumptions. To address this issue, we propose a new framework that combines a
hybrid attention-based Transformer block with multi-instance learning to
perform the DSRE task. More specifically, the Transformer block is first used
as the sentence encoder to capture the syntactic information of sentences,
relying mainly on multi-head self-attention to extract word-level features.
Then, a concise sentence-level attention mechanism forms the bag
representation, weighting the valid information in each sentence so that the
bag is represented effectively. Experimental results on the public New York
Times (NYT) dataset demonstrate that the proposed approach outperforms
state-of-the-art algorithms on the evaluation set, verifying the effectiveness
of our model for the DSRE task.
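The two-level design described in the abstract, a word-level Transformer-block encoder followed by sentence-level attention over each bag, can be summarized in a short sketch. The PyTorch code below is a minimal illustration under assumed choices (embedding size, mean pooling over words, and 53 relation classes as in the common NYT setup); it is not the authors' released implementation.

```python
# Minimal sketch of the two-level attention pipeline: a Transformer block
# encodes each sentence at the word level via multi-head self-attention,
# and a sentence-level attention layer aggregates one bag of sentences.
# Dimensions, pooling, and the class count are illustrative assumptions.
import torch
import torch.nn as nn


class SentenceEncoder(nn.Module):
    """Transformer block used as a word-level sentence encoder."""

    def __init__(self, d_model: int = 256, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.ReLU(),
            nn.Linear(4 * d_model, d_model),
        )
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_sentences, seq_len, d_model) word embeddings of one bag
        h, _ = self.attn(x, x, x)            # multi-head self-attention
        x = self.norm1(x + h)                # residual connection + layer norm
        x = self.norm2(x + self.ffn(x))
        return x.mean(dim=1)                 # pool words -> sentence vectors


class BagAttention(nn.Module):
    """Sentence-level attention that forms the bag representation."""

    def __init__(self, d_model: int = 256, n_relations: int = 53):
        super().__init__()
        self.query = nn.Parameter(torch.randn(d_model))
        self.classifier = nn.Linear(d_model, n_relations)

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        # s: (num_sentences, d_model) sentence vectors for one bag
        alpha = torch.softmax(s @ self.query, dim=0)   # per-sentence weights
        bag = (alpha.unsqueeze(-1) * s).sum(dim=0)     # weighted bag vector
        return self.classifier(bag)                    # relation logits


encoder, bag_attn = SentenceEncoder(), BagAttention()
logits = bag_attn(encoder(torch.randn(5, 40, 256)))    # one bag of 5 sentences
```

Under multi-instance learning, all sentences mentioning the same entity pair form one bag that shares a single distantly supervised label, so the sentence-level attention can learn to down-weight wrongly labeled sentences inside the bag.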
Related papers
- Text2Data: Low-Resource Data Generation with Textual Control [104.38011760992637]
Natural language serves as a common and straightforward control signal for humans to interact seamlessly with machines.
We propose Text2Data, a novel approach that utilizes unlabeled data to understand the underlying data distribution through an unsupervised diffusion model.
It undergoes controllable finetuning via a novel constraint optimization-based learning objective that ensures controllability and effectively counteracts catastrophic forgetting.
arXiv Detail & Related papers (2024-02-08T03:41:39Z)
- FLIP: Fine-grained Alignment between ID-based Models and Pretrained Language Models for CTR Prediction [49.510163437116645]
Click-through rate (CTR) prediction serves as a core function module in personalized online services.
Traditional ID-based models for CTR prediction take as inputs the one-hot encoded ID features of the tabular modality.
Pretrained language models (PLMs) have given rise to another paradigm, which takes as input the sentences of the textual modality.
We propose to conduct Fine-grained feature-level ALignment between ID-based Models and Pretrained Language Models (FLIP) for CTR prediction.
arXiv Detail & Related papers (2023-10-30T11:25:03Z)
- Controllable Data Augmentation for Few-Shot Text Mining with Chain-of-Thought Attribute Manipulation [35.33340453046864]
Chain-of-Thought Attribute Manipulation (CoTAM) is a novel approach that generates new data from existing examples.
We leverage chain-of-thought prompting to directly edit the text in three steps: (1) attribute decomposition, (2) manipulation proposal, and (3) sentence reconstruction.
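As an illustration only, the three-step edit can be phrased as a single chain-of-thought prompt; the wording and attribute names below are assumed paraphrases, not the paper's released prompts.

```python
# Hypothetical sketch of CoTAM-style prompt construction. The step wording
# and the example attributes are illustrative assumptions.
def build_cotam_prompt(sentence: str, attribute: str, target_value: str) -> str:
    """Compose a chain-of-thought prompt that edits one attribute of a text."""
    return "\n".join([
        f'Sentence: "{sentence}"',
        # Step 1: attribute decomposition - list the attributes of the text.
        "Step 1: What attributes (e.g., sentiment, topic, formality) does this sentence have?",
        # Step 2: manipulation proposal - plan the change to one attribute.
        f"Step 2: Propose how to change the attribute '{attribute}' to '{target_value}' while keeping all other attributes fixed.",
        # Step 3: sentence reconstruction - rewrite with only that change applied.
        "Step 3: Rewrite the sentence applying only the proposed change.",
    ])


# Example: flip the sentiment of an existing example to create new data
# with a controlled attribute change; the prompt is then sent to an LLM.
prompt = build_cotam_prompt("The movie was a delight.", "sentiment", "negative")
```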
arXiv Detail & Related papers (2023-07-14T00:10:03Z)
- Boosting Event Extraction with Denoised Structure-to-Text Augmentation [52.21703002404442]
Event extraction aims to recognize pre-defined event triggers and arguments from texts.
Recent data augmentation methods often neglect the problem of grammatical incorrectness.
We propose DAEE, a denoised structure-to-text augmentation framework for event extraction.
arXiv Detail & Related papers (2023-05-16T16:52:07Z)
- Meeting Summarization with Pre-training and Clustering Methods [6.47783315109491]
HMNet, a hierarchical network that employs both a word-level transformer and a turn-level transformer, serves as the baseline.
We extend the locate-then-summarize approach of QMSum with an intermediate clustering step (sketched below).
We compare the performance of our baseline models with BART, a state-of-the-art language model that is effective for summarization.
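A minimal sketch of such an intermediate clustering step, assuming precomputed utterance embeddings and scikit-learn's KMeans; the actual pipeline in the paper may differ.

```python
# Hypothetical locate-then-cluster-then-summarize step: group the located
# utterances into topical clusters before summarizing each cluster.
# The embedding source and cluster count are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans


def cluster_utterances(embeddings: np.ndarray, n_clusters: int = 4) -> list[list[int]]:
    """Group utterance embeddings; return utterance indices per cluster."""
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(embeddings)
    return [list(np.where(labels == k)[0]) for k in range(n_clusters)]


# Each cluster's utterances would then be concatenated and passed to the
# summarizer (e.g., BART) to produce one summary segment per topic.
clusters = cluster_utterances(np.random.rand(20, 768))
```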
arXiv Detail & Related papers (2021-11-16T03:14:40Z)
- HETFORMER: Heterogeneous Transformer with Sparse Attention for Long-Text Extractive Summarization [57.798070356553936]
HETFORMER is a Transformer-based pre-trained model with multi-granularity sparse attention for extractive summarization.
Experiments on both single- and multi-document summarization tasks show that HETFORMER achieves state-of-the-art performance in ROUGE F1.
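To illustrate the general idea of sparse attention, here is a sketch of one common pattern, a local sliding window plus a few global tokens; HETFORMER's actual multi-granularity pattern over heterogeneous node types may differ.

```python
# Illustrative sparse attention mask: local window plus global tokens.
# Shown for intuition only; not the paper's exact attention pattern.
import torch


def sparse_attention_mask(seq_len: int, window: int = 2, n_global: int = 1) -> torch.Tensor:
    """Boolean (seq_len, seq_len) mask: True where attention is allowed."""
    idx = torch.arange(seq_len)
    # Local band: each token attends to neighbors within the window.
    local = (idx[:, None] - idx[None, :]).abs() <= window
    # Global tokens attend everywhere and are attended to by everyone.
    global_rows = idx[:, None] < n_global
    global_cols = idx[None, :] < n_global
    return local | global_rows | global_cols


mask = sparse_attention_mask(8)  # cost grows ~linearly in length, not quadratically
```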
arXiv Detail & Related papers (2021-10-12T22:42:31Z)
- Improving BERT Model Using Contrastive Learning for Biomedical Relation Extraction [13.354066085659198]
Contrastive learning is not widely utilized in natural language processing due to the lack of a general method of data augmentation for text data.
In this work, we explore the method of employing contrastive learning to improve the text representation from the BERT model for relation extraction.
The experimental results on three relation extraction benchmark datasets demonstrate that our method can improve the BERT model representation and achieve state-of-the-art performance.
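A minimal sketch of a standard contrastive (InfoNCE-style) objective over two augmented views of the same sentence representations; the paper's specific augmentation method and loss details may differ.

```python
# Generic InfoNCE-style contrastive loss over paired augmented views.
# Temperature and batch pairing are illustrative assumptions.
import torch
import torch.nn.functional as F


def info_nce(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """z1, z2: (batch, dim) representations of two views of the same texts."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau              # scaled cosine similarities
    targets = torch.arange(z1.size(0))      # positives lie on the diagonal
    return F.cross_entropy(logits, targets)


loss = info_nce(torch.randn(16, 768), torch.randn(16, 768))
```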
arXiv Detail & Related papers (2021-04-28T17:50:24Z)
- SDA: Improving Text Generation with Self Data Augmentation [88.24594090105899]
We propose to improve the standard maximum likelihood estimation (MLE) paradigm by incorporating a self-imitation-learning phase for automatic data augmentation.
Unlike most existing sentence-level augmentation strategies, our method is more general and could be easily adapted to any MLE-based training procedure.
arXiv Detail & Related papers (2021-01-02T01:15:57Z)
- An Effective Contextual Language Modeling Framework for Speech Summarization with Augmented Features [13.97006782398121]
The Bidirectional Encoder Representations from Transformers (BERT) model has achieved record-breaking success on many natural language processing tasks.
We explore the incorporation of confidence scores into sentence representations to see if such an attempt could help alleviate the negative effects caused by imperfect automatic speech recognition.
We validate the effectiveness of our proposed method on a benchmark dataset.
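As a sketch of the general idea of augmenting sentence representations with ASR confidence scores; the fusion method here, simple concatenation, is an assumption, not necessarily the paper's choice.

```python
# Hypothetical sketch: append an ASR confidence feature to each sentence
# embedding so the summarizer can discount unreliable transcriptions.
import torch


def augment_with_confidence(sent_emb: torch.Tensor, conf: torch.Tensor) -> torch.Tensor:
    """sent_emb: (n_sentences, dim); conf: (n_sentences,) ASR confidences in [0, 1]."""
    return torch.cat([sent_emb, conf.unsqueeze(-1)], dim=-1)  # (n_sentences, dim + 1)


augmented = augment_with_confidence(torch.randn(10, 768), torch.rand(10))
```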
arXiv Detail & Related papers (2020-06-01T18:27:48Z)
- Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
With a strong auto-regressive decoder, VAEs tend to ignore the latent variables (posterior collapse).
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
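One common way to impose a compact, discrete latent bottleneck is vector quantization with a straight-through gradient; a minimal sketch follows, though the paper's exact latent feature matching mechanism may differ.

```python
# Generic vector-quantization bottleneck (VQ-VAE style), shown for
# intuition only; not necessarily the paper's mechanism.
import torch
import torch.nn as nn


class DiscreteBottleneck(nn.Module):
    def __init__(self, n_codes: int = 128, dim: int = 64):
        super().__init__()
        self.codebook = nn.Embedding(n_codes, dim)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (batch, dim) continuous latents from the encoder
        dists = torch.cdist(z, self.codebook.weight)   # (batch, n_codes)
        codes = self.codebook(dists.argmin(dim=1))     # nearest code vectors
        # Straight-through estimator: quantize forward, copy gradients back.
        return z + (codes - z).detach()
```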
arXiv Detail & Related papers (2020-04-22T14:41:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences.