Silver Syntax Pre-training for Cross-Domain Relation Extraction
- URL: http://arxiv.org/abs/2305.11016v1
- Date: Thu, 18 May 2023 14:49:19 GMT
- Title: Silver Syntax Pre-training for Cross-Domain Relation Extraction
- Authors: Elisa Bassignana, Filip Ginter, Sampo Pyysalo, Rob van der Goot, and
Barbara Plank
- Abstract summary: Relation Extraction (RE) remains a challenging task, especially when considering realistic out-of-domain evaluations.
Obtaining high-quality (manually annotated) data is extremely expensive and cannot realistically be repeated for each new domain.
An intermediate training step on data from related tasks has been shown to be beneficial across many NLP tasks. However, this setup still requires supplementary annotated data, which is often not available.
In this paper, we investigate intermediate pre-training specifically for RE. We exploit the affinity between syntactic structure and semantic RE, and identify the syntactic relations closely related to RE by being on the shortest dependency path between two entities.
- Score: 20.603482820770356
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Relation Extraction (RE) remains a challenging task, especially when
considering realistic out-of-domain evaluations. One of the main reasons for
this is the limited training size of current RE datasets: obtaining
high-quality (manually annotated) data is extremely expensive and cannot
realistically be repeated for each new domain. An intermediate training step on
data from related tasks has been shown to be beneficial across many NLP tasks.
However, this setup still requires supplementary annotated data, which is often
not available. In this paper, we investigate intermediate pre-training
specifically for RE. We exploit the affinity between syntactic structure and
semantic RE, and identify the syntactic relations which are closely related to
RE by being on the shortest dependency path between two entities. We then take
advantage of the high accuracy of current syntactic parsers in order to
automatically obtain large amounts of low-cost pre-training data. By
pre-training our RE model on the relevant syntactic relations, we are able to
outperform the baseline in five out of six cross-domain setups, without any
additional annotated data.
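The recipe is straightforward to reproduce: run an off-the-shelf dependency parser over raw text and keep the syntactic relations that lie on the shortest dependency path (SDP) between two entity mentions. Below is a minimal sketch of that step, assuming spaCy with the en_core_web_sm model and networkx; the sentence and token indices are invented for illustration and are not from the paper.

```python
import networkx as nx
import spacy

nlp = spacy.load("en_core_web_sm")

def sdp_relations(text: str, head1: int, head2: int) -> list[str]:
    """Return the dependency labels on the shortest path between two tokens."""
    doc = nlp(text)
    # Treat the dependency tree as an undirected graph over token indices.
    graph = nx.Graph()
    for token in doc:
        graph.add_edge(token.i, token.head.i, dep=token.dep_)
    path = nx.shortest_path(graph, source=head1, target=head2)
    # Collect the dependency relation labels along the path edges.
    return [graph.edges[u, v]["dep"] for u, v in zip(path, path[1:])]

# Token 0 = "Smith", token 3 = "Acme"; the SDP labels between them
# (e.g. nsubj, prep, pobj, compound) become silver pre-training targets.
print(sdp_relations("Smith works at Acme Corp.", 0, 3))
```

Labels harvested this way are silver rather than gold: they inherit parser errors, but they can be produced in arbitrary quantity at essentially no annotation cost, which is exactly what the intermediate pre-training step needs.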
Related papers
- Understanding Synthetic Context Extension via Retrieval Heads [51.8869530817334]
We investigate fine-tuning on synthetic data for three long-context tasks that require retrieval and reasoning.
We find that models trained on synthetic data fall short of those trained on real data, but surprisingly, the mismatch can be interpreted.
Our results shed light on how to interpret synthetic data fine-tuning performance and how to approach creating better data for learning real-world capabilities over long contexts.
arXiv Detail & Related papers (2024-10-29T17:55:00Z)
- PromptORE -- A Novel Approach Towards Fully Unsupervised Relation Extraction [0.0]
Unsupervised Relation Extraction (RE) aims to identify relations between entities in text, without having access to labeled data during training.
We propose PromptORE, a "Prompt-based Open Relation Extraction" model.
We adapt the novel prompt-tuning paradigm to work in an unsupervised setting, and use it to embed sentences expressing a relation.
We show that PromptORE consistently outperforms state-of-the-art models, with a relative gain of more than 40% in B-cubed, V-measure, and ARI.
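A rough sketch of that pipeline, under assumed choices (the cloze template, bert-base-uncased, and a fixed cluster count are illustrative, not the paper's exact configuration): embed each relation instance at the [MASK] position of a prompt, then cluster the embeddings so that each cluster acts as a candidate relation type.

```python
import torch
from sklearn.cluster import KMeans
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

def relation_embedding(sentence: str, head: str, tail: str) -> torch.Tensor:
    """Embed a relation instance via the [MASK] position of a cloze prompt."""
    prompt = f"{sentence} {head} [MASK] {tail}."  # assumed template
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs, output_hidden_states=True).hidden_states[-1]
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero()
    return hidden[0, mask_pos[0, 0]]

# Cluster the [MASK] embeddings; each cluster is a candidate relation type.
examples = [("Smith works at Acme.", "Smith", "Acme"),
            ("Jones founded Initech.", "Jones", "Initech")]
X = torch.stack([relation_embedding(*e) for e in examples]).numpy()
labels = KMeans(n_clusters=2, n_init=10).fit_predict(X)
```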
arXiv Detail & Related papers (2023-03-24T12:55:35Z)
- Continual Contrastive Finetuning Improves Low-Resource Relation Extraction [34.76128090845668]
Relation extraction has been particularly challenging in low-resource scenarios and domains.
Recent literature has tackled low-resource RE with self-supervised learning.
We propose to pretrain and finetune the RE model using consistent contrastive learning objectives.
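As a generic illustration of a contrastive objective over relation representations (an InfoNCE-style loss, not necessarily the paper's exact formulation), two augmented views of the same relation mention form a positive pair and all other in-batch pairs act as negatives:

```python
import torch
import torch.nn.functional as F

def info_nce(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """z1, z2: [batch, dim] paired views; returns the contrastive loss."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.T / tau               # cosine similarities between views
    targets = torch.arange(z1.size(0))     # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

# Usage: encode two views of each relation mention with the same encoder, then
# loss = info_nce(encoder(view_a), encoder(view_b))
```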
arXiv Detail & Related papers (2022-12-21T07:30:22Z)
- Towards Realistic Low-resource Relation Extraction: A Benchmark with Empirical Baseline Study [51.33182775762785]
This paper presents an empirical study to build relation extraction systems in low-resource settings.
We investigate three schemes to evaluate the performance in low-resource settings: (i) different types of prompt-based methods with few-shot labeled data; (ii) diverse balancing methods to address the long-tailed distribution issue; and (iii) data augmentation techniques and self-training to generate more labeled in-domain data, as sketched below.
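Scheme (iii) is commonly implemented as a loop that trains on the labeled seed, pseudo-labels unlabeled in-domain text, and folds high-confidence predictions back into training. A schematic version, with placeholder model and data interfaces rather than the paper's setup:

```python
import numpy as np

def self_train(model, labeled, unlabeled, rounds=3, threshold=0.9):
    """Schematic self-training: grow the training set with confident
    pseudo-labels. `model` is any classifier with fit/predict_proba;
    `labeled` is a list of (text, label) pairs, `unlabeled` a list of texts."""
    for _ in range(rounds):
        texts, labels = zip(*labeled)
        model.fit(texts, labels)                    # train on current labels
        if not unlabeled:
            break
        probs = np.asarray(model.predict_proba(unlabeled))
        confident = probs.max(axis=1) >= threshold  # keep confident predictions
        pseudo = [(t, int(y)) for t, y, c in
                  zip(unlabeled, probs.argmax(axis=1), confident) if c]
        labeled = list(labeled) + pseudo            # grow the training set
        unlabeled = [t for t, c in zip(unlabeled, confident) if not c]
    return model
```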
arXiv Detail & Related papers (2022-10-19T15:46:37Z)
- Automatically Generating Counterfactuals for Relation Extraction [18.740447044960796]
Relation extraction (RE) is a fundamental task in natural language processing.
Current deep neural models have achieved high accuracy but are easily affected by spurious correlations.
We develop a novel approach to derive contextual counterfactuals for entities.
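One simple way to obtain entity-level counterfactuals, shown here as a toy sketch rather than the paper's actual generation method, is to swap an entity mention for another mention of the same type while keeping the context fixed, which probes whether a model relies on entity-name shortcuts:

```python
import random

# Toy same-type lexicon; a real system would draw from a large typed gazetteer.
TYPE_LEXICON = {"ORG": ["Acme Corp", "Initech"], "PER": ["Smith", "Jones"]}

def entity_counterfactual(sentence: str, mention: str, ent_type: str) -> str:
    """Replace `mention` with a random same-type entity, context unchanged."""
    candidates = [e for e in TYPE_LEXICON[ent_type] if e != mention]
    return sentence.replace(mention, random.choice(candidates))

print(entity_counterfactual("Smith works at Acme Corp.", "Acme Corp", "ORG"))
```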
arXiv Detail & Related papers (2022-02-22T04:46:10Z)
- SAIS: Supervising and Augmenting Intermediate Steps for Document-Level Relation Extraction [51.27558374091491]
We propose to explicitly teach the model to capture relevant contexts and entity types by supervising and augmenting intermediate steps (SAIS) for relation extraction.
Based on a broad spectrum of carefully designed tasks, our proposed SAIS method not only extracts relations of better quality due to more effective supervision, but also retrieves the corresponding supporting evidence more accurately.
arXiv Detail & Related papers (2021-09-24T17:37:35Z)
- Learning Relation Prototype from Unlabeled Texts for Long-tail Relation Extraction [84.64435075778988]
We propose a general approach to learn relation prototypes from unlabeled texts.
We learn relation prototypes as an implicit factor between entities.
We conduct experiments on two publicly available datasets: New York Times and Google Distant Supervision.
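The prototype idea can be pictured as follows (a minimal sketch with placeholder encoder outputs, not the paper's implicit-factor formulation): aggregate instance embeddings per relation into a prototype vector, then classify new entity pairs by nearest prototype.

```python
import torch
import torch.nn.functional as F

def build_prototypes(embeddings: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """embeddings: [n, dim]; labels: [n]; returns one prototype per relation."""
    protos = [embeddings[labels == r].mean(dim=0) for r in labels.unique()]
    return F.normalize(torch.stack(protos), dim=-1)

def nearest_prototype(query: torch.Tensor, protos: torch.Tensor) -> torch.Tensor:
    """Assign each query embedding [m, dim] to its closest relation prototype."""
    return (F.normalize(query, dim=-1) @ protos.T).argmax(dim=-1)
```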
arXiv Detail & Related papers (2020-11-27T06:21:12Z)
- Relation-Guided Representation Learning [53.60351496449232]
We propose a new representation learning method that explicitly models and leverages sample relations.
Our framework well preserves the relations between samples.
By seeking to embed samples into a subspace, we show that our method can address the large-scale and out-of-sample problems.
arXiv Detail & Related papers (2020-07-11T10:57:45Z)
- Relation of the Relations: A New Paradigm of the Relation Extraction Problem [52.21210549224131]
We propose a new paradigm of Relation Extraction (RE) that considers as a whole the predictions of all relations in the same context.
We develop a data-driven approach that does not require hand-crafted rules but learns by itself the relation of relations (RoR) using Graph Neural Networks and a relation matrix transformer.
Experiments show that our model outperforms the state-of-the-art approaches by +1.12% on the ACE05 dataset and +2.55% on SemEval 2018 Task 7.2.
arXiv Detail & Related papers (2020-06-05T22:25:27Z)