What do You Mean by Relation Extraction? A Survey on Datasets and Study on Scientific Relation Classification
- URL: http://arxiv.org/abs/2204.13516v1
- Date: Thu, 28 Apr 2022 14:07:25 GMT
- Authors: Elisa Bassignana and Barbara Plank
- Abstract summary: We present an empirical study on scientific Relation Classification across two datasets.
Despite large data overlap, our analysis reveals substantial discrepancies in annotation.
Variation within further sub-domains exists but impacts Relation Classification only to a limited degree.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Over the last five years, research on Relation Extraction (RE) witnessed
extensive progress with many new dataset releases. At the same time, setup
clarity has decreased, contributing to increased difficulty of reliable
empirical evaluation (Taillé et al., 2020). In this paper, we provide a
comprehensive survey of RE datasets, and revisit the task definition and its
adoption by the community. We find that cross-dataset and cross-domain setups
are particularly lacking. We present an empirical study on scientific Relation
Classification across two datasets. Despite large data overlap, our analysis
reveals substantial discrepancies in annotation. Annotation discrepancies
strongly impact Relation Classification performance, explaining large drops in
cross-dataset evaluations. Variation within further sub-domains exists but
impacts Relation Classification only to limited degrees. Overall, our study
calls for more rigour in reporting setups in RE and evaluation across multiple
test sets.
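The abstract's central finding, that annotation discrepancies alone can explain large cross-dataset drops in Relation Classification, can be illustrated with a minimal sketch. The label sets, entity pairs, and mapping below are hypothetical, invented purely to show the mechanism; they are not drawn from the surveyed datasets:

```python
# Minimal sketch of a cross-dataset Relation Classification (RC) evaluation.
# All labels and examples are hypothetical illustrations, not real annotations.

# Dataset A annotates a (head, tail) entity pair under one scheme...
dataset_a = [
    (("BERT", "language model"), "HYPONYM-OF"),
    (("model", "F1 score"), "EVALUATE-FOR"),
]
# ...while dataset B annotates the same overlapping text with a different scheme.
dataset_b = [
    (("BERT", "language model"), "TYPE-OF"),
    (("model", "F1 score"), "COMPARE"),
]

def evaluate(predictions, gold):
    """Accuracy of predicted relation labels against gold annotations."""
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)

# A model that perfectly learned dataset A simply reproduces A's labels.
preds = [label for _, label in dataset_a]

in_dataset = evaluate(preds, [label for _, label in dataset_a])
cross_dataset = evaluate(preds, [label for _, label in dataset_b])

print(in_dataset)     # 1.0: perfect within the training scheme
print(cross_dataset)  # 0.0: annotation discrepancies alone cause the drop
```

Even a model with perfect in-dataset accuracy scores zero cross-dataset here, not because the underlying relations differ, but because the two annotation schemes name them differently; this is the kind of discrepancy the study argues must be reported and evaluated across multiple test sets.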
Related papers
- Diversity Over Quantity: A Lesson From Few Shot Relation Classification [62.66895901654023]
We show that training on a diverse set of relations significantly enhances a model's ability to generalize to unseen relations.
We introduce REBEL-FS, a new FSRC benchmark that incorporates an order of magnitude more relation types than existing datasets.
arXiv Detail & Related papers (2024-12-06T21:41:01Z)
- Silver Syntax Pre-training for Cross-Domain Relation Extraction [20.603482820770356]
Relation Extraction (RE) remains a challenging task, especially when considering realistic out-of-domain evaluations.
Obtaining high-quality (manually annotated) data is extremely expensive and cannot realistically be repeated for each new domain.
An intermediate training step on data from related tasks has been shown to be beneficial across many NLP tasks. However, this setup still requires supplementary annotated data, which is often not available.
In this paper, we investigate intermediate pre-training specifically for RE. We exploit the affinity between syntactic structure and semantic RE, and identify the syntactic relations most closely related to RE: those on the shortest dependency path between two entities.
arXiv Detail & Related papers (2023-05-18T14:49:19Z)
- Discovering and Explaining the Non-Causality of Deep Learning in SAR ATR [20.662652637190515]
Deep learning has been widely used in SAR ATR and achieved excellent performance on the MSTAR dataset.
In this paper, we quantify the contributions of different regions to target recognition based on the Shapley value.
We explain how data bias and model bias contribute to non-causality.
arXiv Detail & Related papers (2023-04-03T00:45:11Z)
- CrossRE: A Cross-Domain Dataset for Relation Extraction [21.513743126525622]
CrossRE is a new, freely available cross-domain benchmark for Relation Extraction (RE).
We provide an empirical evaluation with a state-of-the-art model for relation classification.
The meta-data enables us to shed new light on the state-of-the-art model, and we provide a comprehensive analysis of the impact of difficult cases.
arXiv Detail & Related papers (2022-10-17T18:33:14Z)
- Assaying Out-Of-Distribution Generalization in Transfer Learning [103.57862972967273]
We take a unified view of previous work, highlighting message discrepancies that we address empirically.
We fine-tune over 31k networks, from nine different architectures in the many- and few-shot setting.
arXiv Detail & Related papers (2022-07-19T12:52:33Z)
- A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach [53.727460222955266]
Temporal Sentence Grounding in Videos (TSGV) aims to ground a natural language sentence in an untrimmed video.
Recent studies have found that current benchmark datasets may have obvious moment annotation biases.
We introduce a new evaluation metric, "dR@n,IoU@m", which discounts the basic recall scores to alleviate the inflated evaluation caused by biased datasets.
arXiv Detail & Related papers (2022-03-10T08:58:18Z)
- Re-TACRED: Addressing Shortcomings of the TACRED Dataset [5.820381428297218]
TACRED is one of the largest and most widely used sentence-level relation extraction datasets.
Proposed models that are evaluated using this dataset consistently set new state-of-the-art performance.
However, they still exhibit large error rates despite leveraging external knowledge and unsupervised pretraining on large text corpora.
arXiv Detail & Related papers (2021-04-16T22:55:11Z)
- CDEvalSumm: An Empirical Study of Cross-Dataset Evaluation for Neural Summarization Systems [121.78477833009671]
We investigate the performance of different summarization models under a cross-dataset setting.
A comprehensive study of 11 representative summarization systems on 5 datasets from different domains reveals the effects of model architecture and generation approach.
arXiv Detail & Related papers (2020-10-11T02:19:15Z)
- Let's Stop Incorrect Comparisons in End-to-end Relation Extraction! [13.207968737733196]
We first identify several patterns of invalid comparisons in published papers and describe them to avoid their propagation.
We then propose a small empirical study to quantify the impact of the most common mistake, and show that it leads to overestimating final RE performance by around 5% on ACE05.
arXiv Detail & Related papers (2020-09-22T16:59:15Z)
- Provably Efficient Causal Reinforcement Learning with Confounded Observational Data [135.64775986546505]
We study how to incorporate the dataset (observational data) collected offline, which is often abundantly available in practice, to improve the sample efficiency in the online setting.
We propose the deconfounded optimistic value iteration (DOVI) algorithm, which incorporates the confounded observational data in a provably efficient manner.
arXiv Detail & Related papers (2020-06-22T14:49:33Z)
- On Disentangled Representations Learned From Correlated Data [59.41587388303554]
We bridge the gap to real-world scenarios by analyzing the behavior of the most prominent disentanglement approaches on correlated data.
We show that systematically induced correlations in the dataset are being learned and reflected in the latent representations.
We also demonstrate how to resolve these latent correlations, either using weak supervision during training or by post-hoc correcting a pre-trained model with a small number of labels.
arXiv Detail & Related papers (2020-06-14T12:47:34Z)
- Novel Human-Object Interaction Detection via Adversarial Domain Generalization [103.55143362926388]
We study the problem of novel human-object interaction (HOI) detection, aiming at improving the generalization ability of the model to unseen scenarios.
The challenge mainly stems from the large compositional space of objects and predicates, which leads to the lack of sufficient training data for all the object-predicate combinations.
We propose a unified framework of adversarial domain generalization to learn object-invariant features for predicate prediction.
arXiv Detail & Related papers (2020-05-22T22:02:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.