Active Testing: An Unbiased Evaluation Method for Distantly Supervised
Relation Extraction
- URL: http://arxiv.org/abs/2010.08777v1
- Date: Sat, 17 Oct 2020 12:29:09 GMT
- Title: Active Testing: An Unbiased Evaluation Method for Distantly Supervised
Relation Extraction
- Authors: Pengshuai Li, Xinsong Zhang, Weijia Jia and Wei Zhao
- Abstract summary: We propose a novel evaluation method named active testing by utilizing both the noisy test set and a few manual annotations.
Experiments on a widely used benchmark show that our proposed approach can yield approximately unbiased evaluations for distantly supervised relation extractors.
- Score: 23.262284507381757
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Distant supervision has been a widely used method for neural relation
extraction for its convenience of automatically labeling datasets. However,
existing works on distantly supervised relation extraction suffer from the low
quality of the test set, which leads to considerably biased performance evaluation.
These biases not only result in unfair evaluations but also mislead the
optimization of neural relation extraction. To mitigate this problem, we
propose a novel evaluation method named active testing by utilizing both
the noisy test set and a few manual annotations. Experiments on a widely used
benchmark show that our proposed approach can yield approximately unbiased
evaluations for distantly supervised relation extractors.
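The abstract does not spell out the estimator itself; one simple way to realize the idea of combining a large noisy test set with a few manual annotations into an approximately unbiased estimate is a difference (control-variate) estimator. The sketch below is illustrative rather than the paper's actual method, and all names are hypothetical.

```python
def difference_estimator(model_preds, noisy_labels, clean_labels, annotated_idx):
    """Unbiased accuracy estimate combining a large noisy test set
    with a few manually annotated (clean) examples:
      acc ~ agreement with noisy labels over the full set
            + mean (clean - noisy) agreement over the annotated subset."""
    n = len(model_preds)
    # Agreement with the cheap, noisy labels over the whole test set.
    noisy_full = sum(p == y for p, y in zip(model_preds, noisy_labels)) / n
    # Bias correction estimated from the small annotated subset.
    m = len(annotated_idx)
    correction = sum(
        (model_preds[i] == clean_labels[i]) - (model_preds[i] == noisy_labels[i])
        for i in annotated_idx
    ) / m
    return noisy_full + correction
```

The full-set term is cheap but biased by label noise; the correction term, computed only on the annotated subset, removes that bias in expectation while the full-set term keeps the variance low.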
Related papers
- Rethinking Relation Extraction: Beyond Shortcuts to Generalization with a Debiased Benchmark [53.876493664396506]
Benchmarks are crucial for evaluating machine learning algorithm performance, facilitating comparison and identifying superior solutions.
This paper addresses the issue of entity bias in relation extraction tasks, where models tend to rely on entity mentions rather than context.
We propose a debiased relation extraction benchmark DREB that breaks the pseudo-correlation between entity mentions and relation types through entity replacement.
To establish a new baseline on DREB, we introduce MixDebias, a debiasing method combining data-level and model training-level techniques.
arXiv Detail & Related papers (2025-01-02T17:01:06Z)
- Post Launch Evaluation of Policies in a High-Dimensional Setting [4.710921988115686]
A/B tests, also known as randomized controlled experiments (RCTs), are the gold standard for evaluating the impact of new policies, products, or decisions.
This paper explores practical considerations in applying methodologies inspired by "synthetic control".
Synthetic control methods leverage data from unaffected units to estimate counterfactual outcomes for treated units.
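As a rough sketch of the counterfactual idea described above: fit weights so that unaffected (donor) units reproduce the treated unit's pre-treatment trajectory, then carry those weights into the post-treatment period. Unconstrained least squares is a simplification for illustration; the standard method constrains the weights to a simplex.

```python
import numpy as np

def synthetic_control(pre_treated, pre_donors, post_donors):
    """Estimate the treated unit's counterfactual post-treatment outcomes.
    pre_treated: (T_pre,) outcomes of the treated unit before treatment.
    pre_donors:  (T_pre, J) outcomes of J unaffected donor units.
    post_donors: (T_post, J) donor outcomes after treatment."""
    # Weights that best reproduce the treated unit's pre-period trajectory.
    w, *_ = np.linalg.lstsq(pre_donors, pre_treated, rcond=None)
    # Project those weights forward as the counterfactual.
    return post_donors @ w
```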
arXiv Detail & Related papers (2024-12-30T19:35:29Z)
- Adaptive Experimentation When You Can't Experiment [55.86593195947978]
This paper introduces the confounded pure exploration transductive linear bandit (CPET-LB) problem.
Online services can employ a properly randomized encouragement that incentivizes users toward a specific treatment.
arXiv Detail & Related papers (2024-06-15T20:54:48Z)
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- AMRFact: Enhancing Summarization Factuality Evaluation with AMR-Driven Negative Samples Generation [57.8363998797433]
We propose AMRFact, a framework that generates perturbed summaries using Abstract Meaning Representations (AMRs)
Our approach parses factually consistent summaries into AMR graphs and injects controlled factual inconsistencies to create negative examples, allowing for coherent factually inconsistent summaries to be generated with high error-type coverage.
arXiv Detail & Related papers (2023-11-16T02:56:29Z)
- Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning [22.410220040736235]
We present a theoretically optimal solution for addressing both coreset selection and active learning.
Our proposed method, COPS, is designed to minimize the expected loss of a model trained on subsampled data.
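The abstract does not detail how COPS scores samples; a common, simple uncertainty-based stand-in for coreset selection and active learning is to pick the samples whose predicted class distributions have the highest entropy. The function below is a hypothetical illustration, not the paper's algorithm.

```python
import math

def entropy_select(probs, k):
    """Return the indices of the k samples with the most uncertain
    (highest-entropy) predicted class distributions.
    probs: list of per-sample class-probability lists."""
    def entropy(p):
        # Shannon entropy; zero-probability classes contribute nothing.
        return -sum(q * math.log(q) for q in p if q > 0)

    ranked = sorted(range(len(probs)), key=lambda i: entropy(probs[i]),
                    reverse=True)
    return ranked[:k]
```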
arXiv Detail & Related papers (2023-09-05T14:06:33Z)
- On the Universal Adversarial Perturbations for Efficient Data-free Adversarial Detection [55.73320979733527]
We propose a data-agnostic adversarial detection framework, which induces different responses between normal and adversarial samples to UAPs.
Experimental results show that our method achieves competitive detection performance on various text classification tasks.
arXiv Detail & Related papers (2023-06-27T02:54:07Z)
- Automatic Debiased Learning from Positive, Unlabeled, and Exposure Data [11.217084610985674]
We address the issue of binary classification from positive and unlabeled data (PU classification) with a selection bias in the positive data.
This scenario represents a conceptual framework for many practical applications, such as recommender systems.
We propose a method to identify the function of interest using a strong ignorability assumption and develop an "Automatic Debiased PUE" (ADPUE) learning method.
arXiv Detail & Related papers (2023-03-08T18:45:22Z)
- Temporal Output Discrepancy for Loss Estimation-based Active Learning [65.93767110342502]
We present a novel deep active learning approach that queries the oracle for data annotation when the unlabeled sample is believed to incorporate high loss.
Our approach achieves superior performance to state-of-the-art active learning methods on image classification and semantic segmentation tasks.
arXiv Detail & Related papers (2022-12-20T19:29:37Z)
- Unsupervised Learning of Debiased Representations with Pseudo-Attributes [85.5691102676175]
We propose a simple but effective debiasing technique in an unsupervised manner.
We perform clustering on the feature embedding space and identify pseudo-attributes by taking advantage of the clustering results.
We then employ a novel cluster-based reweighting scheme for learning debiased representation.
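The cluster-based reweighting step can be sketched as follows: after clustering the embeddings, each sample is weighted inversely to the size of its cluster (pseudo-attribute), so small, potentially bias-conflicting clusters contribute as much as large ones. This is an illustrative simplification of the scheme the abstract names.

```python
from collections import Counter

def cluster_reweights(cluster_ids):
    """Per-sample weights inversely proportional to pseudo-attribute
    (cluster) size; the weights sum to 1 and give each cluster equal
    total mass regardless of how many samples it contains."""
    counts = Counter(cluster_ids)
    k = len(counts)
    return [1.0 / (k * counts[c]) for c in cluster_ids]
```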
arXiv Detail & Related papers (2021-08-06T05:20:46Z)
- Revisiting the Negative Data of Distantly Supervised Relation Extraction [17.00557139562208]
Distant supervision automatically generates plenty of training samples for relation extraction.
It also incurs two major problems: noisy labels and imbalanced training data.
We propose a pipeline approach, dubbed ReRe, that performs sentence-level relation detection and then subject/object extraction.
arXiv Detail & Related papers (2021-05-21T06:44:19Z)
- Active Testing: Sample-Efficient Model Evaluation [39.200332879659456]
We introduce active testing: a new framework for sample-efficient model evaluation.
Active testing addresses the cost of labeling test data by carefully selecting which test points to label.
We show how to remove that bias while reducing the variance of the estimator.
arXiv Detail & Related papers (2021-03-09T10:20:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.