A Unified Positive-Unlabeled Learning Framework for Document-Level
Relation Extraction with Different Levels of Labeling
- URL: http://arxiv.org/abs/2210.08709v1
- Date: Mon, 17 Oct 2022 02:54:49 GMT
- Authors: Ye Wang, Xinxin Liu, Wenxin Hu, Tao Zhang
- Abstract summary: Document-level relation extraction (RE) aims to identify relations between entities across multiple sentences.
We propose a unified positive-unlabeled learning framework: shift and squared ranking loss positive-unlabeled (SSR-PU) learning.
Our method achieves an improvement of about 14 F1 points relative to the previous baseline with incomplete labeling.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Document-level relation extraction (RE) aims to identify relations between
entities across multiple sentences. Most previous methods focused on
document-level RE under full supervision. However, in real-world scenarios, it
is expensive and difficult to completely label all relations in a document
because the number of entity pairs in document-level RE grows quadratically
with the number of entities. To solve the common incomplete labeling problem,
we propose a unified positive-unlabeled learning framework - shift and squared
ranking loss positive-unlabeled (SSR-PU) learning. We use positive-unlabeled
(PU) learning on document-level RE for the first time. Considering that the
labeled data of a dataset may cause a prior shift in the unlabeled data, we
introduce PU learning under prior shift of the training data. Also, using the
none-class score as an adaptive threshold, we propose a squared ranking loss
and prove its Bayesian consistency with multi-label ranking metrics. Extensive
experiments demonstrate
that our method achieves an improvement of about 14 F1 points relative to the
previous baseline with incomplete labeling. In addition, it outperforms
previous state-of-the-art results under both fully supervised and extremely
unlabeled settings as well.
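As a rough illustration of the two ingredients named in the abstract, the sketch below shows a non-negative PU risk estimator with a prior-shift correction and a squared ranking loss that uses the none-class score as an adaptive threshold. It is a minimal reading of the abstract with hypothetical function names and a sigmoid surrogate loss, not the authors' implementation.

```python
import numpy as np

def sigmoid_loss(z):
    """Surrogate loss l(z) = sigmoid(-z): small when the score z is large."""
    return 1.0 / (1.0 + np.exp(z))

def nn_pu_risk_prior_shift(scores_pos, scores_unl, pi, pi_u):
    """Non-negative PU risk with a prior-shift correction (sketch).

    pi   : class prior of positives under the true data distribution.
    pi_u : prior of positives within the unlabeled set; once some
           positives have been labeled, pi_u < pi (the "prior shift").
    """
    r_pos = sigmoid_loss(scores_pos).mean()          # positives scored as positive
    r_pos_as_neg = sigmoid_loss(-scores_pos).mean()  # positives scored as negative
    r_unl_as_neg = sigmoid_loss(-scores_unl).mean()  # unlabeled scored as negative
    # Recover the negative-class risk from unlabeled data under the shifted
    # prior, clipping at zero as in non-negative PU learning.
    r_neg = max(0.0, (r_unl_as_neg - pi_u * r_pos_as_neg) / (1.0 - pi_u))
    return pi * r_pos + (1.0 - pi) * r_neg

def squared_ranking_loss(scores, pos_mask, none_idx=0, margin=1.0):
    """Squared ranking loss with the none-class score as adaptive threshold.

    Positive relations are pushed above the none-class score by `margin`;
    all other relations are pushed below it (hinge terms are squared).
    """
    s_none = scores[none_idx]
    loss = 0.0
    for i, s in enumerate(scores):
        if i == none_idx:
            continue
        if pos_mask[i]:
            loss += max(0.0, margin + s_none - s) ** 2
        else:
            loss += max(0.0, margin + s - s_none) ** 2
    return loss
```

With `pi_u = pi` the estimator reduces to standard non-negative PU learning; the clipping keeps the empirical negative risk from going negative when the unlabeled sample happens to look strongly negative.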
Related papers
- Generating Unbiased Pseudo-labels via a Theoretically Guaranteed Chebyshev Constraint to Unify Semi-supervised Classification and Regression
The threshold-to-pseudo-label (T2L) process in classification uses confidence to determine label quality.
Regression likewise requires unbiased methods to generate high-quality labels.
We propose a theoretically guaranteed constraint for generating unbiased labels based on Chebyshev's inequality.
arXiv Detail & Related papers (2023-11-03T08:39:35Z)
- JointMatch: A Unified Approach for Diverse and Collaborative Pseudo-Labeling to Semi-Supervised Text Classification
Semi-supervised text classification (SSTC) has gained increasing attention due to its ability to leverage unlabeled data.
Existing approaches based on pseudo-labeling suffer from the issues of pseudo-label bias and error accumulation.
We propose JointMatch, a holistic approach for SSTC that addresses these challenges by unifying ideas from recent semi-supervised learning.
arXiv Detail & Related papers (2023-10-23T05:43:35Z)
- Drawing the Same Bounding Box Twice? Coping Noisy Annotations in Object Detection with Repeated Labels
We propose a novel localization algorithm that adapts well-established ground truth estimation methods.
Our algorithm also shows superior performance during training on the TexBiG dataset.
arXiv Detail & Related papers (2023-09-18T13:08:44Z)
- A Positive-Unlabeled Metric Learning Framework for Document-Level Relation Extraction with Incomplete Labeling
The goal of document-level relation extraction (RE) is to identify relations between entities that span multiple sentences.
We propose a positive-augmentation and positive-mixup positive-unlabeled metric learning framework (P3M).
P3M improves the F1 score by approximately 4-10 points in document-level RE with incomplete labeling.
arXiv Detail & Related papers (2023-06-26T16:05:59Z)
- Exploring Structured Semantic Prior for Multi Label Recognition with Incomplete Labels
Multi-label recognition (MLR) with incomplete labels is very challenging.
Recent works strive to explore the image-to-label correspondence in vision-language models, i.e., CLIP, to compensate for insufficient annotations.
We advocate remedying the deficiency of label supervision for the MLR with incomplete labels by deriving a structured semantic prior.
arXiv Detail & Related papers (2023-03-23T12:39:20Z)
- GaussianMLR: Learning Implicit Class Significance via Calibrated Multi-Label Ranking
We propose a novel multi-label ranking method: GaussianMLR.
It aims to learn implicit class significance values that determine the positive label ranks.
We show that our method is able to accurately learn a representation of the incorporated positive rank order.
arXiv Detail & Related papers (2023-03-07T14:09:08Z)
- None Class Ranking Loss for Document-Level Relation Extraction
Document-level relation extraction (RE) aims at extracting relations among entities expressed across multiple sentences.
In a typical document, most entity pairs do not express any pre-defined relation and are labeled as "none" or "no relation".
arXiv Detail & Related papers (2022-05-01T14:24:37Z)
- Disentangling Sampling and Labeling Bias for Learning in Large-Output Spaces
We show that different negative sampling schemes implicitly trade-off performance on dominant versus rare labels.
We provide a unified means to explicitly tackle both sampling bias, arising from working with a subset of all labels, and labeling bias, which is inherent to the data due to label imbalance.
arXiv Detail & Related papers (2021-05-12T15:40:13Z)
- Pointwise Binary Classification with Pairwise Confidence Comparisons
We propose pairwise comparison (Pcomp) classification, where we have only pairs of unlabeled data for which we know one example is more likely to be positive than the other.
We link Pcomp classification to noisy-label learning to develop a progressive unbiased risk estimator (URE) and improve it by imposing consistency regularization.
arXiv Detail & Related papers (2020-10-05T09:23:58Z)
- Structured Prediction with Partial Labelling through the Infimum Loss
The goal of weak supervision is to enable models to learn using only forms of labelling which are cheaper to collect.
This is a type of incomplete annotation where, for each datapoint, supervision is cast as a set of labels containing the real one.
This paper provides a unified framework based on structured prediction and on the concept of infimum loss to deal with partial labelling.
arXiv Detail & Related papers (2020-03-02T13:59:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed here and is not responsible for any consequences of its use.