A Positive-Unlabeled Metric Learning Framework for Document-Level
Relation Extraction with Incomplete Labeling
- URL: http://arxiv.org/abs/2306.14806v2
- Date: Thu, 25 Jan 2024 10:26:14 GMT
- Title: A Positive-Unlabeled Metric Learning Framework for Document-Level
Relation Extraction with Incomplete Labeling
- Authors: Ye Wang, Huazheng Pan, Tao Zhang, Wen Wu, Wenxin Hu
- Abstract summary: The goal of document-level relation extraction (RE) is to identify relations between entities that span multiple sentences.
We propose a positive-augmentation and positive-mixup positive-unlabeled metric learning framework (P3M).
P3M improves the F1 score by approximately 4-10 points in document-level RE with incomplete labeling.
- Score: 6.545730317972688
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The goal of document-level relation extraction (RE) is to identify relations
between entities that span multiple sentences. Recently, incomplete labeling in
document-level RE has received increasing attention, and some studies have used
methods such as positive-unlabeled learning to tackle this issue, but there is
still a lot of room for improvement. Motivated by this, we propose a
positive-augmentation and positive-mixup positive-unlabeled metric learning
framework (P3M). Specifically, we formulate document-level RE as a metric
learning problem. We aim to reduce the distance between each entity pair
embedding and its corresponding relation embedding, while pushing the entity
pair embedding farther away from the none-class relation embedding. Additionally,
we adapt positive-unlabeled learning to this loss objective. In order to improve the
generalizability of the model, we use dropout to augment positive samples and
propose a positive-none-class mixup method. Extensive experiments show that P3M
improves the F1 score by approximately 4-10 points in document-level RE with
incomplete labeling, and achieves state-of-the-art results in fully labeled
scenarios. Furthermore, P3M has also demonstrated robustness to prior
estimation bias in incompletely labeled scenarios.
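The abstract names three ingredients: a distance-based objective with a none-class embedding, a positive-unlabeled (PU) correction, and a positive-none-class mixup. The sketch below illustrates each in its generic textbook form, using only the Python standard library. It is not the P3M implementation: the hinge form of the loss, the Euclidean distance, the `margin` parameter, the non-negative PU risk estimator, and all function names are assumptions chosen for illustration, since the abstract does not specify them.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def margin_metric_loss(pair_emb, relation_emb, none_emb, margin=1.0):
    """Hinge-style metric loss (assumed form, not taken from the paper).

    The loss is zero once the entity-pair embedding is closer to its
    relation embedding than to the none-class embedding by `margin`.
    """
    d_rel = euclidean(pair_emb, relation_emb)
    d_none = euclidean(pair_emb, none_emb)
    return max(0.0, margin + d_rel - d_none)


def mixup(pos_emb, none_emb, lam):
    """Generic mixup: convex combination of two embeddings, lam in [0, 1].

    How P3M chooses the mixing weight and mixes the targets is not stated
    in the abstract; this only shows the interpolation itself.
    """
    return [lam * p + (1.0 - lam) * n for p, n in zip(pos_emb, none_emb)]


def nn_pu_risk(loss_pos_as_pos, loss_pos_as_neg, loss_unl_as_neg, prior):
    """Non-negative PU risk estimator (a standard PU technique, assumed here).

    prior            : assumed class prior of positives among unlabeled data
    loss_pos_as_pos  : losses of labeled positives treated as positive
    loss_pos_as_neg  : losses of labeled positives treated as negative
    loss_unl_as_neg  : losses of unlabeled samples treated as negative
    """
    def mean(xs):
        return sum(xs) / len(xs)

    # Estimated negative-class risk; clamped at zero so it cannot go negative.
    neg_risk = mean(loss_unl_as_neg) - prior * mean(loss_pos_as_neg)
    return prior * mean(loss_pos_as_pos) + max(0.0, neg_risk)


if __name__ == "__main__":
    pair, relation, none_class = [0.0, 0.0], [0.0, 1.0], [3.0, 4.0]
    print(margin_metric_loss(pair, relation, none_class))
    print(mixup([1.0, 0.0], [0.0, 1.0], lam=0.25))
    print(nn_pu_risk([1.0, 1.0], [0.5, 0.5], [0.2, 0.2], prior=0.1))
```

In this sketch, `prior` is the quantity whose misestimation the abstract refers to as prior estimation bias: an over- or under-estimated value shifts the weight between the positive term and the corrected unlabeled term.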
Related papers
- Cobra Effect in Reference-Free Image Captioning Metrics [58.438648377314436]
A proliferation of reference-free methods, leveraging visual-language pre-trained models (VLMs), has emerged.
In this paper, we study if there are any deficiencies in reference-free metrics.
We employ GPT-4V as an evaluative tool to assess generated sentences and the result reveals that our approach achieves state-of-the-art (SOTA) performance.
arXiv Detail & Related papers (2023-10-23T05:43:35Z)
- JointMatch: A Unified Approach for Diverse and Collaborative Pseudo-Labeling to Semi-Supervised Text Classification [65.268245109828]
Semi-supervised text classification (SSTC) has gained increasing attention due to its ability to leverage unlabeled data.
Existing approaches based on pseudo-labeling suffer from the issues of pseudo-label bias and error accumulation.
We propose JointMatch, a holistic approach for SSTC that addresses these challenges by unifying ideas from recent semi-supervised learning.
arXiv Detail & Related papers (2023-10-23T05:43:35Z)
- Q-REG: End-to-End Trainable Point Cloud Registration with Surface Curvature [81.25511385257344]
We present a novel solution, Q-REG, which utilizes rich geometric information to estimate the rigid pose from a single correspondence.
Q-REG allows the robust estimation to be formalized as an exhaustive search, hence enabling end-to-end training.
We demonstrate in the experiments that Q-REG is agnostic to the correspondence matching method and provides consistent improvement both when used only in inference and in end-to-end training.
arXiv Detail & Related papers (2023-09-27T20:58:53Z)
- Class-Adaptive Self-Training for Relation Extraction with Incompletely Annotated Training Data [43.46328487543664]
Relation extraction (RE) aims to extract relations from sentences and documents.
Recent studies showed that many RE datasets are incompletely annotated.
This is known as the false negative problem, in which valid relations are falsely annotated as 'no_relation'.
arXiv Detail & Related papers (2023-06-16T09:01:45Z)
- No Pairs Left Behind: Improving Metric Learning with Regularized Triplet Objective [19.32706951298244]
We propose a novel formulation of the triplet objective function that improves metric learning without additional sample mining or overhead costs.
We show that our method (called No Pairs Left Behind [NPLB]) improves upon the traditional and current state-of-the-art triplet objective formulations.
arXiv Detail & Related papers (2022-10-18T00:56:01Z)
- A Unified Positive-Unlabeled Learning Framework for Document-Level Relation Extraction with Different Levels of Labeling [5.367772036988716]
Document-level relation extraction (RE) aims to identify relations between entities across multiple sentences.
We propose a unified positive-unlabeled learning framework - shift and squared ranking loss.
Our method achieves an improvement of about 14 F1 points relative to the previous baseline with incomplete labeling.
arXiv Detail & Related papers (2022-10-17T02:54:49Z)
- A Theory-Driven Self-Labeling Refinement Method for Contrastive Representation Learning [111.05365744744437]
Unsupervised contrastive learning labels crops of the same image as positives, and other image crops as negatives.
In this work, we first prove that for contrastive learning, inaccurate label assignment heavily impairs its generalization for semantic instance discrimination.
Inspired by this theory, we propose a novel self-labeling refinement approach for contrastive learning.
arXiv Detail & Related papers (2021-06-28T14:24:52Z)
- Dynamic Semantic Matching and Aggregation Network for Few-shot Intent Detection [69.2370349274216]
Few-shot Intent Detection is challenging due to the scarcity of available annotated utterances.
Semantic components are distilled from utterances via multi-head self-attention.
Our method provides a comprehensive matching measure to enhance representations of both labeled and unlabeled instances.
arXiv Detail & Related papers (2020-10-06T05:16:38Z)
- Addressing Class Imbalance in Scene Graph Parsing by Learning to Contrast and Score [65.18522219013786]
Scene graph parsing aims to detect objects in an image scene and recognize their relations.
Recent approaches have achieved high average scores on some popular benchmarks, but fail in detecting rare relations.
This paper introduces a novel integrated framework of classification and ranking to resolve the class imbalance problem.
arXiv Detail & Related papers (2020-09-28T13:57:59Z)
- MixPUL: Consistency-based Augmentation for Positive and Unlabeled Learning [8.7382177147041]
We propose a simple yet effective data augmentation method, coined MixPUL, based on consistency regularization.
MixPUL incorporates supervised and unsupervised consistency training to generate augmented data.
We show that MixPUL reduces the average classification error from 16.49 to 13.09 on the CIFAR-10 dataset across different amounts of positive data.
arXiv Detail & Related papers (2020-04-20T15:43:33Z)