Combating Confirmation Bias: A Unified Pseudo-Labeling Framework for
Entity Alignment
- URL: http://arxiv.org/abs/2307.02075v1
- Date: Wed, 5 Jul 2023 07:32:34 GMT
- Title: Combating Confirmation Bias: A Unified Pseudo-Labeling Framework for
Entity Alignment
- Authors: Qijie Ding, Jie Yin, Daokun Zhang and Junbin Gao
- Abstract summary: We propose a Unified Pseudo-Labeling framework for Entity Alignment (UPL-EA)
UPL-EA explicitly eliminates pseudo-labeling errors to boost the accuracy of entity alignment.
The effectiveness of UPL-EA in eliminating pseudo-labeling errors is both theoretically supported and experimentally validated.
- Score: 29.18600512063287
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Entity alignment (EA) aims at identifying equivalent entity pairs across
different knowledge graphs (KGs) that refer to the same real-world identity. To
systematically combat confirmation bias for pseudo-labeling-based entity
alignment, we propose a Unified Pseudo-Labeling framework for Entity Alignment
(UPL-EA) that explicitly eliminates pseudo-labeling errors to boost the
accuracy of entity alignment. UPL-EA consists of two complementary components:
(1) The Optimal Transport (OT)-based pseudo-labeling uses discrete OT modeling
as an effective means to enable more accurate determination of entity
correspondences across two KGs and to mitigate the adverse impact of erroneous
matches. A simple but highly effective criterion is further devised to derive
pseudo-labeled entity pairs that satisfy one-to-one correspondences at each
iteration. (2) The cross-iteration pseudo-label calibration operates across
multiple consecutive iterations to further improve the pseudo-labeling
precision rate by reducing the local pseudo-label selection variability with a
theoretical guarantee. The two components are respectively designed to
eliminate Type I and Type II pseudo-labeling errors identified through our
analysis. The calibrated pseudo-labels are thereafter used to augment prior
alignment seeds to reinforce subsequent model training for alignment inference.
The effectiveness of UPL-EA in eliminating pseudo-labeling errors is both
theoretically supported and experimentally validated. The experimental results
show that our approach achieves competitive performance with limited prior
alignment seeds.
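To make the first component concrete, below is a minimal sketch of OT-based pseudo-labeling: entropy-regularized discrete OT (Sinkhorn iterations) over an embedding-distance cost matrix, followed by a mutual-best-match filter to keep only one-to-one entity pairs. This is an illustrative reading of the abstract, not the authors' exact UPL-EA procedure; the toy embeddings, the regularization value, and the mutual-argmax criterion are assumptions for demonstration.

```python
import numpy as np

def sinkhorn(cost, reg=0.05, n_iters=200):
    """Entropy-regularized discrete OT with uniform marginals (Sinkhorn)."""
    n, m = cost.shape
    a, b = np.ones(n) / n, np.ones(m) / m   # uniform marginals
    K = np.exp(-cost / reg)                 # Gibbs kernel
    u, v = np.ones(n) / n, np.ones(m) / m
    for _ in range(n_iters):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]      # transport plan

def one_to_one_pairs(plan):
    """Keep (i, j) only if each entity is the other's best match."""
    row_best = plan.argmax(axis=1)
    col_best = plan.argmax(axis=0)
    return [(i, int(j)) for i, j in enumerate(row_best) if col_best[j] == i]

# Toy example: 3 entities per KG; KG2 embeddings are noisy copies of KG1's.
rng = np.random.default_rng(0)
emb1 = rng.normal(size=(3, 8))
emb2 = emb1 + 0.01 * rng.normal(size=(3, 8))
emb1 /= np.linalg.norm(emb1, axis=1, keepdims=True)
emb2 /= np.linalg.norm(emb2, axis=1, keepdims=True)

cost = 1.0 - emb1 @ emb2.T        # cosine distance as matching cost
plan = sinkhorn(cost)
pairs = one_to_one_pairs(plan)     # pseudo-labeled one-to-one entity pairs
```

With near-identical embeddings the plan concentrates on the diagonal and the mutual-argmax filter recovers the correct one-to-one alignment; in a real pipeline these pairs would augment the seed alignments for the next training iteration.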
Related papers
- Drawing the Same Bounding Box Twice? Coping Noisy Annotations in Object
Detection with Repeated Labels [6.872072177648135]
We propose a novel localization algorithm that adapts well-established ground truth estimation methods.
Our algorithm also shows superior performance during training on the TexBiG dataset.
arXiv Detail & Related papers (2023-09-18T13:08:44Z)
- Alternative Pseudo-Labeling for Semi-Supervised Automatic Speech
Recognition [49.42732949233184]
When labeled data is insufficient, semi-supervised learning with the pseudo-labeling technique can significantly improve the performance of automatic speech recognition.
Taking noisy labels as ground-truth in the loss function results in suboptimal performance.
We propose a novel framework named alternative pseudo-labeling to tackle the issue of noisy pseudo-labels.
arXiv Detail & Related papers (2023-08-12T12:13:52Z)
- Class-Distribution-Aware Pseudo Labeling for Semi-Supervised Multi-Label
Learning [97.88458953075205]
Pseudo-labeling has emerged as a popular and effective approach for utilizing unlabeled data.
This paper proposes a novel solution called Class-Aware Pseudo-Labeling (CAP) that performs pseudo-labeling in a class-aware manner.
arXiv Detail & Related papers (2023-05-04T12:52:18Z)
- Ambiguity-Resistant Semi-Supervised Learning for Dense Object Detection [98.66771688028426]
We propose an Ambiguity-Resistant Semi-supervised Learning (ARSL) method for one-stage detectors.
Joint-Confidence Estimation (JCE) is proposed to quantify the classification and localization quality of pseudo labels.
ARSL effectively mitigates the ambiguities and achieves state-of-the-art SSOD performance on MS COCO and PASCAL VOC.
arXiv Detail & Related papers (2023-03-27T07:46:58Z)
- Exploiting Completeness and Uncertainty of Pseudo Labels for Weakly
Supervised Video Anomaly Detection [149.23913018423022]
Weakly supervised video anomaly detection aims to identify abnormal events in videos using only video-level labels.
Two-stage self-training methods have achieved significant improvements by self-generating pseudo labels.
We propose an enhancement framework by exploiting completeness and uncertainty properties for effective self-training.
arXiv Detail & Related papers (2022-12-08T05:53:53Z)
- Multi-Label Gold Asymmetric Loss Correction with Single-Label Regulators [6.129273021888717]
We propose a novel Gold Asymmetric Loss Correction with Single-Label Regulators (GALC-SLR) that operates robustly against noisy labels.
GALC-SLR estimates the noise confusion matrix using single-label samples, then constructs an asymmetric loss correction via the estimated confusion matrix to avoid overfitting to the noisy labels.
Empirical results show that our method outperforms the state-of-the-art original asymmetric loss multi-label classifier under all corruption levels.
arXiv Detail & Related papers (2021-08-04T12:57:29Z)
- Distribution-Aware Semantics-Oriented Pseudo-label for Imbalanced
Semi-Supervised Learning [80.05441565830726]
This paper addresses imbalanced semi-supervised learning, where heavily biased pseudo-labels can harm the model performance.
Motivated by this observation, we propose a general pseudo-labeling framework to address the bias.
We term the novel pseudo-labeling framework for imbalanced SSL as Distribution-Aware Semantics-Oriented (DASO) Pseudo-label.
arXiv Detail & Related papers (2021-06-10T11:58:25Z)
- Exploiting Sample Uncertainty for Domain Adaptive Person
Re-Identification [137.9939571408506]
We estimate and exploit the credibility of the assigned pseudo-label of each sample to alleviate the influence of noisy labels.
Our uncertainty-guided optimization brings significant improvement and achieves the state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2020-12-16T04:09:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.