Lambda: Learning Matchable Prior For Entity Alignment with Unlabeled Dangling Cases
- URL: http://arxiv.org/abs/2403.10978v2
- Date: Sat, 09 Nov 2024 18:46:07 GMT
- Title: Lambda: Learning Matchable Prior For Entity Alignment with Unlabeled Dangling Cases
- Authors: Hang Yin, Liyao Xiang, Dong Ding, Yuheng He, Yihan Wu, Xinbing Wang, Chenghu Zhou
- Abstract summary: We propose the framework Lambda for dangling detection and entity alignment.
Lambda features a GNN-based encoder called KEESA with spectral contrastive learning for EA and a positive-unlabeled learning algorithm for dangling detection called iPULE.
- Score: 49.86384156476041
- Abstract: We investigate the entity alignment (EA) problem with unlabeled dangling cases, meaning that some entities have no counterpart in the other knowledge graph (KG) and remain unlabeled. To address this challenge, we propose the framework Lambda for dangling detection followed by entity alignment. Lambda features a GNN-based encoder called KEESA with spectral contrastive learning for EA and a positive-unlabeled learning algorithm for dangling detection called iPULE. iPULE offers theoretical guarantees of unbiasedness, uniform deviation bounds, and convergence. Experimental results demonstrate that each component contributes to overall performance superior to the baselines, even when the baselines additionally exploit 30% of dangling entities labeled for training.
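The abstract does not spell iPULE out, but the unbiasedness guarantee it cites is the hallmark of unbiased PU risk estimation. Purely as a point of reference, here is a minimal sketch of the classical unbiased PU risk estimator in the style of du Plessis et al.; the logistic loss and the known class prior `pi_p` are assumptions of this sketch, not details taken from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def unbiased_pu_risk(scores_p, scores_u, pi_p):
    """Classical unbiased PU risk estimate with the logistic loss.

    scores_p: classifier scores on labeled-positive examples
    scores_u: classifier scores on unlabeled examples
    pi_p:     class prior Pr(y = +1), assumed known or separately estimated
    """
    loss_pos   = -np.log(sigmoid(scores_p) + 1e-12)        # predict +1 on positives
    loss_neg_p = -np.log(1.0 - sigmoid(scores_p) + 1e-12)  # predict -1 on positives
    loss_neg_u = -np.log(1.0 - sigmoid(scores_u) + 1e-12)  # predict -1 on unlabeled
    # The negative-class risk pi_n * E_n[l(f, -1)] is recovered without any
    # labeled negatives as E_u[l(f, -1)] - pi_p * E_p[l(f, -1)].
    return pi_p * loss_pos.mean() + (loss_neg_u.mean() - pi_p * loss_neg_p.mean())
```

On finite samples the bracketed negative term can dip below zero; clamping it at zero gives the well-known non-negative (nnPU) correction.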
Related papers
- Unsupervised Robust Cross-Lingual Entity Alignment via Neighbor Triple Matching with Entity and Relation Texts [17.477542644785483]
Cross-lingual entity alignment (EA) enables the integration of multiple knowledge graphs (KGs) across different languages.
An EA pipeline that jointly performs entity-level and relation-level alignment via a neighbor triple matching strategy.
arXiv Detail & Related papers (2024-07-22T12:25:48Z)
- Provable Optimization for Adversarial Fair Self-supervised Contrastive Learning [49.417414031031264]
This paper studies learning fair encoders in a self-supervised learning setting.
All data are unlabeled and only a small portion of them are annotated with sensitive attributes.
arXiv Detail & Related papers (2024-06-09T08:11:12Z)
- Unraveling the Impact of Heterophilic Structures on Graph Positive-Unlabeled Learning [71.9954600831939]
Positive-Unlabeled (PU) learning is vital in many real-world scenarios, but its application to graph data remains under-explored.
We unveil that a critical challenge for PU learning on graphs lies in edge heterophily, which directly violates the irreducibility assumption for Class-Prior Estimation.
In response to this challenge, we introduce a new method named Graph PU Learning with Label Propagation Loss (GPL).
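GPL's actual loss is not given in the summary above; purely as orientation, here is a minimal sketch of classic label propagation in the style of Zhou et al. (2004), the family of techniques the method's name points to. The adjacency matrix, seed labels, and hyperparameters are illustrative assumptions.

```python
import numpy as np

def propagate_labels(A, y_seed, alpha=0.9, iters=50):
    """Classic label propagation over a symmetrically normalized graph.

    A:      (n, n) symmetric, non-negative adjacency matrix
    y_seed: (n, k) one-hot labels on seed nodes, zero rows elsewhere
    alpha:  mixing weight between graph smoothing and the seed labels
    """
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    S = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]  # D^{-1/2} A D^{-1/2}
    F = y_seed.astype(float).copy()
    for _ in range(iters):
        F = alpha * (S @ F) + (1.0 - alpha) * y_seed   # smooth, then re-anchor seeds
    return F  # row-wise class scores; argmax gives the propagated label
```

Propagation of this kind implicitly assumes homophily (edges connect like nodes), which is precisely the assumption that heterophilic structures break.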
arXiv Detail & Related papers (2024-05-30T10:30:44Z)
- Robust Representation Learning for Unreliable Partial Label Learning [86.909511808373]
Partial Label Learning (PLL) is a type of weakly supervised learning where each training instance is assigned a set of candidate labels, but only one label is the ground-truth.
This is known as Unreliable Partial Label Learning (UPLL), which introduces additional complexity due to the inherent unreliability and ambiguity of partial labels.
We propose the Unreliability-Robust Representation Learning framework (URRL), which leverages unreliability-robust contrastive learning to fortify the model against unreliable partial labels.
arXiv Detail & Related papers (2023-08-31T13:37:28Z)
- LightEA: A Scalable, Robust, and Interpretable Entity Alignment Framework via Three-view Label Propagation [27.483109233276632]
We argue that existing GNN-based EA methods inherit the inborn defects from their neural network lineage: weak scalability and poor interpretability.
We propose a non-neural EA framework -- LightEA, consisting of three efficient components: (i) Random Orthogonal Label Generation, (ii) Three-view Label Propagation, and (iii) Sparse Sinkhorn Iteration.
According to the extensive experiments on public datasets, LightEA has impressive scalability, robustness, and interpretability.
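Of the three components, the Sinkhorn step is the most self-contained. Below is a dense sketch of classical Sinkhorn normalization, which alternately rescales rows and columns of a similarity matrix toward a doubly stochastic alignment; LightEA's sparse variant and its exact parameters are not reproduced here.

```python
import numpy as np

def sinkhorn(sim, iters=20, tau=0.05):
    """Turn a similarity matrix into an (approximately) doubly stochastic
    soft alignment between two entity sets.

    sim:  (n, m) entity-to-entity similarity scores
    tau:  temperature; smaller values sharpen the final assignment
    """
    K = np.exp(sim / tau)                      # positive kernel matrix
    for _ in range(iters):
        K = K / K.sum(axis=1, keepdims=True)   # normalize rows
        K = K / K.sum(axis=0, keepdims=True)   # normalize columns
    return K  # K[i, j] ~ probability that entity i aligns to entity j
```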
arXiv Detail & Related papers (2022-10-19T10:07:08Z)
- Optimizing Bi-Encoder for Named Entity Recognition via Contrastive Learning [80.36076044023581]
We present an efficient bi-encoder framework for named entity recognition (NER).
We frame NER as a metric learning problem that maximizes the similarity between the vector representations of an entity mention and its type.
A major challenge to this bi-encoder formulation for NER lies in separating non-entity spans from entity mentions.
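One common remedy, sketched below on the metric-learning formulation the summary describes, is to score each candidate span against a table of type embeddings that includes a dedicated 'none' row for non-entity spans. The shapes, the temperature, and the 'none' row are illustrative assumptions, not the paper's actual design.

```python
import numpy as np

def type_contrastive_loss(span_emb, type_emb, gold_type, tau=0.07):
    """Cross-entropy over entity types driven by cosine similarity.

    span_emb:  (d,) encoding of one candidate span
    type_emb:  (T, d) one row per type; a dedicated 'none' row gives
               non-entity spans something to be pulled toward
    gold_type: index of the correct row (possibly 'none') for this span
    """
    s = span_emb / np.linalg.norm(span_emb)
    t = type_emb / np.linalg.norm(type_emb, axis=1, keepdims=True)
    logits = (t @ s) / tau                 # scaled cosine similarities
    logits -= logits.max()                 # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum())
    return -log_prob[gold_type]
```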
arXiv Detail & Related papers (2022-08-30T23:19:04Z)
- Nested Named Entity Recognition as Latent Lexicalized Constituency Parsing [29.705133932275892]
Recently, Fu et al. (2021) adapt a span-based constituency parser to tackle nested NER.
In this work, we resort to more expressive structures, lexicalized constituency trees in which constituents are annotated by headwords.
We leverage the Eisner-Satta algorithm to perform partial marginalization and inference efficiently.
arXiv Detail & Related papers (2022-03-09T12:02:59Z)
- ActiveEA: Active Learning for Neural Entity Alignment [31.212894129845093]
Entity alignment (EA) aims to match equivalent entities across different Knowledge Graphs (KGs).
Current mainstream methods -- neural EA models -- rely on training with seed alignment, i.e., a set of pre-aligned entity pairs.
We devise a novel Active Learning (AL) framework for neural EA, aiming to create highly informative seed alignment.
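The acquisition strategy is not described in the summary above; as a generic illustration only, a margin-based uncertainty sampler (a standard AL heuristic, not necessarily ActiveEA's) would pick the source entities whose best and second-best target matches are hardest to tell apart.

```python
import numpy as np

def pick_queries(sim, budget=10):
    """Margin-based uncertainty sampling for seed-alignment annotation.

    sim:    (n, m) similarity of each unlabeled source entity to all targets
    budget: number of source entities to send to the annotator
    """
    top2 = np.sort(sim, axis=1)[:, -2:]    # two highest target scores per row
    margin = top2[:, 1] - top2[:, 0]       # small margin = ambiguous match
    return np.argsort(margin)[:budget]     # query the most ambiguous entities
```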
arXiv Detail & Related papers (2021-10-13T03:38:04Z)
- Towards Entity Alignment in the Open World: An Unsupervised Approach [29.337157862514204]
Entity alignment is a pivotal step for integrating knowledge graphs (KGs) to increase knowledge coverage and quality.
State-of-the-art solutions tend to rely on labeled data for model training.
We offer an unsupervised framework that performs entity alignment in the open world.
arXiv Detail & Related papers (2021-01-26T03:10:24Z)
- Structured Prediction with Partial Labelling through the Infimum Loss [85.4940853372503]
The goal of weak supervision is to enable models to learn using only forms of labelling which are cheaper to collect.
Partial labelling is a type of incomplete annotation where, for each datapoint, supervision is cast as a set of labels containing the real one.
This paper provides a unified framework based on structured prediction and on the concept of infimum loss to deal with partial labelling.
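The loss itself is compact: writing $z$ for a prediction, $S$ for the candidate label set attached to a datapoint, and $\ell$ for a base loss, the infimum loss scores $z$ against the most favorable label in the set,

```latex
L(z, S) = \inf_{y \in S} \ell(z, y)
```

so that when $S = \{y\}$ it reduces to the ordinary supervised loss $\ell(z, y)$.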
arXiv Detail & Related papers (2020-03-02T13:59:41Z)