High-quality Task Division for Large-scale Entity Alignment
- URL: http://arxiv.org/abs/2208.10366v1
- Date: Mon, 22 Aug 2022 14:46:38 GMT
- Title: High-quality Task Division for Large-scale Entity Alignment
- Authors: Bing Liu, Wen Hua, Guido Zuccon, Genghong Zhao, Xia Zhang
- Abstract summary: DivEA achieves higher EA performance than alternative state-of-the-art solutions.
We devise a counterpart discovery method that exploits the locality principle of the EA task and the power of trained EA models.
- Score: 28.001266850114643
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Entity Alignment (EA) aims to match equivalent entities that refer to the
same real-world objects and is a key step for Knowledge Graph (KG) fusion. Most
neural EA models cannot be applied to large-scale real-life KGs due to their
excessive consumption of GPU memory and time. One promising solution is to
divide a large EA task into several subtasks such that each subtask only needs
to match two small subgraphs of the original KGs. However, it is challenging to
divide the EA task without losing effectiveness. Existing methods display low
coverage of potential mappings, insufficient evidence in context graphs, and
largely differing subtask sizes.
In this work, we design the DivEA framework for large-scale EA with
high-quality task division. To include in the EA subtasks a high proportion of
the potential mappings originally present in the large EA task, we devise a
counterpart discovery method that exploits the locality principle of the EA
task and the power of trained EA models. Unique to our counterpart discovery
method is the explicit modelling of the chance of a potential mapping. We also
introduce an evidence passing mechanism to quantify the informativeness of
context entities and find the most informative context graphs with flexible
control of the subtask size. Extensive experiments show that DivEA achieves
higher EA performance than alternative state-of-the-art solutions.
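The abstract describes two mechanisms: counterpart discovery, which combines the locality principle with similarities from a trained EA model to estimate each target entity's chance of being a counterpart, and budget-controlled selection of context entities. Below is a minimal, hypothetical sketch of the counterpart-discovery step; the function names, the additive scoring scheme, and all weights are illustrative assumptions, not the DivEA implementation.

```python
import numpy as np

def cosine_sim(a, b):
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def discover_counterparts(src_block, seeds, tgt_neigh, emb1, emb2, budget):
    """src_block: source-KG entity ids in one subtask (integer indices).
    seeds: dict source_id -> known target counterpart (training alignments).
    tgt_neigh: dict target_id -> iterable of neighbouring target ids.
    emb1, emb2: entity embeddings from a trained EA model.
    budget: number of target entities to include in this subtask."""
    chance = np.zeros(len(emb2))

    # Locality principle: targets at or near the counterparts of the block's
    # seed entities are likely to contain the unknown counterparts too.
    for s in src_block:
        if s in seeds:
            t = seeds[s]
            chance[t] += 1.0
            for n in tgt_neigh.get(t, ()):
                chance[n] += 0.5

    # Trained-model signal: embedding similarity between the block's
    # unlabelled source entities and every target entity.
    unlabelled = [s for s in src_block if s not in seeds]
    if unlabelled:
        sim = cosine_sim(emb1[unlabelled], emb2)   # (|unlabelled|, |targets|)
        chance += sim.max(axis=0)                  # best match per target

    # Keep the `budget` targets with the highest estimated chance of being
    # a counterpart of some entity in this subtask.
    return np.argsort(-chance)[:budget]
```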
Related papers
- Aligning Multiple Knowledge Graphs in a Single Pass [16.396492040719657]
We propose an effective framework named MultiEA to solve the problem of aligning multiple knowledge graphs.
In particular, we propose an innovative inference enhancement technique to improve the alignment performance.
The results show that our MultiEA can effectively and efficiently align multiple KGs in a single pass.
arXiv Detail & Related papers (2024-08-01T15:58:05Z)
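The MultiEA summary gives few specifics, so the sketch below only illustrates the general single-pass idea: if several KGs are embedded jointly into one shared space, matches for every KG pair can be read off in one inference pass, with no separate two-KG alignment model per pair. All names are hypothetical.

```python
import numpy as np

def normalise(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def single_pass_alignment(kg_embs):
    """kg_embs: list of (n_i, d) embedding matrices, one per KG, assumed to
    have been trained jointly so that all KGs share one vector space.
    Returns, for every ordered KG pair (i, j), the nearest neighbour in KG j
    of each entity of KG i, computed directly in the shared space."""
    kg_embs = [normalise(e) for e in kg_embs]
    matches = {}
    for i, ei in enumerate(kg_embs):
        for j, ej in enumerate(kg_embs):
            if i != j:
                matches[(i, j)] = (ei @ ej.T).argmax(axis=1)
    return matches
```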
- Localizing Task Information for Improved Model Merging and Compression [61.16012721460561]
We show that the information required to solve each task is still preserved after merging, as different tasks mostly use non-overlapping sets of weights.
We propose Consensus Merging, an algorithm that eliminates such weights and improves the general performance of existing model merging approaches.
arXiv Detail & Related papers (2024-05-13T14:54:37Z)
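A minimal sketch of the consensus idea stated above: weights that matter to several tasks are kept, while task-specific and irrelevant weights are eliminated before merging. The per-task importance test used here (top-magnitude entries of each task vector) and the parameter values are simplifying assumptions, not the paper's exact procedure.

```python
import numpy as np

def consensus_merge(pretrained, finetuned, keep_frac=0.2, min_tasks=2, alpha=0.4):
    """pretrained: flat np.ndarray of base-model weights.
    finetuned: list of flat np.ndarray, one fine-tuned model per task.
    A weight survives only if it is among the top `keep_frac` magnitudes of
    at least `min_tasks` task vectors, i.e. several tasks agree it matters."""
    task_vecs = [ft - pretrained for ft in finetuned]
    masks = []
    for tv in task_vecs:
        k = max(1, int(keep_frac * tv.size))
        thresh = np.partition(np.abs(tv), -k)[-k]      # k-th largest magnitude
        masks.append(np.abs(tv) >= thresh)
    consensus = np.sum(masks, axis=0) >= min_tasks     # weights several tasks use
    return pretrained + alpha * consensus * sum(task_vecs)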
- Understanding and Guiding Weakly Supervised Entity Alignment with Potential Isomorphism Propagation [31.558938631213074]
We present a propagation perspective to analyze weakly supervised EA.
We show that aggregation-based EA models seek propagation operators for pairwise entity similarities.
We develop a general EA framework, PipEA, incorporating this operator to improve the accuracy of every type of aggregation-based model.
arXiv Detail & Related papers (2024-02-05T14:06:15Z)
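The PipEA summary above describes propagation operators acting on pairwise entity similarities. The sketch below shows one generic similarity-propagation operator of that flavour (neighbours of similar entities become similar); the paper's actual operator may differ.

```python
import numpy as np

def propagate_similarity(S0, A1, A2, lam=0.9, iters=10):
    """S0: initial (n1, n2) entity-similarity matrix (e.g. from seeds/names).
    A1, A2: adjacency matrices of the two KGs.
    Repeatedly pushes similarity along edges: if u matches v, neighbours
    of u should grow similar to neighbours of v."""
    # Row-normalise adjacencies so each step averages over neighbours.
    d1 = A1.sum(axis=1, keepdims=True); d1[d1 == 0] = 1
    d2 = A2.sum(axis=1, keepdims=True); d2[d2 == 0] = 1
    P1, P2 = A1 / d1, A2 / d2
    S = S0.copy()
    for _ in range(iters):
        S = lam * (P1 @ S @ P2.T) + (1 - lam) * S0   # propagate + anchor to S0
    return S
```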
- Distribution Matching for Multi-Task Learning of Classification Tasks: a Large-Scale Study on Faces & Beyond [62.406687088097605]
Multi-Task Learning (MTL) is a framework in which multiple related tasks are learned jointly and benefit from a shared representation space.
We show that MTL can be successful even when the classification tasks have little or no annotation overlap.
We propose a novel approach, where knowledge exchange is enabled between the tasks via distribution matching.
arXiv Detail & Related papers (2024-01-02T14:18:11Z)
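For the distribution-matching idea above, a common way to let two tasks exchange knowledge through a shared representation space is to penalise the distance between their feature distributions. The sketch below uses an RBF-kernel MMD as that penalty; the paper's concrete matching objective may differ, and all names are illustrative.

```python
import torch

def mmd_loss(x, y, sigma=1.0):
    """RBF-kernel maximum mean discrepancy between two feature batches,
    used here as the distribution-matching term between tasks."""
    def k(a, b):
        return torch.exp(-torch.cdist(a, b) ** 2 / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

# Illustrative use inside a training step (all names hypothetical):
#   feats_a = shared_encoder(batch_a)   # features for task A's batch
#   feats_b = shared_encoder(batch_b)   # features for task B's batch
#   loss = loss_a + loss_b + lambda_dm * mmd_loss(feats_a, feats_b)
```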
- Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion [86.6191592951269]
Merging models that were fine-tuned from a common, extensively pretrained large model but specialized for different tasks has been demonstrated as a cheap and scalable strategy for constructing a multitask model that performs well across diverse tasks.
We propose the CONtinuous relaxation of discrete (Concrete) subspace learning method to identify a common low-dimensional subspace and use its shared information to tackle the interference problem without sacrificing performance.
arXiv Detail & Related papers (2023-12-11T07:24:54Z)
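The Concrete subspace entry above merges task-specific models through a shared low-dimensional subspace. As a simplifying stand-in for the learned Concrete mask, the sketch below extracts a shared subspace with a plain SVD over task vectors and merges only the components inside it; it illustrates the shared-subspace idea, not the paper's method.

```python
import numpy as np

def merge_in_shared_subspace(pretrained, finetuned, rank=8):
    """Stack the task vectors, extract their principal shared directions with
    an SVD, and merge only the component of the average task vector that lies
    in this low-dimensional subspace, discarding the task-specific directions
    that cause interference."""
    T = np.stack([(ft - pretrained).ravel() for ft in finetuned])  # (tasks, params)
    _, _, Vt = np.linalg.svd(T, full_matrices=False)
    basis = Vt[:min(rank, Vt.shape[0])]             # shared subspace basis
    delta = T.mean(axis=0) @ basis.T @ basis        # project mean task vector
    return pretrained + delta.reshape(pretrained.shape)
```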
- Toward Practical Entity Alignment Method Design: Insights from New Highly Heterogeneous Knowledge Graph Datasets [32.68422342604253]
We study the performance of entity alignment (EA) methods in practical settings, specifically focusing on the alignment of highly heterogeneous KGs (HHKGs).
Our findings reveal that, in aligning HHKGs, valuable structure information can hardly be exploited through message-passing and aggregation mechanisms.
These findings shed light on the potential problems associated with the conventional application of GNN-based methods as a panacea for all EA datasets.
arXiv Detail & Related papers (2023-04-07T04:10:26Z)
- Entity-Graph Enhanced Cross-Modal Pretraining for Instance-level Product Retrieval [152.3504607706575]
This research aims to conduct weakly-supervised multi-modal instance-level product retrieval for fine-grained product categories.
We first contribute the Product1M dataset and define two practical instance-level retrieval tasks.
We then train a more effective cross-modal model that adaptively incorporates key concept information from the multi-modal data.
arXiv Detail & Related papers (2022-06-17T15:40:45Z)
- ClusterEA: Scalable Entity Alignment with Stochastic Training and Normalized Mini-batch Similarities [26.724014626196322]
ClusterEA is capable of scaling up EA models and enhancing their results by leveraging normalization methods on mini-batches.
It first trains a large-scale GNN for EA in a stochastic fashion to produce entity embeddings.
Based on the embeddings, a novel ClusterSampler strategy is proposed for sampling highly overlapped mini-batches.
arXiv Detail & Related papers (2022-05-20T17:29:50Z)
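A sketch of a ClusterSampler-style batching strategy as described above: cluster the seed alignments so both sides of each seed pair land in the same mini-batch, then route the remaining entities to the nearest batch centroid in their own KG's embedding space. The specifics (KMeans, centroid routing) are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_sampler(emb1, emb2, seed_pairs, n_batches=4):
    """Cluster the source side of the seed alignments, send each seed's
    target to the same batch, then assign every remaining entity to the batch
    with the nearest centroid. Both sides of a mini-batch therefore stay
    highly overlapped, so normalised mini-batch similarities stay meaningful."""
    src = np.array([s for s, _ in seed_pairs])
    tgt = np.array([t for _, t in seed_pairs])
    km = KMeans(n_clusters=n_batches, n_init=10).fit(emb1[src])
    # Target-side centroids induced by the seed pairs of each cluster.
    cents2 = np.stack([emb2[tgt[km.labels_ == c]].mean(axis=0)
                       for c in range(n_batches)])
    lab1 = km.predict(emb1)                              # assign all KG1 entities
    lab2 = ((emb2[:, None, :] - cents2[None]) ** 2).sum(-1).argmin(1)
    return [(np.where(lab1 == c)[0], np.where(lab2 == c)[0])
            for c in range(n_batches)]
```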
- Good Visual Guidance Makes A Better Extractor: Hierarchical Visual Prefix for Multimodal Entity and Relation Extraction [88.6585431949086]
We propose a novel Hierarchical Visual Prefix fusion NeTwork (HVPNeT) for visual-enhanced entity and relation extraction.
We regard the visual representation as a pluggable visual prefix that guides the textual representation toward error-insensitive prediction decisions.
Experiments on three benchmark datasets demonstrate the effectiveness of our method, which achieves state-of-the-art performance.
arXiv Detail & Related papers (2022-05-07T02:10:55Z)
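The visual-prefix idea above can be pictured as projecting image features into a few pseudo-token embeddings that are prepended to the text sequence. This is a hypothetical minimal version; HVPNeT's actual hierarchical fusion is more involved, and all sizes and names below are assumptions.

```python
import torch
import torch.nn as nn

class VisualPrefix(nn.Module):
    """Project pooled image features into a few pseudo-token embeddings and
    prepend them to the text sequence, so a text encoder can attend to the
    image without architectural changes."""
    def __init__(self, vis_dim=2048, txt_dim=768, prefix_len=4):
        super().__init__()
        self.proj = nn.Linear(vis_dim, prefix_len * txt_dim)
        self.prefix_len, self.txt_dim = prefix_len, txt_dim

    def forward(self, vis_feat, txt_embeds):
        # vis_feat: (B, vis_dim); txt_embeds: (B, L, txt_dim)
        prefix = self.proj(vis_feat).view(-1, self.prefix_len, self.txt_dim)
        return torch.cat([prefix, txt_embeds], dim=1)  # (B, prefix_len + L, txt_dim)
```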
- An Accurate Unsupervised Method for Joint Entity Alignment and Dangling Entity Detection [0.3965019866400874]
We propose UED, a novel, accurate unsupervised method for joint entity alignment (EA) and dangling entity detection (DED).
We construct a medical cross-lingual knowledge graph dataset, MedED, providing data for both the EA and DED tasks.
In the EA task, UED achieves results comparable to those of state-of-the-art supervised EA baselines, and it outperforms the current state-of-the-art EA methods when combined with supervised EA data.
arXiv Detail & Related papers (2022-03-10T04:08:53Z)
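A minimal unsupervised sketch in the spirit of the joint EA/DED task above: accept a pair only when the two entities are mutual nearest neighbours with a clear margin over the runner-up, and flag source entities that fail the test as dangling. The mutual-NN-with-margin rule is an assumption for illustration, not UED's actual criterion.

```python
import numpy as np

def align_with_dangling(emb1, emb2, margin=0.05):
    """Accept (i, j) only if i and j are mutual nearest neighbours and the
    best score beats the runner-up by `margin`; otherwise flag i as dangling
    (no counterpart in the other KG)."""
    e1 = emb1 / np.linalg.norm(emb1, axis=1, keepdims=True)
    e2 = emb2 / np.linalg.norm(emb2, axis=1, keepdims=True)
    sim = e1 @ e2.T
    nn12, nn21 = sim.argmax(axis=1), sim.argmax(axis=0)
    aligned, dangling = {}, []
    for i, j in enumerate(nn12):
        runner_up = np.partition(sim[i], -2)[-2]      # second-best score
        if nn21[j] == i and sim[i, j] - runner_up >= margin:
            aligned[i] = int(j)
        else:
            dangling.append(i)
    return aligned, dangling
```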
- Prior Guided Feature Enrichment Network for Few-Shot Segmentation [64.91560451900125]
State-of-the-art semantic segmentation methods require sufficient labeled data to achieve good results.
Few-shot segmentation is proposed to tackle this problem by learning a model that quickly adapts to new classes with a few labeled support samples.
These frameworks still face the challenge of reduced generalization ability on unseen classes due to inappropriate use of high-level semantic information.
arXiv Detail & Related papers (2020-08-04T10:41:32Z)
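For the few-shot segmentation entry above, one known way to use high-level semantics without hurting generalisation is to turn them into a training-free similarity prior rather than feeding them into the learned head. The sketch below computes such a prior mask from cosine similarities between query features and masked support features; it is a simplified illustration, not the paper's full pipeline.

```python
import torch
import torch.nn.functional as F

def prior_mask(query_feat, support_feat, support_mask):
    """query_feat, support_feat: (C, H, W) high-level features;
    support_mask: (H, W) binary foreground mask for the support image.
    Returns an (H, W) prior in [0, 1]: how well each query location matches
    the support foreground, using high-level features only for similarity."""
    C, H, W = query_feat.shape
    sup = support_feat * support_mask                  # keep foreground features
    q = F.normalize(query_feat.reshape(C, -1), dim=0)  # (C, HW) unit columns
    s = F.normalize(sup.reshape(C, -1), dim=0)         # (C, HW)
    sim = q.T @ s                                      # pairwise cosine similarity
    prior = sim.max(dim=1).values.reshape(H, W)        # best support match per pixel
    prior = (prior - prior.min()) / (prior.max() - prior.min() + 1e-7)
    return prior
```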