Open Knowledge Base Canonicalization with Multi-task Learning
- URL: http://arxiv.org/abs/2403.14733v1
- Date: Thu, 21 Mar 2024 08:03:46 GMT
- Title: Open Knowledge Base Canonicalization with Multi-task Learning
- Authors: Bingchen Liu, Huang Peng, Weixin Zeng, Xiang Zhao, Shijun Liu, Li Pan
- Abstract summary: Large open knowledge bases (OKBs) are integral to many knowledge-driven applications on the World Wide Web, such as web search.
Noun phrases and relational phrases in OKBs often suffer from redundancy and ambiguity, which calls for investigation into OKB canonicalization.
Current solutions address OKB canonicalization by devising advanced clustering algorithms and using knowledge graph embedding (KGE) to further facilitate the canonicalization process.
We put forward a multi-task learning framework, namely MulCanon, to tackle OKB canonicalization.
- Score: 18.053863554106307
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The construction of large open knowledge bases (OKBs) is integral to many knowledge-driven applications on the World Wide Web, such as web search. However, noun phrases and relational phrases in OKBs often suffer from redundancy and ambiguity, which calls for investigation into OKB canonicalization. Current solutions address OKB canonicalization by devising advanced clustering algorithms and using knowledge graph embedding (KGE) to further facilitate the canonicalization process. Nevertheless, these works fail to fully exploit the synergy between clustering and KGE learning, and the methods designed for these subtasks are sub-optimal. To this end, we put forward a multi-task learning framework, namely MulCanon, to tackle OKB canonicalization. In addition, a diffusion model is used in the soft clustering process to enrich the noun phrase representations with neighboring information, which can lead to more accurate representations. MulCanon unifies the learning objectives of these subtasks and adopts a two-stage multi-task learning paradigm for training. A thorough experimental study on popular OKB canonicalization benchmarks validates that MulCanon can achieve competitive canonicalization results.
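To make the multi-task flavor concrete, here is a minimal sketch, assuming a DEC-style soft-clustering term and a TransE-style KGE term combined with fixed weights; the function names, the weights `alpha`/`beta`, and the specific loss forms are illustrative assumptions, not MulCanon's published objective. A two-stage schedule could, for instance, first emphasize the KGE term and later the clustering term.

```python
# Minimal sketch of a two-stage multi-task objective in the spirit of
# MulCanon: a weighted sum of a soft-clustering loss over noun-phrase
# embeddings and a KGE loss over triples. All names and weights here are
# illustrative assumptions, not the paper's actual formulation.
import torch
import torch.nn.functional as F

def soft_clustering_loss(np_emb, centroids, temperature=1.0):
    # Soft assignments: similarity of each noun-phrase embedding to each
    # cluster centroid, sharpened into a target distribution (DEC-style).
    logits = -torch.cdist(np_emb, centroids) / temperature
    q = F.softmax(logits, dim=1)                  # soft assignments
    p = (q ** 2) / q.sum(dim=0, keepdim=True)     # sharpened target
    p = p / p.sum(dim=1, keepdim=True)
    return F.kl_div(q.log(), p.detach(), reduction="batchmean")

def kge_loss(h, r, t, t_neg, margin=1.0):
    # TransE-style margin ranking: a positive triple (h, r, t) should
    # score better than a corrupted triple (h, r, t_neg).
    pos = (h + r - t).norm(p=2, dim=1)
    neg = (h + r - t_neg).norm(p=2, dim=1)
    return F.relu(margin + pos - neg).mean()

def multitask_loss(np_emb, centroids, h, r, t, t_neg, alpha=1.0, beta=1.0):
    # A stage-wise schedule could anneal alpha/beta; here they are fixed.
    return alpha * soft_clustering_loss(np_emb, centroids) + \
           beta * kge_loss(h, r, t, t_neg)
```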
Related papers
- Deep Boosting Learning: A Brand-new Cooperative Approach for Image-Text Matching [53.05954114863596]
We propose a brand-new Deep Boosting Learning (DBL) algorithm for image-text matching.
An anchor branch is first trained to provide insights into the data properties.
A target branch is concurrently tasked with more adaptive margin constraints to further enlarge the relative distance between matched and unmatched samples.
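A rough sketch of the anchor/target interplay described above, assuming scalar similarity scores per image-text pair; the margin values and the adaptation rule are invented for illustration, not DBL's actual formulation.

```python
# Illustrative sketch of the anchor/target idea behind Deep Boosting
# Learning: the anchor branch trains with a standard triplet margin,
# while the target branch uses a larger, anchor-informed margin to push
# matched and unmatched pairs further apart. Margins and the adaptation
# rule are assumptions for illustration.
import torch
import torch.nn.functional as F

def triplet_loss(sim_pos, sim_neg, margin):
    # Hinge on the gap between matched and unmatched similarities.
    return F.relu(margin - sim_pos + sim_neg).mean()

def dbl_losses(sim_pos_a, sim_neg_a, sim_pos_t, sim_neg_t,
               base_margin=0.2, boost=0.1):
    anchor = triplet_loss(sim_pos_a, sim_neg_a, base_margin)
    # The target branch's margin grows with the anchor branch's current
    # gap, asking the target to separate pairs more than the anchor does.
    adaptive = base_margin + boost * (sim_pos_a - sim_neg_a).detach().clamp(min=0)
    target = F.relu(adaptive - sim_pos_t + sim_neg_t).mean()
    return anchor, target
```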
arXiv Detail & Related papers (2024-04-28T08:44:28Z) - Open Knowledge Base Canonicalization with Multi-task Unlearning [19.130159457887]
MulCanon is a multi-task unlearning framework that tackles the machine unlearning problem in OKB canonicalization.
A thorough experimental study on popular OKB canonicalization datasets validates that MulCanon achieves advanced machine unlearning effects.
arXiv Detail & Related papers (2023-10-25T07:13:06Z) - Self-regulating Prompts: Foundational Model Adaptation without Forgetting [112.66832145320434]
We introduce a self-regularization framework for prompting called PromptSRC.
PromptSRC guides the prompts to optimize for both task-specific and task-agnostic general representations.
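A minimal sketch of the self-regularization idea, assuming access to features from a frozen copy of the pre-trained model and an L1 consistency term; the weight `lam` and the feature interfaces are assumptions, not PromptSRC's exact design.

```python
# Hedged sketch of the self-regularization idea in PromptSRC: alongside
# the task loss, prompted features are pulled toward the frozen
# pre-trained model's features so task-agnostic knowledge is retained.
# Function names and the weight `lam` are illustrative.
import torch
import torch.nn.functional as F

def promptsrc_style_loss(logits, labels, prompted_feat, frozen_feat, lam=1.0):
    task = F.cross_entropy(logits, labels)
    # L1 consistency between prompted and frozen (task-agnostic) features;
    # the frozen features carry no gradient.
    retain = F.l1_loss(prompted_feat, frozen_feat.detach())
    return task + lam * retain
```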
arXiv Detail & Related papers (2023-07-13T17:59:35Z) - Global Knowledge Calibration for Fast Open-Vocabulary Segmentation [124.74256749281625]
We introduce a text diversification strategy that generates a set of synonyms for each training category.
We also employ a text-guided knowledge distillation method to preserve the generalizable knowledge of CLIP.
Our proposed model achieves robust generalization performance across various datasets.
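A toy sketch of both steps under assumed interfaces; the `SYNONYMS` table and the cosine distillation form are placeholders, not the paper's actual components.

```python
# Minimal sketch of the two ideas summarized above: (1) each training
# category is expanded with synonyms, and (2) the student's text
# embeddings are distilled toward frozen CLIP embeddings so that
# generalizable knowledge is preserved.
import torch
import torch.nn.functional as F

SYNONYMS = {"sofa": ["couch", "settee"], "car": ["automobile", "auto"]}

def diversify(category):
    # One category name becomes a small set of text variants.
    return [category] + SYNONYMS.get(category, [])

def distill_loss(student_emb, clip_emb):
    # Cosine-similarity distillation toward frozen CLIP text embeddings.
    s = F.normalize(student_emb, dim=-1)
    t = F.normalize(clip_emb.detach(), dim=-1)
    return (1 - (s * t).sum(dim=-1)).mean()
```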
arXiv Detail & Related papers (2023-03-16T09:51:41Z) - Neural Coreference Resolution based on Reinforcement Learning [53.73316523766183]
Coreference resolution systems need to solve two subtasks.
One is to detect all potential mentions; the other is to learn the linking of an antecedent for each mention.
We propose a reinforcement learning actor-critic-based neural coreference resolution system.
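A simplified sketch of the linking subtask as an actor-critic update, with a softmax policy over candidate antecedents; the reward signal and networks are stand-ins, not the paper's architecture.

```python
# Sketch of the two coreference subtasks described above: score each
# mention's candidate antecedents, then let a policy (actor) sample a
# link while a critic's value estimate serves as the baseline.
import torch
import torch.nn.functional as F

def antecedent_policy(mention_emb, antecedent_embs):
    # Actor: a distribution over candidate antecedents for one mention.
    scores = antecedent_embs @ mention_emb           # (num_candidates,)
    return F.softmax(scores, dim=0)

def actor_critic_step(policy_probs, value_estimate, reward):
    # One REINFORCE-with-baseline update for a sampled linking decision.
    action = torch.multinomial(policy_probs, 1)
    advantage = reward - value_estimate.detach()
    actor_loss = -torch.log(policy_probs[action]) * advantage
    critic_loss = F.mse_loss(value_estimate, torch.tensor(float(reward)))
    return actor_loss.squeeze() + critic_loss
```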
arXiv Detail & Related papers (2022-12-18T07:36:35Z) - Joint Open Knowledge Base Canonicalization and Linking [24.160755953937763]
Noun phrases (NPs) and relation phrases (RPs) in Open Knowledge Bases are not canonicalized.
We propose a novel framework JOCL based on factor graph model to make them reinforce each other.
arXiv Detail & Related papers (2022-12-02T14:38:58Z) - Synergies between Disentanglement and Sparsity: Generalization and Identifiability in Multi-Task Learning [79.83792914684985]
We prove a new identifiability result that provides conditions under which maximally sparse base-predictors yield disentangled representations.
Motivated by this theoretical result, we propose a practical approach to learn disentangled representations based on a sparsity-promoting bi-level optimization problem.
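A hedged sketch of the sparsity idea: a shared encoder with per-task L1-penalized linear heads, written as a simple joint relaxation of the paper's bi-level problem rather than its actual optimization scheme.

```python
# Sparsity-promoting multi-task learning as summarized above: a shared
# encoder feeds per-task linear heads whose weights are L1-penalized, so
# each task selects only a few latent factors. The true method is a
# bi-level problem; this joint relaxation is only illustrative.
import torch
import torch.nn as nn

class SparseMultiTask(nn.Module):
    def __init__(self, in_dim, latent_dim, num_tasks):
        super().__init__()
        self.encoder = nn.Linear(in_dim, latent_dim)   # shared representation
        self.heads = nn.ModuleList(
            nn.Linear(latent_dim, 1) for _ in range(num_tasks)
        )

    def forward(self, x):
        z = torch.relu(self.encoder(x))
        return [head(z) for head in self.heads]

    def l1_penalty(self):
        # Encourage each task head to use few latent dimensions.
        return sum(head.weight.abs().sum() for head in self.heads)
```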
arXiv Detail & Related papers (2022-11-26T21:02:09Z) - CLOP: Video-and-Language Pre-Training with Knowledge Regularizations [43.09248976105326]
Video-and-language pre-training has shown promising results for learning generalizable representations.
We denote this form of representation as structural knowledge, which expresses rich semantics at multiple granularities.
We propose a Cross-modaL knOwledge-enhanced Pre-training (CLOP) method with Knowledge Regularizations.
arXiv Detail & Related papers (2022-11-07T05:32:12Z) - Multi-View Clustering for Open Knowledge Base Canonicalization [9.976636206355394]
Noun phrases and relation phrases in large open knowledge bases (OKBs) are not canonicalized.
We propose CMVC, a novel unsupervised framework that leverages two views of knowledge jointly for canonicalizing OKBs.
We demonstrate the superiority of our framework through extensive experiments on multiple real-world OKB data sets against state-of-the-art methods.
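A toy sketch of fusing two views before clustering, assuming one embedding matrix per view, average fusion, and a hierarchical agglomerative clustering backend; the fusion rule and clustering choice are assumptions, not CMVC's actual algorithm.

```python
# Multi-view idea in miniature: noun phrases get one embedding per view
# (e.g., a KGE "fact" view and a language-model "context" view), the
# per-view similarities are fused, and the fused matrix is clustered.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def fuse_and_cluster(view_a, view_b, threshold=0.5):
    # Cosine similarity within each view, then simple average fusion.
    def cos_sim(m):
        m = m / np.linalg.norm(m, axis=1, keepdims=True)
        return m @ m.T
    fused = (cos_sim(view_a) + cos_sim(view_b)) / 2
    # Hierarchical agglomerative clustering over the fused distances,
    # mirroring the HAC step common in OKB canonicalization pipelines.
    dist = 1 - fused
    z = linkage(dist[np.triu_indices_from(dist, k=1)], method="average")
    return fcluster(z, t=threshold, criterion="distance")
```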
arXiv Detail & Related papers (2022-06-22T14:23:16Z) - Supporting Vision-Language Model Inference with Confounder-pruning Knowledge Prompt [71.77504700496004]
Vision-language models are pre-trained by aligning image-text pairs in a common space to deal with open-set visual concepts.
To boost the transferability of the pre-trained models, recent works adopt fixed or learnable prompts.
However, how and what prompts can improve inference performance remains unclear.
arXiv Detail & Related papers (2022-05-23T07:51:15Z) - Joint Entity and Relation Canonicalization in Open Knowledge Graphs using Variational Autoencoders [11.259587284318835]
Noun phrases and relation phrases in open knowledge graphs are not canonicalized, leading to an explosion of redundant and ambiguous subject-relation-object triples.
Existing approaches take a two-step approach: first, they generate embedding representations for both noun and relation phrases; then a clustering algorithm groups them using the embeddings as features.
In this work, we propose Canonicalizing Using Variational AutoEncoders (CUVA), a joint model to learn both embeddings and cluster assignments in an end-to-end approach.
arXiv Detail & Related papers (2020-12-08T22:58:30Z)
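A hedged sketch of learning embeddings and cluster assignments end-to-end, using a VAE with a mixture-over-clusters prior; this GMM-VAE-style loss is a simplification in the spirit of CUVA, not its exact objective.

```python
# Joint embedding + clustering in one objective: a VAE whose latent
# prior is a mixture over K clusters, so the encoder's posterior also
# yields soft cluster assignments (unit-variance components for
# simplicity).
import torch
import torch.nn.functional as F

def gmm_vae_loss(x, x_recon, z, mu_q, logvar_q, cluster_means):
    recon = F.mse_loss(x_recon, x)
    # Responsibility of each cluster for z, from distances to its mean.
    logits = -torch.cdist(z, cluster_means)
    gamma = F.softmax(logits, dim=1)                 # soft assignments
    # KL between q(z|x) and each mixture component, weighted by gamma.
    kl_per_k = 0.5 * (
        (mu_q.unsqueeze(1) - cluster_means.unsqueeze(0)).pow(2).sum(-1)
        + logvar_q.exp().sum(-1, keepdim=True)
        - logvar_q.sum(-1, keepdim=True) - mu_q.size(-1)
    )
    return recon + (gamma * kl_per_k).mean()
```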