Self-attention Presents Low-dimensional Knowledge Graph Embeddings for
Link Prediction
- URL: http://arxiv.org/abs/2112.10644v1
- Date: Mon, 20 Dec 2021 16:11:01 GMT
- Title: Self-attention Presents Low-dimensional Knowledge Graph Embeddings for
Link Prediction
- Authors: Peyman Baghershahi, Reshad Hosseini, Hadi Moradi
- Abstract summary: Self-attention is the key to applying query-dependent projections to entities and relations.
Our model achieves performance comparable to or better than its three best recent state-of-the-art competitors.
- Score: 6.789370732159177
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, the link prediction problem, also known as knowledge graph completion, has attracted a great deal of research. Although a few recent models have attempted to achieve relatively good performance by embedding knowledge graphs in low dimensions, the best results of the current state-of-the-art models are earned at the cost of considerably increasing the dimensionality of the embeddings. This causes overfitting and, more importantly, scalability issues for huge knowledge bases. Inspired by recent advances in deep learning offered by variants of the Transformer model and its self-attention mechanism, in this paper we propose a Transformer-based model to address the aforementioned limitation. In our model, self-attention is the key to applying query-dependent projections to entities and relations, and to capturing the mutual information between them, in order to gain highly expressive representations from low-dimensional embeddings. Empirical results on two standard link prediction datasets, FB15k-237 and WN18RR, demonstrate that our model performs comparably to or better than its three best recent state-of-the-art competitors, while reducing the dimensionality of the embeddings by 76.3% on average.
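Although no code accompanies the abstract, its central mechanism can be sketched: a Transformer encoder contextualizes the (head entity, relation) pair so that each receives a query-dependent representation before candidate tails are scored. The sketch below is only an illustration of that idea, not the authors' implementation; the class name, layer sizes, and the multiplicative fusion step are assumptions.

```python
import torch
import torch.nn as nn

class AttentiveKGE(nn.Module):
    """Sketch: a small Transformer encoder contextualizes the
    (head entity, relation) pair, so each token's representation is
    query-dependent rather than fixed. All sizes are illustrative."""

    def __init__(self, n_entities, n_relations, dim=64, heads=4):
        super().__init__()
        self.ent = nn.Embedding(n_entities, dim)
        self.rel = nn.Embedding(n_relations, dim)
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=2 * dim,
            batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def score(self, head_idx, rel_idx):
        # Two-token sequence: [head entity, relation].
        seq = torch.stack([self.ent(head_idx), self.rel(rel_idx)], dim=1)
        ctx = self.encoder(seq)               # (batch, 2, dim)
        query = ctx[:, 0] * ctx[:, 1]         # fuse the contextualized pair
        # Score the fused query against every candidate tail entity.
        return query @ self.ent.weight.t()    # (batch, n_entities)

# FB15k-237 has 14,541 entities and 237 relations.
model = AttentiveKGE(n_entities=14541, n_relations=237)
scores = model.score(torch.tensor([0, 1]), torch.tensor([5, 7]))
```

Because the encoder attends over both tokens, the head's representation changes with the relation (and vice versa), which is how self-attention can stand in for the fixed projection matrices of conventional embedding models.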
Related papers
- STLM Engineering Report: Dropout [4.3600359083731695]
We find that dropout remains effective in the overfitting scenario, and that it may have some relevance for improving the fit of models even in the case of excess data.
In the process we find that the existing explanation for the mechanism behind this performance gain is not applicable in the case of language modelling.
arXiv Detail & Related papers (2024-09-09T08:24:29Z)
- On the Embedding Collapse when Scaling up Recommendation Models [53.66285358088788]
We identify the embedding collapse phenomenon as the inhibition of scalability, wherein the embedding matrix tends to occupy a low-dimensional subspace.
We propose a simple yet effective multi-embedding design incorporating embedding-set-specific interaction modules to learn embedding sets with large diversity (a minimal sketch appears after this list).
arXiv Detail & Related papers (2023-10-06T17:50:38Z)
- Autoregressive Uncertainty Modeling for 3D Bounding Box Prediction [63.3021778885906]
3D bounding boxes are a widespread intermediate representation in many computer vision applications.
We propose methods for leveraging our autoregressive model to make high-confidence predictions and produce meaningful uncertainty measures.
We release a simulated dataset, COB-3D, which highlights new types of ambiguity that arise in real-world robotics applications.
arXiv Detail & Related papers (2022-10-13T23:57:40Z)
- A Graph-Enhanced Click Model for Web Search [67.27218481132185]
We propose a novel graph-enhanced click model (GraphCM) for web search.
We exploit both intra-session and inter-session information to address the sparsity and cold-start problems.
arXiv Detail & Related papers (2022-06-17T08:32:43Z)
- Learning Representations of Entities and Relations [0.0]
This thesis focuses on improving knowledge graph representation with the aim of tackling the link prediction task.
The first contribution is HypER, a convolutional model that simplifies earlier convolutional architectures and improves link prediction performance.
The second contribution is TuckER, a relatively straightforward linear model which, at the time of its introduction, obtained state-of-the-art link prediction performance (a minimal scoring sketch appears after this list).
The third contribution is MuRP, the first multi-relational graph representation model embedded in hyperbolic space.
arXiv Detail & Related papers (2022-01-31T09:24:43Z)
- Leveraging Static Models for Link Prediction in Temporal Knowledge Graphs [0.0]
We show that SpliMe competes with or outperforms the current state of the art in temporal KGE.
We uncover issues with the procedure currently used to assess the performance of static models on temporal graphs.
arXiv Detail & Related papers (2021-06-29T10:15:17Z)
- Knowledge distillation: A good teacher is patient and consistent [71.14922743774864]
There is a growing discrepancy in computer vision between large-scale models that achieve state-of-the-art performance and models that are affordable in practical applications.
We identify certain implicit design choices, which may drastically affect the effectiveness of distillation.
We obtain a state-of-the-art ResNet-50 model for ImageNet, which achieves 82.8% top-1 accuracy.
arXiv Detail & Related papers (2021-06-09T17:20:40Z)
- Modeling Object Dissimilarity for Deep Saliency Prediction [86.14710352178967]
We introduce a detection-guided saliency prediction network that explicitly models the differences between multiple objects.
Our approach is general, allowing us to fuse our object dissimilarities with features extracted by any deep saliency prediction network.
arXiv Detail & Related papers (2021-04-08T16:10:37Z)
- Model-Agnostic Graph Regularization for Few-Shot Learning [60.64531995451357]
We present a comprehensive study on graph embedded few-shot learning.
We introduce a graph regularization approach that allows a deeper understanding of the impact of incorporating graph information between labels.
Our approach improves the performance of strong base learners by up to 2% on Mini-ImageNet and 6.7% on ImageNet-FS.
arXiv Detail & Related papers (2021-02-14T05:28:13Z)
- Realistic Re-evaluation of Knowledge Graph Completion Methods: An Experimental Study [0.0]
This paper is the first systematic study with the main objective of assessing the true effectiveness of embedding models.
Our experimental results show these models are much less accurate than previously perceived.
arXiv Detail & Related papers (2020-03-18T01:18:09Z)
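The embedding-collapse entry above describes a multi-embedding design with embedding-set-specific interaction modules; one plausible reading is several parallel embedding tables, each scored by its own small interaction network. The sketch below is only an illustration of that reading, not the paper's code; the class name, the MLP shape, and the averaging step are all assumptions.

```python
import torch
import torch.nn as nn

class MultiEmbedding(nn.Module):
    """Sketch: n_sets parallel embedding tables, each paired with its
    own (embedding-set-specific) interaction MLP, so the sets can stay
    diverse instead of collapsing into one low-dimensional subspace."""

    def __init__(self, n_users, n_items, dim=16, n_sets=4):
        super().__init__()
        self.user_tables = nn.ModuleList(
            [nn.Embedding(n_users, dim) for _ in range(n_sets)])
        self.item_tables = nn.ModuleList(
            [nn.Embedding(n_items, dim) for _ in range(n_sets)])
        self.interactions = nn.ModuleList(
            [nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                           nn.Linear(dim, 1))
             for _ in range(n_sets)])

    def forward(self, user_idx, item_idx):
        # Each set scores the (user, item) pair with its own module;
        # the final prediction averages the per-set scores.
        scores = [head(torch.cat([ut(user_idx), it(item_idx)], dim=-1))
                  for ut, it, head in zip(self.user_tables,
                                          self.item_tables,
                                          self.interactions)]
        return torch.stack(scores).mean(dim=0).squeeze(-1)
```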
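TuckER, from the thesis entry above, scores a triple by contracting a shared core tensor with the subject, relation, and object embeddings (a Tucker decomposition). The sketch below follows that published scoring form, but the initialization, the dimensions, and the 1-N scoring layout are illustrative choices rather than the original implementation.

```python
import torch
import torch.nn as nn

class TuckER(nn.Module):
    """Sketch of TuckER's published scoring form:
    score(s, r, o) = W x_1 e_s x_2 w_r x_3 e_o,
    a Tucker-decomposition product with a shared core tensor W."""

    def __init__(self, n_entities, n_relations, de=64, dr=32):
        super().__init__()
        self.E = nn.Embedding(n_entities, de)      # entity embeddings
        self.R = nn.Embedding(n_relations, dr)     # relation embeddings
        self.W = nn.Parameter(torch.randn(de, dr, de) * 0.1)  # core tensor

    def forward(self, s_idx, r_idx):
        e_s = self.E(s_idx)                        # (batch, de)
        w_r = self.R(r_idx)                        # (batch, dr)
        # Contract the core tensor with the subject and relation,
        # then score against every candidate object entity (1-N scoring).
        x = torch.einsum('bd,drk,br->bk', e_s, self.W, w_r)
        return x @ self.E.weight.t()               # (batch, n_entities)
```

Keeping the relation dimension dr smaller than the entity dimension de, as in this sketch, is one way such a linear model stays parameter-efficient.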
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.