Comprehensive Analysis of Negative Sampling in Knowledge Graph
Representation Learning
- URL: http://arxiv.org/abs/2206.10140v1
- Date: Tue, 21 Jun 2022 06:51:33 GMT
- Title: Comprehensive Analysis of Negative Sampling in Knowledge Graph
Representation Learning
- Authors: Hidetaka Kamigaito, Katsuhiko Hayashi
- Abstract summary: Negative sampling (NS) loss plays an important role in learning knowledge graph embedding (KGE) to handle a huge number of entities.
We theoretically analyzed the NS loss to assist hyperparameter tuning and to better understand how to use the NS loss in KGE learning.
Our empirical analysis on the FB15k-237, WN18RR, and YAGO3-10 datasets showed that the results of actually trained models agree with our theoretical findings.
- Score: 25.664174172917345
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Negative sampling (NS) loss plays an important role in learning knowledge
graph embedding (KGE) to handle a huge number of entities. However, the
performance of KGE degrades unless hyperparameters of the NS loss, such as the
margin term and the number of negative samples, are appropriately selected.
Currently, empirical hyperparameter tuning addresses this problem at the cost
of computational time. To solve this problem, we theoretically analyzed the NS
loss to assist hyperparameter tuning and to better understand how to use the NS
loss in KGE learning. Our theoretical analysis showed that scoring methods with
restricted value ranges, such as TransE and RotatE, require settings of the
margin term or the number of negative samples that differ from those of scoring
methods without restricted value ranges, such as RESCAL, ComplEx, and DistMult.
We also propose subsampling methods specialized for the NS loss in KGE, which
we likewise study from a theoretical aspect. Our empirical analysis on the
FB15k-237, WN18RR, and YAGO3-10 datasets showed that the results of actually
trained models agree with our theoretical findings.
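To make the abstract's quantities concrete, the following is a minimal, hedged sketch (not the authors' implementation; the embedding dimension, margin gamma, and number of negatives nu are arbitrary assumptions) of a basic NS loss with a margin term and nu negative samples, evaluated with one bounded scoring function (TransE, whose score -||h + r - t|| never exceeds 0) and one unbounded scoring function (DistMult, a trilinear product).

```python
import numpy as np

rng = np.random.default_rng(0)
dim, nu, gamma = 50, 32, 9.0  # embedding size, number of negatives, margin (arbitrary)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def score_transe(h, r, t):
    # TransE: the score -||h + r - t|| is bounded above by 0.
    return -np.linalg.norm(h + r - t)

def score_distmult(h, r, t):
    # DistMult: a trilinear product with no restriction on its value range.
    return float(np.sum(h * r * t))

def ns_loss(score_fn, h, r, t, neg_tails, gamma):
    # Basic NS loss: -log sigma(gamma + s_pos) - (1/nu) * sum log sigma(-gamma - s_neg).
    pos = np.log(sigmoid(gamma + score_fn(h, r, t)))
    neg = np.mean([np.log(sigmoid(-gamma - score_fn(h, r, nt))) for nt in neg_tails])
    return -(pos + neg)

h, r, t = (rng.normal(size=dim) for _ in range(3))   # toy entity/relation embeddings
neg_tails = rng.normal(size=(nu, dim))               # nu randomly sampled negative tails

print("TransE   NS loss:", ns_loss(score_transe,   h, r, t, neg_tails, gamma))
print("DistMult NS loss:", ns_loss(score_distmult, h, r, t, neg_tails, gamma))
```

Because TransE's score is capped at 0 while DistMult's can grow without bound, the same margin and negative-sample settings place the two models in very different regions of the sigmoid, which is the intuition behind the finding that the two families need different hyperparameter adjustments.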
Related papers
- A Theoretical Analysis of Recommendation Loss Functions under Negative Sampling [13.180345241212423]
This paper conducts a comparative analysis of prevalent loss functions in Recommender Systems (RSs).
We show that Binary Cross-Entropy (BCE), Categorical Cross-Entropy (CCE), and Bayesian Personalized Ranking (BPR) are equivalent when one negative sample is used (see the sketch after this entry).
arXiv Detail & Related papers (2024-11-12T13:06:16Z)
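To make the single-negative claim above concrete, here is a small, hedged sketch using generic textbook forms of the three losses with arbitrary example scores s_pos and s_neg; it is not code from the paper. With one negative sample, the two-item softmax in CCE equals sigma(s_pos - s_neg), so CCE and BPR produce identical values; BCE is printed alongside for comparison, with its equivalence established in the paper's analysis rather than here.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bce(s_pos, s_neg):
    # Binary cross-entropy with a single negative sample.
    return -math.log(sigmoid(s_pos)) - math.log(1.0 - sigmoid(s_neg))

def cce(s_pos, s_neg):
    # Categorical cross-entropy (softmax) over the positive and one negative.
    return -math.log(math.exp(s_pos) / (math.exp(s_pos) + math.exp(s_neg)))

def bpr(s_pos, s_neg):
    # Bayesian Personalized Ranking: -log sigmoid of the score difference.
    return -math.log(sigmoid(s_pos - s_neg))

s_pos, s_neg = 2.3, -0.7          # arbitrary example scores
print("BCE:", bce(s_pos, s_neg))
print("CCE:", cce(s_pos, s_neg))  # equals BPR: two-item softmax = sigmoid(s_pos - s_neg)
print("BPR:", bpr(s_pos, s_neg))
```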
- Unified Interpretation of Smoothing Methods for Negative Sampling Loss Functions in Knowledge Graph Embedding [31.26112477399022]
This paper provides theoretical interpretations of the smoothing methods for the Negative Sampling (NS) loss in Knowledge Graphs (KGs).
It induces a new NS loss, Triplet Adaptive Negative Sampling (TANS), that can cover the characteristics of the conventional smoothing methods.
arXiv Detail & Related papers (2024-07-05T04:38:17Z)
- Efficient kernel surrogates for neural network-based regression [0.8030359871216615]
We study the performance of the Conjugate Kernel (CK), an efficient approximation to the Neural Tangent Kernel (NTK).
We show that the CK's performance is only marginally worse than that of the NTK and, in certain cases, even superior.
In addition to providing a theoretical grounding for using CKs instead of NTKs, our framework suggests a recipe for improving DNN accuracy inexpensively.
arXiv Detail & Related papers (2023-10-28T06:41:47Z)
- Joint Edge-Model Sparse Learning is Provably Efficient for Graph Neural Networks [89.28881869440433]
This paper provides the first theoretical characterization of joint edge-model sparse learning for graph neural networks (GNNs).
It proves analytically that both sampling important nodes and pruning the lowest-magnitude neurons can reduce the sample complexity and improve convergence without compromising test accuracy.
arXiv Detail & Related papers (2023-02-06T16:54:20Z)
- Comprehensive Graph Gradual Pruning for Sparse Training in Graph Neural Networks [52.566735716983956]
We propose a graph gradual pruning framework termed CGP to dynamically prune GNNs.
Unlike LTH-based methods, the proposed CGP approach requires no re-training, which significantly reduces the computation costs.
Our proposed strategy greatly improves both training and inference efficiency while matching or even exceeding the accuracy of existing methods.
arXiv Detail & Related papers (2022-07-18T14:23:31Z)
- Cramér-Rao bound-informed training of neural networks for quantitative MRI [11.964144201247198]
Neural networks are increasingly used to estimate parameters in quantitative MRI, in particular in magnetic resonance fingerprinting.
Their advantages are their superior speed and their dominance over the non-efficient unbiased estimator.
We find, however, that heterogeneous parameters are hard to estimate.
We propose a well-founded Cramér-Rao loss function, which normalizes the squared error by the respective CRB (see the sketch after this entry).
arXiv Detail & Related papers (2021-09-22T06:38:03Z)
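The loss described above can be sketched in a few lines; this is a generic rendering of "squared error normalized by the respective CRB", with made-up parameter values, estimates, and bounds, not the authors' implementation. Dividing each parameter's squared error by its CRB puts parameters of very different scales on a comparable footing, which is how such a loss addresses the heterogeneous-parameter issue mentioned in the summary.

```python
import numpy as np

def crb_normalized_loss(theta_hat, theta_true, crb):
    # Squared error of each parameter normalized by its Cramer-Rao bound, then averaged.
    return np.mean((theta_hat - theta_true) ** 2 / crb)

# Hypothetical example: two parameters on very different scales (all values made up).
theta_true = np.array([1.2, 0.045])      # ground-truth parameters
theta_hat  = np.array([1.25, 0.050])     # network estimates
crb        = np.array([2.5e-3, 4.0e-6])  # per-parameter CRBs

print("CRB-normalized loss:", crb_normalized_loss(theta_hat, theta_true, crb))
print("Plain MSE:", np.mean((theta_hat - theta_true) ** 2))
```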
- A Biased Graph Neural Network Sampler with Near-Optimal Regret [57.70126763759996]
Graph neural networks (GNN) have emerged as a vehicle for applying deep network architectures to graph and relational data.
In this paper, we build upon existing work and treat GNN neighbor sampling as a multi-armed bandit problem.
We introduce a newly designed reward function that accepts some degree of bias in order to reduce variance and avoid unstable, possibly unbounded payouts.
arXiv Detail & Related papers (2021-03-01T15:55:58Z)
- RelWalk A Latent Variable Model Approach to Knowledge Graph Embedding [50.010601631982425]
This paper extends the random walk model (Arora et al., 2016a) of word embeddings to Knowledge Graph Embeddings (KGEs).
We derive a scoring function that evaluates the strength of a relation R between two entities h (head) and t (tail).
We propose a learning objective motivated by the theoretical analysis to learn KGEs from a given knowledge graph.
arXiv Detail & Related papers (2021-01-25T13:31:29Z)
- Characterizing the loss landscape of variational quantum circuits [77.34726150561087]
We introduce a way to compute the Hessian of the loss function of VQCs.
We show how this information can be interpreted and compared to classical neural networks.
arXiv Detail & Related papers (2020-08-06T17:48:12Z)
- Unbiased Risk Estimators Can Mislead: A Case Study of Learning with Complementary Labels [92.98756432746482]
We study a weakly supervised problem called learning with complementary labels.
We show that the quality of gradient estimation matters more in risk minimization.
We propose a novel surrogate complementary loss (SCL) framework that trades zero bias for reduced variance.
arXiv Detail & Related papers (2020-07-05T04:19:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.