Efficient, Simple and Automated Negative Sampling for Knowledge Graph
Embedding
- URL: http://arxiv.org/abs/2010.14227v2
- Date: Wed, 14 Jul 2021 01:52:43 GMT
- Title: Efficient, Simple and Automated Negative Sampling for Knowledge Graph
Embedding
- Authors: Yongqi Zhang and Quanming Yao and Lei Chen
- Abstract summary: Negative sampling, which samples negative triplets from the non-observed ones in a knowledge graph (KG), is an essential step in KG embedding.
In this paper, motivated by the observation that negative triplets with large gradients are important but rare, we propose to directly keep track of them with a cache.
Our method acts as a "distilled" version of previous GAN-based methods, which does not waste training time on additional parameters to fit the full distribution of negative triplets.
- Score: 40.97648142355799
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Negative sampling, which samples negative triplets from non-observed ones in
a knowledge graph (KG), is an essential step in KG embedding. Recently,
generative adversarial networks (GANs) have been introduced into negative sampling.
By sampling negative triplets with large gradients, these methods avoid the
problem of vanishing gradient and thus obtain better performance. However, they
make the original model more complex and harder to train. In this paper,
motivated by the observation that negative triplets with large gradients are
important but rare, we propose to directly keep track of them with a cache.
In this way, our method acts as a "distilled" version of previous GAN-based
methods, which does not waste training time on additional parameters to fit the
full distribution of negative triplets. However, how to sample from and update
the cache are two critical questions. We propose to solve these issues by
automated machine learning techniques. The automated version also covers
GAN-based methods as special cases. A theoretical explanation of NSCaching is
also provided, justifying its superiority over fixed sampling schemes. Besides,
we further extend NSCaching with the skip-gram model for graph embedding. Finally,
extensive experiments show that our method can gain significant improvements on
various KG embedding models and the skip-gram model, and outperforms the
state-of-the-art negative sampling methods.
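As a rough illustration of the cache idea in the abstract, the sketch below maintains, for each positive triplet, a small cache of corrupted candidates and periodically refreshes it with uniformly drawn entities, keeping only the highest scorers (high score standing in for large gradient). This is a minimal sketch, not the paper's released code: `NegativeCache`, `score_fn`, `cache_size`, and `n_candidates` are illustrative names, and the uniform in-cache sampling is a stand-in for the automated scheme the paper tunes.

```python
import numpy as np

class NegativeCache:
    """Minimal sketch of cache-based negative sampling (illustrative only)."""

    def __init__(self, n_entities, cache_size=50, n_candidates=200, seed=0):
        self.n_entities = n_entities      # entities available for corruption
        self.cache_size = cache_size      # negatives kept per positive triplet
        self.n_candidates = n_candidates  # fresh uniform candidates per refresh
        self.rng = np.random.default_rng(seed)
        self.cache = {}                   # (h, r, t) -> array of candidate entities

    def update(self, triplet, score_fn):
        """Refresh the cache: mix old entries with uniform candidates and
        keep the highest scorers, which roughly have the largest gradients."""
        old = self.cache.get(triplet, np.empty(0, dtype=np.int64))
        fresh = self.rng.integers(0, self.n_entities, size=self.n_candidates)
        pool = np.unique(np.concatenate([old, fresh]))
        scores = score_fn(triplet, pool)  # model scores of the corrupted triplets
        keep = np.argsort(-scores)[: self.cache_size]
        self.cache[triplet] = pool[keep]

    def sample(self, triplet, score_fn):
        """Draw one negative entity from the freshly updated cache."""
        self.update(triplet, score_fn)
        return self.rng.choice(self.cache[triplet])
```

In training, `score_fn` would evaluate the current KG embedding model on the corrupted triplets, so the cache concentrates on exactly the rare, hard negatives the abstract describes.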
Related papers
- Bridging the Gap: Addressing Discrepancies in Diffusion Model Training
for Classifier-Free Guidance [1.6804613362826175]
Diffusion models have emerged as a pivotal advancement in generative models.
In this paper we aim to underscore a discrepancy between conventional training methods and the desired conditional sampling behavior.
We introduce an updated loss function that better aligns training objectives with sampling behaviors.
arXiv Detail & Related papers (2023-11-02T02:03:12Z) - Gradient Surgery for One-shot Unlearning on Generative Model [0.989293617504294]
We introduce a simple yet effective approach to removing the influence of specific data on a deep generative model.
Inspired by works in multi-task learning, we propose to manipulate gradients to regularize the interplay of influence among samples.
arXiv Detail & Related papers (2023-07-10T13:29:23Z) - Winner-Take-All Column Row Sampling for Memory Efficient Adaptation of Language Model [89.8764435351222]
We propose a new family of unbiased estimators, called WTA-CRS, for matrix multiplication with reduced variance.
Our work provides both theoretical and experimental evidence that, in the context of tuning transformers, our proposed estimators exhibit lower variance compared to existing ones.
arXiv Detail & Related papers (2023-05-24T15:52:08Z) - Post-Processing Temporal Action Detection [134.26292288193298]
Temporal Action Detection (TAD) methods typically apply a pre-processing step that converts an input video of varying length into a fixed-length sequence of snippet representations.
This pre-processing step would temporally downsample the video, reducing the inference resolution and hampering the detection performance in the original temporal resolution.
We introduce a novel model-agnostic post-processing method that requires no model redesign or retraining.
arXiv Detail & Related papers (2022-11-27T19:50:37Z) - MixKG: Mixing for harder negative samples in knowledge graph [33.4379457065033]
Knowledge graph embedding (KGE) aims to represent entities and relations as low-dimensional vectors for many real-world applications.
We introduce an inexpensive but effective method called MixKG to generate harder negative samples for knowledge graphs; a speculative sketch follows this list.
Experiments on two public datasets and four classical KGE methods show MixKG is superior to previous negative sampling algorithms.
arXiv Detail & Related papers (2022-02-19T13:31:06Z) - SCE: Scalable Network Embedding from Sparsest Cut [20.08464038805681]
Large-scale network embedding learns a latent representation for each node in an unsupervised manner.
A key to the success of such contrastive learning methods is how positive and negative samples are drawn.
In this paper, we propose SCE, an unsupervised network embedding method that uses only negative samples for training.
arXiv Detail & Related papers (2020-06-30T03:18:15Z) - Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z) - Exploring Effects of Random Walk Based Minibatch Selection Policy on
Knowledge Graph Completion [11.484811954887432]
We propose a new random-walk based minibatch sampling technique for training KGC models.
We find that our proposed method achieves state-of-the-art performance on the DB100K dataset.
arXiv Detail & Related papers (2020-04-12T06:16:57Z) - Reinforced Negative Sampling over Knowledge Graph for Recommendation [106.07209348727564]
We develop a new negative sampling model, Knowledge Graph Policy Network (kgPolicy), which works as a reinforcement learning agent to explore high-quality negatives.
kgPolicy navigates from the target positive interaction, adaptively receives knowledge-aware negative signals, and ultimately yields a potential negative item to train the recommender.
arXiv Detail & Related papers (2020-03-12T12:44:30Z) - Top-k Training of GANs: Improving GAN Performance by Throwing Away Bad
Samples [67.11669996924671]
We introduce a simple (one line of code) modification to the Generative Adversarial Network (GAN) training algorithm.
When updating the generator parameters, we zero out the gradient contributions from the elements of the batch that the critic scores as 'least realistic'.
We show that this 'top-k update' procedure is a generally applicable improvement; a minimal sketch follows this list.
arXiv Detail & Related papers (2020-02-14T19:27:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.