Efficient, Simple and Automated Negative Sampling for Knowledge Graph
Embedding
- URL: http://arxiv.org/abs/2010.14227v2
- Date: Wed, 14 Jul 2021 01:52:43 GMT
- Title: Efficient, Simple and Automated Negative Sampling for Knowledge Graph
Embedding
- Authors: Yongqi Zhang and Quanming Yao and Lei Chen
- Abstract summary: Negative sampling, which samples negative triplets from the non-observed ones in a knowledge graph (KG), is an essential step in KG embedding.
In this paper, motivated by the observation that negative triplets with large gradients are important but rare, we propose to directly keep track of them with a cache.
Our method acts as a "distilled" version of previous GAN-based methods, which does not waste training time on additional parameters to fit the full distribution of negative triplets.
- Score: 40.97648142355799
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Negative sampling, which samples negative triplets from non-observed ones in
a knowledge graph (KG), is an essential step in KG embedding. Recently,
generative adversarial networks (GANs) have been introduced into negative sampling.
By sampling negative triplets with large gradients, these methods avoid the
problem of vanishing gradient and thus obtain better performance. However, they
make the original model more complex and harder to train. In this paper,
motivated by the observation that negative triplets with large gradients are
important but rare, we propose to directly keep track of them with a cache.
In this way, our method acts as a "distilled" version of previous GAN-based
methods, which does not waste training time on additional parameters to fit the
full distribution of negative triplets. However, how to sample from and update
the cache are two critical questions. We propose to solve these issues by
automated machine learning techniques. The automated version also covers
GAN-based methods as special cases. A theoretical explanation of NSCaching is
also provided, justifying its superiority over fixed sampling schemes. Besides,
we further extend NSCaching with the skip-gram model for graph embedding. Finally,
extensive experiments show that our method can gain significant improvements on
various KG embedding models and the skip-gram model, and outperforms the
state-of-the-art negative sampling methods.
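As a rough illustration of the cache idea in the abstract, the sketch below maintains, for each positive triplet, a small cache of corrupted candidates and periodically refreshes it with uniformly drawn entities, keeping only the highest scorers (high score standing in for large gradient). This is a minimal sketch, not the paper's released code: `NegativeCache`, `score_fn`, `cache_size`, and `n_candidates` are illustrative names, and the uniform in-cache sampling is a stand-in for the automated scheme the paper tunes.

```python
import numpy as np

class NegativeCache:
    """Minimal sketch of cache-based negative sampling (illustrative only)."""

    def __init__(self, n_entities, cache_size=50, n_candidates=200, seed=0):
        self.n_entities = n_entities      # entities available for corruption
        self.cache_size = cache_size      # negatives kept per positive triplet
        self.n_candidates = n_candidates  # fresh uniform candidates per refresh
        self.rng = np.random.default_rng(seed)
        self.cache = {}                   # (h, r, t) -> array of candidate entities

    def update(self, triplet, score_fn):
        """Refresh the cache: mix old entries with uniform candidates and
        keep the highest scorers, which roughly have the largest gradients."""
        old = self.cache.get(triplet, np.empty(0, dtype=np.int64))
        fresh = self.rng.integers(0, self.n_entities, size=self.n_candidates)
        pool = np.unique(np.concatenate([old, fresh]))
        scores = score_fn(triplet, pool)  # model scores of the corrupted triplets
        keep = np.argsort(-scores)[: self.cache_size]
        self.cache[triplet] = pool[keep]

    def sample(self, triplet, score_fn):
        """Draw one negative entity from the freshly updated cache."""
        self.update(triplet, score_fn)
        return self.rng.choice(self.cache[triplet])
```

In training, `score_fn` would evaluate the current KG embedding model on the corrupted triplets, so the cache concentrates on exactly the rare, hard negatives the abstract describes.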
Related papers
- Bridging the Gap: Addressing Discrepancies in Diffusion Model Training
for Classifier-Free Guidance [1.6804613362826175]
Diffusion models have emerged as a pivotal advancement in generative models.
In this paper we aim to underscore a discrepancy between conventional training methods and the desired conditional sampling behavior.
We introduce an updated loss function that better aligns training objectives with sampling behaviors.
arXiv Detail & Related papers (2023-11-02T02:03:12Z) - Gradient Surgery for One-shot Unlearning on Generative Model [0.989293617504294]
We introduce a simple yet effective approach to removing the influence of specific data on a deep generative model.
Inspired by works in multi-task learning, we propose to manipulate gradients to regularize the interplay of influence among samples.
arXiv Detail & Related papers (2023-07-10T13:29:23Z) - Winner-Take-All Column Row Sampling for Memory Efficient Adaptation of Language Model [89.8764435351222]
We propose a new family of unbiased estimators, called WTA-CRS, for matrix multiplication with reduced variance.
Our work provides both theoretical and experimental evidence that, in the context of tuning transformers, our proposed estimators exhibit lower variance compared to existing ones.
arXiv Detail & Related papers (2023-05-24T15:52:08Z) - Post-Processing Temporal Action Detection [134.26292288193298]
Temporal Action Detection (TAD) methods typically apply a pre-processing step that converts an input video of varying length into a fixed-length sequence of snippet representations.
This pre-processing step would temporally downsample the video, reducing the inference resolution and hampering the detection performance in the original temporal resolution.
We introduce a novel model-agnostic post-processing method that requires no model redesign or retraining.
arXiv Detail & Related papers (2022-11-27T19:50:37Z) - MixKG: Mixing for harder negative samples in knowledge graph [33.4379457065033]
Knowledge graph embedding (KGE) aims to represent entities and relations as low-dimensional vectors for many real-world applications.
We introduce an inexpensive but effective method called MixKG to generate harder negative samples for knowledge graphs; a speculative sketch follows this list.
Experiments on two public datasets and four classical KGE methods show MixKG is superior to previous negative sampling algorithms.
arXiv Detail & Related papers (2022-02-19T13:31:06Z) - SCE: Scalable Network Embedding from Sparsest Cut [20.08464038805681]
Large-scale network embedding learns a latent representation for each node in an unsupervised manner.
A key to the success of such contrastive learning methods is how positive and negative samples are drawn.
In this paper, we propose SCE, an unsupervised network embedding method that uses only negative samples for training.
arXiv Detail & Related papers (2020-06-30T03:18:15Z) - Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z) - Exploring Effects of Random Walk Based Minibatch Selection Policy on
Knowledge Graph Completion [11.484811954887432]
We propose a new random-walk based minibatch sampling technique for training KGC models.
We find that our proposed method achieves state-of-the-art performance on the DB100K dataset.
arXiv Detail & Related papers (2020-04-12T06:16:57Z) - Reinforced Negative Sampling over Knowledge Graph for Recommendation [106.07209348727564]
We develop a new negative sampling model, Knowledge Graph Policy Network (kgPolicy), which works as a reinforcement learning agent to explore high-quality negatives.
kgPolicy navigates from the target positive interaction, adaptively receives knowledge-aware negative signals, and ultimately yields a potential negative item to train the recommender.
arXiv Detail & Related papers (2020-03-12T12:44:30Z) - Top-k Training of GANs: Improving GAN Performance by Throwing Away Bad
Samples [67.11669996924671]
We introduce a simple (one line of code) modification to the Generative Adversarial Network (GAN) training algorithm.
When updating the generator parameters, we zero out the gradient contributions from the elements of the batch that the critic scores as 'least realistic'.
We show that this 'top-k update' procedure is a generally applicable improvement; a minimal sketch follows this list.
arXiv Detail & Related papers (2020-02-14T19:27:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.