Exploring Effects of Random Walk Based Minibatch Selection Policy on
Knowledge Graph Completion
- URL: http://arxiv.org/abs/2004.05553v1
- Date: Sun, 12 Apr 2020 06:16:57 GMT
- Title: Exploring Effects of Random Walk Based Minibatch Selection Policy on
Knowledge Graph Completion
- Authors: Bishal Santra, Prakhar Sharma, Sumegh Roychowdhury, Pawan Goyal
- Abstract summary: We propose a new random-walk based minibatch sampling technique for training KGC models.
We find that our proposed method achieves state-of-the-art performance on the DB100K dataset.
- Score: 11.484811954887432
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we explore the effects of different minibatch sampling
techniques on Knowledge Graph Completion. Knowledge Graph Completion (KGC), or
Link Prediction, is the task of predicting missing facts in a knowledge graph.
KGC models are usually trained with a margin, soft-margin, or cross-entropy loss
function that promotes assigning a higher score or probability to true fact
triplets. Minibatch gradient descent is used to optimize these loss functions
when training KGC models. However, because each minibatch consists of only a few
triplets sampled at random from a large knowledge graph, most entities that occur
in a minibatch occur only once. As a result, these loss functions ignore all
other neighbors of an entity whose embedding is being updated at a given
minibatch step. In this paper, we propose a new random-walk based minibatch
sampling technique for training KGC models that optimizes the loss over a
minibatch forming a closely connected subgraph of triplets rather than over
randomly selected ones. We report experiments with this sampling technique
across different models and datasets and find that its effect varies by dataset
and model. In particular, our proposed method achieves state-of-the-art
performance on the DB100K dataset.
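The core idea can be illustrated with a short sketch. The Python snippet below is an illustration under assumptions, not the authors' released code: the helper names (build_adjacency, random_walk_minibatch) and the restart probability are made up here. It grows a minibatch by a random walk over the knowledge graph so that the sampled (head, relation, tail) triplets form a closely connected subgraph; the resulting batch would then be fed to the usual margin, soft-margin, or cross-entropy loss.

# A minimal sketch of random-walk based minibatch selection for KGC training.
# NOT the authors' implementation; names and the restart probability are
# illustrative assumptions.
import random
from collections import defaultdict

def build_adjacency(triplets):
    """Map each entity to the list of triplets it participates in."""
    adj = defaultdict(list)
    for h, r, t in triplets:
        adj[h].append((h, r, t))
        adj[t].append((h, r, t))
    return adj

def random_walk_minibatch(triplets, adj, batch_size, restart_prob=0.1):
    """Collect up to `batch_size` distinct triplets forming a connected subgraph.

    Starting from a random triplet, repeatedly hop to another triplet that
    shares an entity with the current one; with probability `restart_prob`,
    jump to a fresh random triplet so the walk does not get stuck.
    """
    batch_size = min(batch_size, len(set(triplets)))
    batch = set()
    current = random.choice(triplets)
    batch.add(current)
    while len(batch) < batch_size:
        if random.random() < restart_prob:
            current = random.choice(triplets)
        else:
            h, r, t = current
            shared_entity = random.choice((h, t))
            current = random.choice(adj[shared_entity])
        batch.add(current)
    return list(batch)

# Toy usage: four facts about capitals and continents.
kg = [("paris", "capital_of", "france"),
      ("france", "in_continent", "europe"),
      ("berlin", "capital_of", "germany"),
      ("germany", "in_continent", "europe")]
minibatch = random_walk_minibatch(kg, build_adjacency(kg), batch_size=3)

By contrast, the standard policy criticized in the abstract amounts to random.sample(triplets, batch_size), under which most entities appear only once per batch and their other neighbors are ignored during that update.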
Related papers
- Transductive Zero-Shot and Few-Shot CLIP [24.592841797020203]
This paper addresses the transductive zero-shot and few-shot CLIP classification challenge.
Inference is performed jointly across a mini-batch of unlabeled query samples, rather than treating each instance independently.
Our approach yields nearly a 20% improvement in ImageNet accuracy over CLIP's zero-shot performance.
arXiv Detail & Related papers (2024-04-08T12:44:31Z)
- Evaluating Graph Neural Networks for Link Prediction: Current Pitfalls and New Benchmarking [66.83273589348758]
Link prediction attempts to predict whether an unseen edge exists based on only a portion of the edges of a graph.
A flurry of methods have been introduced in recent years that attempt to make use of graph neural networks (GNNs) for this task.
New and diverse datasets have also been created to better evaluate the effectiveness of these new models.
arXiv Detail & Related papers (2023-06-18T01:58:59Z)
- BatchSampler: Sampling Mini-Batches for Contrastive Learning in Vision, Language, and Graphs [37.378865860897285]
In-batch contrastive learning is a state-of-the-art self-supervised method that brings semantically similar instances close together.
Recent studies aim to improve performance by sampling hard negatives within the current mini-batch.
We present BatchSampler to sample mini-batches of hard-to-distinguish (i.e., hard and true negatives to each other) instances.
arXiv Detail & Related papers (2023-06-06T02:13:27Z)
- Training trajectories, mini-batch losses and the curious role of the learning rate [13.848916053916618]
We show that stochastic gradient descent plays a fundamental role in nearly all applications of deep learning.
We propose a simple model and a geometric interpretation that allow us to analyze the relationship between the gradients of mini-batches and the full batch.
In particular, a very low loss value can be reached in just one step of descent with a large enough learning rate.
arXiv Detail & Related papers (2023-01-05T21:58:46Z)
- Learning Compact Features via In-Training Representation Alignment [19.273120635948363]
In each epoch, the true gradient of the loss function is estimated using a mini-batch sampled from the training set.
We propose In-Training Representation Alignment (ITRA), which explicitly aligns the feature distributions of two different mini-batches with a matching loss.
We also provide a rigorous analysis of the desirable effects of the matching loss on feature representation learning.
arXiv Detail & Related papers (2022-11-23T22:23:22Z)
- ScoreMix: A Scalable Augmentation Strategy for Training GANs with Limited Data [93.06336507035486]
Generative Adversarial Networks (GANs) typically suffer from overfitting when limited training data is available.
We present ScoreMix, a novel and scalable data augmentation approach for various image synthesis tasks.
arXiv Detail & Related papers (2022-10-27T02:55:15Z)
- From Spectral Graph Convolutions to Large Scale Graph Convolutional Networks [0.0]
Graph Convolutional Networks (GCNs) have been shown to be a powerful concept that has been successfully applied to a large variety of tasks.
We study the theory that paved the way to the definition of GCNs, including related parts of classical graph theory.
arXiv Detail & Related papers (2022-07-12T16:57:08Z)
- Few-Shot Non-Parametric Learning with Deep Latent Variable Model [50.746273235463754]
We propose Non-Parametric learning by Compression with Latent Variables (NPC-LV).
NPC-LV is a learning framework for any dataset with abundant unlabeled data but very few labeled examples.
We show that NPC-LV outperforms supervised methods on all three datasets for image classification in the low-data regime.
arXiv Detail & Related papers (2022-06-23T09:35:03Z)
- Saliency Grafting: Innocuous Attribution-Guided Mixup with Calibrated Label Mixing [104.630875328668]
The Mixup scheme suggests mixing a pair of samples to create an augmented training sample.
We present a novel, yet simple, Mixup variant that captures the best of both worlds.
arXiv Detail & Related papers (2021-12-16T11:27:48Z)
- Jo-SRC: A Contrastive Approach for Combating Noisy Labels [58.867237220886885]
We propose a noise-robust approach named Jo-SRC (Joint Sample Selection and Model Regularization based on Consistency).
Specifically, we train the network in a contrastive learning manner. Predictions from two different views of each sample are used to estimate its "likelihood" of being clean or out-of-distribution.
arXiv Detail & Related papers (2021-03-24T07:26:07Z)
- Attentional-Biased Stochastic Gradient Descent [74.49926199036481]
We present a provable method (named ABSGD) for addressing the data imbalance or label noise problem in deep learning.
Our method is a simple modification to momentum SGD where we assign an individual importance weight to each sample in the mini-batch.
ABSGD is flexible enough to combine with other robust losses without any additional cost.
arXiv Detail & Related papers (2020-12-13T03:41:52Z)