Knowledge Base Completion: Baseline strikes back (Again)
- URL: http://arxiv.org/abs/2005.00804v3
- Date: Mon, 25 Jul 2022 13:41:40 GMT
- Title: Knowledge Base Completion: Baseline strikes back (Again)
- Authors: Prachi Jain, Sushant Rathi, Mausam, Soumen Chakrabarti
- Abstract summary: Knowledge Base Completion (KBC) has been a very active area lately.
Recent developments allow us to use all available negative samples for training.
We show that ComplEx, when trained using all available negative samples, gives near state-of-the-art performance on all the datasets.
- Score: 36.52445566431404
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge Base Completion (KBC) has been a very active area lately. Several recent KBC papers propose architectural changes, new training methods, or even new formulations. KBC systems are usually evaluated on standard benchmark datasets: FB15k, FB15k-237, WN18, WN18RR, and Yago3-10. To save computational costs, most existing methods train with only a small number of negative samples for each positive instance in these datasets. This paper discusses how recent developments allow us to use all available negative samples for training. We show that ComplEx, when trained using all available negative samples, gives near state-of-the-art performance on all the datasets. We call this approach COMPLEX-V2. We also highlight how various multiplicative KBC methods recently proposed in the literature benefit from this training regime and become indistinguishable in terms of performance on most datasets. Our work calls for a reassessment of their individual value, in light of these findings.
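To make the training regime concrete, below is a minimal PyTorch sketch of ComplEx scored against the full entity table, so that every non-gold entity acts as a negative sample. This is an illustration of the 1-vs-all idea under stated assumptions, not the authors' COMPLEX-V2 code; the class and function names are invented for this sketch.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ComplEx(nn.Module):
    def __init__(self, n_entities, n_relations, dim):
        super().__init__()
        # Real and imaginary parts of the complex entity/relation embeddings.
        self.ent_re = nn.Embedding(n_entities, dim)
        self.ent_im = nn.Embedding(n_entities, dim)
        self.rel_re = nn.Embedding(n_relations, dim)
        self.rel_im = nn.Embedding(n_relations, dim)

    def score_all(self, s, r):
        """Scores of (s, r, o) for every candidate object o at once."""
        s_re, s_im = self.ent_re(s), self.ent_im(s)  # (batch, dim)
        r_re, r_im = self.rel_re(r), self.rel_im(r)  # (batch, dim)
        # Re(<e_s, w_r, conj(e_o)>) expanded into real-valued parts,
        # scored against the full entity table (n_entities, dim).
        q_re = s_re * r_re - s_im * r_im             # pairs with Re(e_o)
        q_im = s_re * r_im + s_im * r_re             # pairs with Im(e_o)
        return q_re @ self.ent_re.weight.t() + q_im @ self.ent_im.weight.t()

def loss_all_negatives(model, s, r, o):
    # Cross-entropy over *all* entities: every non-gold entity in the
    # vocabulary serves as a negative sample for the query (s, r, ?).
    return F.cross_entropy(model.score_all(s, r), o)
```
Scoring against the whole entity table turns each positive triple into an |E|-way classification problem, which is what makes using all available negatives computationally feasible on modern hardware.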
Related papers
- TabReD: Analyzing Pitfalls and Filling the Gaps in Tabular Deep Learning Benchmarks [30.922069185335246]
We find two common characteristics of tabular data in typical industrial applications that are underrepresented in the datasets usually used for evaluation in the literature.
A considerable portion of datasets in production settings stem from extensive data acquisition and feature engineering pipelines.
This can have an impact on the absolute and relative number of predictive, uninformative, and correlated features compared to academic datasets.
arXiv Detail & Related papers (2024-06-27T17:55:31Z)
- Benchmarking Classical and Learning-Based Multibeam Point Cloud Registration [4.919017078893727]
In the underwater domain, most registration of multibeam echo-sounder (MBES) point cloud data is still performed using classical methods.
In this work, we benchmark the performance of two classical and four learning-based methods.
To the best of our knowledge, this is the first work to benchmark both learning-based and classical registration methods on an AUV-based MBES dataset.
arXiv Detail & Related papers (2024-05-10T07:23:33Z)
- Evaluating Graph Neural Networks for Link Prediction: Current Pitfalls and New Benchmarking [66.83273589348758]
Link prediction attempts to predict whether an unseen edge exists based on only a portion of edges of a graph.
A flurry of methods have been introduced in recent years that attempt to make use of graph neural networks (GNNs) for this task.
New and diverse datasets have also been created to better evaluate the effectiveness of these new models.
arXiv Detail & Related papers (2023-06-18T01:58:59Z)
- DataComp: In search of the next generation of multimodal datasets [179.79323076587255]
DataComp is a testbed for dataset experiments centered around a new candidate pool of 12.8 billion image-text pairs from Common Crawl.
Our benchmark consists of multiple compute scales spanning four orders of magnitude.
In particular, our best baseline, DataComp-1B, enables training a CLIP ViT-L/14 from scratch to 79.2% zero-shot accuracy on ImageNet.
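For context on the zero-shot evaluation mentioned above: a CLIP-style model classifies an image by matching its embedding against embeddings of text prompts for each class, with no task-specific fine-tuning. A minimal sketch, assuming the image and prompt embeddings have already been computed by the two encoders; the function name and arguments are illustrative and not part of DataComp.
```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def zero_shot_classify(image_emb, class_text_emb):
    """image_emb: (N, dim) image features; class_text_emb: (C, dim) features
    of encoded prompts like 'a photo of a {class}'. Returns class ids."""
    image_emb = F.normalize(image_emb, dim=-1)
    class_text_emb = F.normalize(class_text_emb, dim=-1)
    # Predict the class whose prompt embedding is most similar to the image.
    return (image_emb @ class_text_emb.t()).argmax(dim=-1)
```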
arXiv Detail & Related papers (2023-04-27T11:37:18Z)
- Real-Time Evaluation in Online Continual Learning: A New Hope [104.53052316526546]
We evaluate current Continual Learning (CL) methods with respect to their computational costs.
A simple baseline outperforms state-of-the-art CL methods under this evaluation.
Surprisingly, this suggests that most of the existing CL literature is tailored to a specific class of streams that is not practical.
arXiv Detail & Related papers (2023-02-02T12:21:10Z)
- A Novel Dataset for Evaluating and Alleviating Domain Shift for Human Detection in Agricultural Fields [59.035813796601055]
We evaluate the impact of domain shift on human detection models trained on well known object detection datasets when deployed on data outside the distribution of the training set.
We introduce the OpenDR Humans in Field dataset, collected in the context of agricultural robotics applications, using the Robotti platform.
arXiv Detail & Related papers (2022-09-27T07:04:28Z)
- SimKGC: Simple Contrastive Knowledge Graph Completion with Pre-trained Language Models [9.063614185765855]
In this paper, we introduce three types of negatives: in-batch negatives, pre-batch negatives, and self-negatives which act as a simple form of hard negatives.
Our proposed model SimKGC can substantially outperform embedding-based methods on several benchmark datasets.
In terms of mean reciprocal rank (MRR), we advance the state of the art by +19% on WN18RR, +6.8% on the Wikidata5M transductive setting, and +22% on the Wikidata5M inductive setting.
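The three negative types combine naturally as extra columns of one contrastive logit matrix. A simplified sketch, assuming PyTorch and cosine-normalized encoder outputs; the function name, the queue argument, and the temperature value are illustrative, and SimKGC details such as the additive margin are omitted.
```python
import torch
import torch.nn.functional as F

def simkgc_style_loss(q, t, head_emb, prev_queue, tau=0.05):
    """q: encoded (head, relation) queries, t: encoded gold tails, both (B, dim).
    prev_queue: tail embeddings cached from earlier batches (pre-batch negatives).
    head_emb: embeddings of the head entities themselves (self-negatives)."""
    q, t = F.normalize(q, dim=-1), F.normalize(t, dim=-1)
    in_batch = q @ t.t()                                 # (B, B): diagonal = gold
    pre_batch = q @ F.normalize(prev_queue, dim=-1).t()  # (B, Q)
    self_neg = (q * F.normalize(head_emb, dim=-1)).sum(-1, keepdim=True)  # (B, 1)
    logits = torch.cat([in_batch, pre_batch, self_neg], dim=1) / tau
    targets = torch.arange(q.size(0), device=q.device)   # gold tail is column i
    return F.cross_entropy(logits, targets)
```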
arXiv Detail & Related papers (2022-03-04T07:36:30Z)
- Exemplar-free Online Continual Learning [7.800379384628357]
Continual learning aims to learn new tasks from sequentially available data, under the condition that each data point is observed only once by the learner.
Recent works have made remarkable achievements by storing part of learned task data as exemplars for knowledge replay.
We propose a novel exemplar-free method by leveraging a nearest-class-mean (NCM) classifier.
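A nearest-class-mean classifier is compact enough to sketch in full: keep a running mean feature vector per class and predict the class whose mean is closest to the input feature. The sketch below illustrates the general technique, not the authors' exact method; the class name and streaming update are assumptions suited to the one-pass continual setting.
```python
import torch

class NCMClassifier:
    """Nearest-class-mean: one running mean feature per class."""
    def __init__(self, n_classes, dim):
        self.means = torch.zeros(n_classes, dim)
        self.counts = torch.zeros(n_classes)

    def update(self, feats, labels):
        # Streaming mean update: each example is seen exactly once.
        for f, y in zip(feats, labels):
            self.counts[y] += 1
            self.means[y] += (f - self.means[y]) / self.counts[y]

    def predict(self, feats):
        # Assign each feature to the class with the nearest mean.
        return torch.cdist(feats, self.means).argmin(dim=-1)
```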
arXiv Detail & Related papers (2022-02-11T08:03:22Z)
- A Closer Look at Temporal Sentence Grounding in Videos: Datasets and Metrics [70.45937234489044]
We re-organize two widely used TSGV datasets (Charades-STA and ActivityNet Captions) so that the test splits are distributed differently from the training splits.
We introduce a new evaluation metric "dR@$n$,IoU@$m$" to calibrate the basic IoU scores.
All the results demonstrate that the re-organized datasets and new metric can better monitor the progress in TSGV.
arXiv Detail & Related papers (2021-01-22T09:59:30Z)
- BLEURT: Learning Robust Metrics for Text Generation [17.40369189981227]
We propose BLEURT, a learned evaluation metric based on BERT.
A key aspect of our approach is a novel pre-training scheme that uses millions of synthetic examples to help the model generalize.
BLEURT provides state-of-the-art results on the last three years of the WMT Metrics shared task and the WebNLG Competition dataset.
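Independent of BLEURT's exact training details, the basic recipe is a BERT encoder with a scalar regression head over a (reference, candidate) pair. A minimal sketch using the Hugging Face transformers API; the class name and checkpoint are placeholders, and BLEURT's synthetic pre-training signals are not shown.
```python
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class LearnedMetric(nn.Module):
    """BERT encoder plus a scalar regression head, trained to predict a
    quality rating for a (reference, candidate) sentence pair."""
    def __init__(self, name="bert-base-uncased"):
        super().__init__()
        self.tokenizer = AutoTokenizer.from_pretrained(name)
        self.encoder = AutoModel.from_pretrained(name)
        self.head = nn.Linear(self.encoder.config.hidden_size, 1)

    def forward(self, references, candidates):
        batch = self.tokenizer(references, candidates, padding=True,
                               truncation=True, return_tensors="pt")
        out = self.encoder(**batch)
        # Score each pair from its [CLS] token representation.
        return self.head(out.last_hidden_state[:, 0]).squeeze(-1)
```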
arXiv Detail & Related papers (2020-04-09T17:26:52Z)