Are Missing Links Predictable? An Inferential Benchmark for Knowledge
Graph Completion
- URL: http://arxiv.org/abs/2108.01387v1
- Date: Tue, 3 Aug 2021 09:51:15 GMT
- Title: Are Missing Links Predictable? An Inferential Benchmark for Knowledge
Graph Completion
- Authors: Yixin Cao, Kuang Jun, Ming Gao, Aoying Zhou, Yonggang Wen and Tat-Seng
Chua
- Abstract summary: InferWiki improves upon existing benchmarks in inferential ability, assumptions, and patterns.
Each test sample is predictable from supporting data in the training set.
In experiments, we curate two settings of InferWiki varying in size and structure, and apply the construction process to CoDEx as a comparative dataset.
- Score: 79.07695173192472
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We present InferWiki, a Knowledge Graph Completion (KGC) dataset that
improves upon existing benchmarks in inferential ability, assumptions, and
patterns. First, each test sample is predictable from supporting data in the
training set; to ensure this, we propose rule-guided train/test generation in
place of the conventional random split.
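As a rough illustration of rule-guided splitting, the sketch below admits a triple into the test set only if a chain rule can derive it from triples that remain in training. It is a heavy simplification under stated assumptions: the function name, the two-hop rule format, and the toy data are hypothetical, not InferWiki's actual pipeline, which also verifies that the supporting body triples stay in the training split.

```python
from collections import defaultdict

def rule_guided_split(triples, rules):
    """Assign a triple to the test set only when a rule grounding in the
    rest of the graph supports it, so every test case is predictable.
    Simplified sketch, not InferWiki's actual construction code."""
    pairs = defaultdict(set)  # relation -> set of (head, tail)
    for h, r, t in triples:
        pairs[r].add((h, t))

    def supported(h, r, t):
        # rules are 2-hop chain rules: ((r1, r2), head) meaning
        # r1(h, x) AND r2(x, t) => head(h, t)
        for (r1, r2), head in rules:
            if head != r:
                continue
            mids = {y for (x, y) in pairs[r1] if x == h}
            if any((m, t) in pairs[r2] for m in mids):
                return True
        return False

    test = [tr for tr in triples if supported(*tr)]
    in_test = set(test)
    train = [tr for tr in triples if tr not in in_test]
    return train, test

# Hypothetical toy graph: nationality is derivable via born_in + city_of.
triples = [("alice", "born_in", "paris"),
           ("paris", "city_of", "france"),
           ("alice", "nationality", "france")]
rules = [(("born_in", "city_of"), "nationality")]
train, test = rule_guided_split(triples, rules)
# test == [("alice", "nationality", "france")]; its support stays in train
```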
Second, InferWiki initiates evaluation under the open-world assumption and
raises the inferential difficulty of the closed-world assumption by providing
manually annotated negative and unknown triples. Third, we include varied
inference patterns (e.g., reasoning path lengths and types) for comprehensive
evaluation.
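The open-world setting effectively turns link prediction into three-way triple classification. A minimal sketch of such an evaluation, assuming a scalar plausibility score per triple and hypothetical thresholds (the paper's exact protocol may differ):

```python
def classify(score, pos_thresh=0.7, neg_thresh=0.3):
    """Open-world three-way decision: confidently high scores are true,
    confidently low scores are false, and the middle band is unknown.
    Under the closed-world assumption the middle band disappears and
    everything below pos_thresh would count as false."""
    if score >= pos_thresh:
        return "true"
    if score <= neg_thresh:
        return "false"
    return "unknown"

def accuracy(scored, **thresholds):
    """scored: iterable of (model_score, gold_label) pairs, where the gold
    label in {"true", "false", "unknown"} comes from manual annotation."""
    scored = list(scored)
    hits = sum(classify(s, **thresholds) == gold for s, gold in scored)
    return hits / len(scored)
```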
In experiments, we curate two settings of InferWiki that vary in size and
structure, and apply the same construction process to CoDEx as a comparative
dataset. The results and empirical analyses demonstrate the necessity and high
quality of InferWiki. Nevertheless, the performance gap across inferential
assumptions and patterns shows the difficulty of the task and suggests
directions for future research. Our datasets are available at
https://github.com/TaoMiner/inferwiki
Related papers
- Unified Pretraining for Recommendation via Task Hypergraphs [55.98773629788986]
We propose a novel multitask pretraining framework named Unified Pretraining for Recommendation via Task Hypergraphs.
To handle the diverse requirements and nuances of various pretext tasks with a unified learning pattern, we design task hypergraphs that generalize pretext tasks to hyperedge prediction.
A novel transitional attention layer is devised to discriminatively learn the relevance between each pretext task and recommendation.
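As a loose, single-head reading of what such a relevance-scoring layer could look like (the names, shapes, and scaled dot-product form are assumptions, not the paper's actual architecture):

```python
import torch

def transitional_attention(task_emb, rec_emb):
    """Score each pretext-task representation against the recommendation
    representation and return a task mixture weighted by that relevance.
    task_emb: [T, d] (one row per pretext task); rec_emb: [B, d]."""
    scores = rec_emb @ task_emb.T / task_emb.shape[-1] ** 0.5  # [B, T]
    weights = torch.softmax(scores, dim=-1)                    # task relevance
    return weights @ task_emb                                  # [B, d]
```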
arXiv Detail & Related papers (2023-10-20T05:33:21Z)
- Preserving Knowledge Invariance: Rethinking Robustness Evaluation of Open Information Extraction [50.62245481416744]
We present the first benchmark that simulates the evaluation of open information extraction models in the real world.
We design and annotate a large-scale testbed in which each example is a knowledge-invariant clique.
We further elaborate a robustness metric: a model is judged robust only if it is consistently accurate across all examples within each clique.
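One natural instantiation of "consistently accurate" is to credit a clique only when every knowledge-invariant variant in it is answered correctly; the paper's exact metric may differ from this sketch.

```python
def clique_robust_accuracy(cliques):
    """cliques: list of per-clique boolean lists, each flag marking whether
    the model was correct on that paraphrased variant. A clique counts
    only if all of its variants are correct."""
    return sum(all(flags) for flags in cliques) / len(cliques)

# Hypothetical: 2 of 3 cliques are answered consistently.
assert clique_robust_accuracy([[True, True], [True, False], [True]]) == 2 / 3
```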
arXiv Detail & Related papers (2023-05-23T12:05:09Z)
- Okapi: Generalising Better by Making Statistical Matches Match [7.392460712829188]
Okapi is a simple, efficient, and general method for robust semi-supervised learning based on online statistical matching.
Our method uses a nearest-neighbours-based matching procedure to generate cross-domain views for a consistency loss.
We show that it is in fact possible to leverage additional unlabelled data to improve upon empirical risk minimisation.
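A minimal sketch of the matching-based consistency idea, assuming a feature bank collected from other domains; the function names and the KL form of the loss are illustrative, not Okapi's actual objective:

```python
import torch
import torch.nn.functional as F

def knn_consistency_loss(feats, logits, bank_feats, bank_logits):
    """Match each example to its nearest cross-domain neighbour by cosine
    similarity, then penalise disagreement between the two predictions.
    feats/logits: current batch; bank_feats/bank_logits: other domains."""
    sims = F.normalize(feats, dim=1) @ F.normalize(bank_feats, dim=1).T
    nn_idx = sims.argmax(dim=1)                        # nearest match per row
    targets = bank_logits[nn_idx].softmax(dim=1).detach()
    return F.kl_div(logits.log_softmax(dim=1), targets, reduction="batchmean")
```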
arXiv Detail & Related papers (2022-11-07T12:41:17Z)
- A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach [53.727460222955266]
Temporal Sentence Grounding in Videos (TSGV) aims to ground a natural language sentence in an untrimmed video.
Recent studies have found that current benchmark datasets may have obvious moment annotation biases.
We introduce a new evaluation metric, "dR@n,IoU@m", which discounts the basic recall scores to alleviate the inflated evaluation caused by biased datasets.
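A hedged sketch of what a discounted hit could look like: keep the IoU@m gate, but scale a hit by how closely the predicted boundaries track the ground truth. The paper's precise discount may differ.

```python
def discounted_hit(pred, gold, duration, iou_thresh):
    """pred/gold: (start, end) moments in seconds; duration: video length.
    Returns 0 if the IoU gate fails, otherwise a hit in (0, 1] that
    shrinks as the predicted boundaries drift from the ground truth."""
    (ps, pe), (gs, ge) = pred, gold
    inter = max(0.0, min(pe, ge) - max(ps, gs))
    union = max(pe, ge) - min(ps, gs)
    if union <= 0 or inter / union < iou_thresh:
        return 0.0
    alpha_s = 1.0 - abs(ps - gs) / duration   # start-boundary discount
    alpha_e = 1.0 - abs(pe - ge) / duration   # end-boundary discount
    return alpha_s * alpha_e
```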
arXiv Detail & Related papers (2022-03-10T08:58:18Z)
- NOTE: Solution for KDD-CUP 2021 WikiKG90M-LSC [3.0716126507403545]
Recent representation learning methods have achieved great success on standard datasets like FB15k-237.
We train advanced algorithms, including OTE, QuatE, RotatE, and TransE, in different domains to learn the triples.
In addition to the representations, we also use various statistical probabilities among the head entities, the relations and the tail entities for the final prediction.
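Two of the named scorers have standard, well-known forms; a brief sketch of TransE and RotatE scoring, with the ensemble and statistical features of the NOTE solution left out:

```python
import torch

def transe_score(h, r, t, p=1):
    """TransE treats a relation as a translation: score = -||h + r - t||_p."""
    return -torch.norm(h + r - t, p=p, dim=-1)

def rotate_score(h, r_phase, t):
    """RotatE treats a relation as a rotation in complex space: each
    relation coordinate is e^{i*phase}, so its modulus is 1.
    h, t: real tensors of shape [..., d, 2] viewed as complex; r_phase: [..., d]."""
    hc = torch.view_as_complex(h.contiguous())
    tc = torch.view_as_complex(t.contiguous())
    rc = torch.polar(torch.ones_like(r_phase), r_phase)   # e^{i * phase}
    return -torch.abs(hc * rc - tc).sum(dim=-1)
```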
arXiv Detail & Related papers (2021-07-05T09:30:24Z)
- Out-of-Vocabulary Entities in Link Prediction [1.9036571490366496]
Link prediction is often used as a proxy to evaluate the quality of embeddings.
As benchmarks are crucial for the fair comparison of algorithms, ensuring their quality is tantamount to providing a solid ground for developing better solutions.
We provide an implementation of an approach for spotting and removing such entities and provide corrected versions of the datasets WN18RR, FB15K-237, and YAGO3-10.
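The described cleanup is straightforward to sketch: drop any evaluation triple whose head or tail entity never occurs in training. The authors' released tooling may differ in its details.

```python
def remove_oov_triples(train, valid, test):
    """Out-of-vocabulary entities have no trained embedding, so triples
    mentioning them cannot be scored meaningfully; filter them out of the
    evaluation splits. Each split is a list of (head, relation, tail)."""
    seen = {e for h, _, t in train for e in (h, t)}
    keep = lambda triple: triple[0] in seen and triple[2] in seen
    return train, [x for x in valid if keep(x)], [x for x in test if keep(x)]
```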
arXiv Detail & Related papers (2021-05-26T12:58:18Z)
- CoDEx: A Comprehensive Knowledge Graph Completion Benchmark [16.454849794911084]
CoDEx is a set of knowledge graph completion datasets extracted from Wikidata and Wikipedia.
CoDEx comprises three knowledge graphs varying in size and structure, multilingual descriptions of entities and relations, and tens of thousands of hard negative triples.
arXiv Detail & Related papers (2020-09-16T17:08:23Z)
- Evaluating Models' Local Decision Boundaries via Contrast Sets [119.38387782979474]
We propose a new annotation paradigm for NLP that helps to close systematic gaps in the test data.
We demonstrate the efficacy of contrast sets by creating them for 10 diverse NLP datasets.
Although our contrast sets are not explicitly adversarial, model performance is significantly lower on them than on the original test sets.
arXiv Detail & Related papers (2020-04-06T14:47:18Z)
- Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
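A compact sketch of confidence-weighted prototype refinement; here the confidence matrix is simply an input, whereas the paper meta-learns it, and the mixing coefficient and shapes are assumptions:

```python
import torch

def refine_prototypes(prototypes, query_feats, confidence, mix=0.5):
    """Fold unlabelled query features back into each class prototype,
    weighted by per-query, per-class confidence.
    prototypes: [C, d]; query_feats: [Q, d]; confidence: [Q, C]."""
    w = confidence / confidence.sum(dim=0, keepdim=True).clamp_min(1e-8)
    query_means = w.T @ query_feats            # [C, d] weighted query means
    return (1 - mix) * prototypes + mix * query_means
```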
arXiv Detail & Related papers (2020-02-27T10:22:17Z)