NOTE: Solution for KDD-CUP 2021 WikiKG90M-LSC
- URL: http://arxiv.org/abs/2107.01892v1
- Date: Mon, 5 Jul 2021 09:30:24 GMT
- Title: NOTE: Solution for KDD-CUP 2021 WikiKG90M-LSC
- Authors: Weiyue Su, Zeyang Fang, Hui Zhong, Huijuan Wang, Siming Dai, Zhengjie
Huang, Yunsheng Shi, Shikun Feng, Zeyu Chen
- Abstract summary: Recent representation learning methods have achieved great success on standard datasets like FB15k-237.
We train advanced algorithms from different domains to learn the triplets, including OTE, QuatE, RotatE, and TransE.
In addition to the representations, we also use various statistical probabilities among the head entities, the relations, and the tail entities for the final prediction.
- Score: 3.0716126507403545
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: WikiKG90M in KDD Cup 2021 is a large encyclopedic knowledge graph
that could benefit various downstream applications such as question answering
and recommender systems. Participants are invited to complete the knowledge
graph by predicting missing triplets. Recent representation learning methods
have achieved great success on standard datasets like FB15k-237. Thus, we
train advanced algorithms from different domains to learn the triplets,
including OTE, QuatE, RotatE, and TransE. Notably, we modify OTE into NOTE
(short for Norm-OTE) for better performance. Besides, we use both DeepWalk and
a post-smoothing technique to capture the graph structure as a supplement. In
addition to the representations, we also use various statistical probabilities
among the head entities, the relations, and the tail entities for the final
prediction. Experimental results show that an ensemble of state-of-the-art
representation learning methods can draw on each other's strengths. We also
develop feature engineering from the validation candidates for further
improvement. Note that we apply the same strategy to the test set for final
inference; these features may not be practical in the real world, where
ranking is performed against all entities.
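The abstract names four knowledge-graph-embedding models (OTE, QuatE, RotatE, TransE) without reproducing their formulas. For orientation only, here is a minimal NumPy sketch of two of the standard published scoring functions, TransE and RotatE; it is not the authors' competition code, and the toy dimension is an arbitrary assumption.

```python
# Minimal sketch of two standard KGE scoring functions named in the
# abstract (TransE, RotatE), in their original published forms; this is
# NOT the authors' competition code, and the tiny dimension is arbitrary.
import numpy as np

def transe_score(h, r, t):
    """TransE: score = -||h + r - t||; higher means more plausible."""
    return -np.linalg.norm(h + r - t)

def rotate_score(h, r_phase, t):
    """RotatE: relations act as element-wise rotations in complex space,
    score = -||h * r - t|| with |r_i| = 1 (relation given as phases)."""
    r = np.exp(1j * r_phase)              # unit-modulus complex rotation
    return -np.linalg.norm(h * r - t)

rng = np.random.default_rng(0)
dim = 8                                   # toy embedding dimension
print(transe_score(rng.standard_normal(dim), rng.standard_normal(dim),
                   rng.standard_normal(dim)))
h = rng.standard_normal(dim) + 1j * rng.standard_normal(dim)
t = rng.standard_normal(dim) + 1j * rng.standard_normal(dim)
print(rotate_score(h, rng.uniform(0.0, 2 * np.pi, dim), t))
```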
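The abstract also mentions DeepWalk for capturing graph structure. Below is a hedged sketch of the generic DeepWalk recipe (uniform random walks fed to a skip-gram model, here gensim's Word2Vec); the toy graph, walk lengths, and hyperparameters are illustrative assumptions, not the competition settings, and the post-smoothing step is omitted.

```python
# Hedged sketch of the generic DeepWalk recipe: uniform random walks on
# a toy graph, fed to gensim's skip-gram Word2Vec. Hyperparameters are
# illustrative assumptions, not the authors' competition settings.
import random
from gensim.models import Word2Vec

def random_walks(adj, num_walks=10, walk_len=8, seed=0):
    """Generate num_walks uniform random walks starting from each node."""
    rng = random.Random(seed)
    walks = []
    for _ in range(num_walks):
        for start in adj:
            walk = [start]
            while len(walk) < walk_len and adj[walk[-1]]:
                walk.append(rng.choice(adj[walk[-1]]))
            walks.append([str(n) for n in walk])  # Word2Vec wants tokens
    return walks

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}  # toy undirected graph
model = Word2Vec(random_walks(adj), vector_size=16, window=3,
                 min_count=0, sg=1, epochs=5, seed=0, workers=1)
print(model.wv["0"][:4])   # first few dimensions of node 0's embedding
```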
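For the "statistical probabilities among the head entities, the relations, and the tail entities", the abstract gives no exact feature list. The sketch below shows one plausible feature of that kind, a smoothed estimate of P(tail | relation) counted from training triplets; the feature choice and smoothing constant are assumptions for illustration.

```python
# Hedged sketch of one plausible statistical feature of the kind the
# abstract describes: a smoothed P(tail | relation) estimated by counting
# training triplets. The exact feature set is an assumption, not the
# authors' published one.
from collections import Counter

def build_tail_given_relation(triplets, alpha=1e-6):
    """Return a function estimating P(tail | relation) from counts."""
    pair_counts = Counter((r, t) for _, r, t in triplets)
    rel_counts = Counter(r for _, r, _ in triplets)

    def prob(r, t):
        # tiny additive smoothing so unseen (r, t) pairs are not exactly 0
        return (pair_counts[(r, t)] + alpha) / (rel_counts[r] + alpha)

    return prob

train = [(0, 5, 1), (2, 5, 1), (3, 5, 4)]   # (head, relation, tail) ids
p = build_tail_given_relation(train)
print(p(5, 1))   # ~0.667: tail 1 follows relation 5 in 2 of 3 triplets
```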
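Finally, the claim that the ensemble lets the models "draw on each other's strengths" implies combining per-candidate scores from the individual models. One common recipe, sketched here under assumed equal weights rather than the authors' tuned combination, is to min-max normalise each model's scores and average them.

```python
# Hedged sketch of a common score-level ensemble: per-model min-max
# normalisation followed by a weighted average. Equal weights are an
# assumption; the authors' tuned combination is not published here.
import numpy as np

def ensemble(score_lists, weights=None):
    """score_lists: [n_models, n_candidates] raw scores per model."""
    s = np.asarray(score_lists, dtype=float)
    lo = s.min(axis=1, keepdims=True)
    hi = s.max(axis=1, keepdims=True)
    s = (s - lo) / np.maximum(hi - lo, 1e-12)   # per-model min-max norm
    w = np.ones(len(s)) if weights is None else np.asarray(weights)
    return np.average(s, axis=0, weights=w)

# Two models score three candidate tails; candidate 0 ranks first overall.
print(ensemble([[9.0, 2.0, 5.0],
                [0.8, 0.1, 0.3]]))   # -> [1.0, 0.0, ~0.357]
```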
Related papers
- SPOT: Scalable 3D Pre-training via Occupancy Prediction for Learning Transferable 3D Representations [76.45009891152178]
The pretraining-finetuning approach can alleviate the labeling burden by fine-tuning a pre-trained backbone on various downstream datasets and tasks.
We show, for the first time, that general representation learning can be achieved through the task of occupancy prediction.
Our findings will facilitate the understanding of LiDAR points and pave the way for future advancements in LiDAR pre-training.
arXiv Detail & Related papers (2023-09-19T11:13:01Z)
- SimTeG: A Frustratingly Simple Approach Improves Textual Graph Learning [131.04781590452308]
We present SimTeG, a frustratingly simple approach for textual graph learning.
We first perform supervised parameter-efficient fine-tuning (PEFT) on a pre-trained LM for the downstream task.
We then generate node embeddings using the last hidden states of the fine-tuned LM.
arXiv Detail & Related papers (2023-08-03T07:00:04Z)
- A Simple and Scalable Graph Neural Network for Large Directed Graphs [11.792826520370774]
We investigate various combinations of node representations and edge direction awareness within an input graph.
In response, we propose a simple yet holistic classification method, A2DUG.
We demonstrate that A2DUG performs stably well on various datasets and improves accuracy by up to 11.29 compared with state-of-the-art methods.
arXiv Detail & Related papers (2023-06-14T06:24:58Z)
- Improving Few-Shot Generalization by Exploring and Exploiting Auxiliary Data [100.33096338195723]
We focus on Few-shot Learning with Auxiliary Data (FLAD).
FLAD assumes access to auxiliary data during few-shot learning in the hope of improving generalization.
We propose two algorithms, EXP3-FLAD and UCB1-FLAD, and compare them with prior FLAD methods that either explore or exploit.
arXiv Detail & Related papers (2023-02-01T18:59:36Z)
- Using Graph Algorithms to Pretrain Graph Completion Transformers [8.327657957422833]
Self-supervised pretraining can enhance performance on downstream graph, link, and node classification tasks.
We investigate five different pretraining signals, constructed using several graph algorithms and no external data, as well as their combination.
We propose a new path-finding algorithm guided by information gain and find that it is the best-performing pretraining task.
arXiv Detail & Related papers (2022-10-14T01:41:10Z)
- Are Missing Links Predictable? An Inferential Benchmark for Knowledge Graph Completion [79.07695173192472]
InferWiki improves upon existing benchmarks in inferential ability, assumptions, and patterns.
Each test sample is predictable from supporting data in the training set.
In experiments, we curate two settings of InferWiki that vary in size and structure, and apply the construction process to CoDEx to build comparative datasets.
arXiv Detail & Related papers (2021-08-03T09:51:15Z)
- Graph Convolution for Re-ranking in Person Re-identification [40.9727538382413]
We propose a graph-based re-ranking method to improve learned features while still keeping Euclidean distance as the similarity metric.
A simple yet effective method is proposed to generate a profile vector for each tracklet in videos, which helps extend our method to video re-ID.
arXiv Detail & Related papers (2021-07-05T18:40:43Z)
- GRAD-MATCH: A Gradient Matching Based Data Subset Selection for Efficient Learning [23.75284126177203]
We propose a general framework, GRAD-MATCH, which finds subsets that closely match the gradient of the training or validation set.
We show that GRAD-MATCH significantly and consistently outperforms several recent data-selection algorithms.
arXiv Detail & Related papers (2021-02-27T04:09:32Z)
- Learning Reasoning Strategies in End-to-End Differentiable Proving [50.9791149533921]
Conditional Theorem Provers learn an optimal rule-selection strategy via gradient-based optimisation.
We show that Conditional Theorem Provers are scalable and yield state-of-the-art results on the CLUTRR dataset.
arXiv Detail & Related papers (2020-07-13T16:22:14Z)
- Heuristic Semi-Supervised Learning for Graph Generation Inspired by Electoral College [80.67842220664231]
We propose a novel pre-processing technique, namely ELectoral COllege (ELCO), which automatically expands new nodes and edges to refine the label similarity within a dense subgraph.
In all tested setups, our method boosts the average score of base models by a large margin of 4.7 points and consistently outperforms the state of the art.
arXiv Detail & Related papers (2020-06-10T14:48:48Z)