Bringing Light Into the Dark: A Large-scale Evaluation of Knowledge
Graph Embedding Models Under a Unified Framework
- URL: http://arxiv.org/abs/2006.13365v5
- Date: Mon, 1 Nov 2021 16:26:39 GMT
- Title: Bringing Light Into the Dark: A Large-scale Evaluation of Knowledge
Graph Embedding Models Under a Unified Framework
- Authors: Mehdi Ali, Max Berrendorf, Charles Tapley Hoyt, Laurent Vermue,
Mikhail Galkin, Sahand Sharifzadeh, Asja Fischer, Volker Tresp, Jens Lehmann
- Abstract summary: We re-implemented and evaluated 21 interaction models in the PyKEEN software package.
We performed large-scale benchmarking on four datasets, comprising several thousand experiments and 24,804 GPU hours of computation time.
- Score: 31.35912529064612
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The heterogeneity in recently published knowledge graph embedding models'
implementations, training, and evaluation has made fair and thorough
comparisons difficult. In order to assess the reproducibility of previously
published results, we re-implemented and evaluated 21 interaction models in the
PyKEEN software package. Here, we outline which results could be reproduced
with their reported hyper-parameters, which could only be reproduced with
alternate hyper-parameters, and which could not be reproduced at all, and we
provide insight into why this might be the case.
We then performed large-scale benchmarking on four datasets, comprising
several thousand experiments and 24,804 GPU hours of computation time. We present
the insights gained regarding best practices, the best configurations for each
model, and where previously published best configurations can be improved.
Our results highlight that the combination of model architecture, training
approach, loss function, and the explicit modeling of inverse relations is
crucial for a model's performance, which is not determined by the architecture
alone. We provide evidence that several architectures can achieve results
competitive with the state of the art when configured carefully. We have made all
code, experimental configurations, results, and analyses that lead to our
interpretations available at https://github.com/pykeen/pykeen and
https://github.com/pykeen/benchmarking
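As an illustration of that finding, the sketch below configures each of the four ingredients the abstract names (interaction model, training approach, loss function, and explicit inverse relations) through PyKEEN's pipeline API. The specific dataset, model, and hyper-parameter values are illustrative placeholders, not the paper's reported best configurations, and metric-name spellings may vary across PyKEEN versions.

```python
# Minimal sketch, assuming a recent PyKEEN release: it varies the four
# ingredients the paper identifies as jointly determining performance.
# Values are illustrative, not the benchmarked best configurations.
from pykeen.pipeline import pipeline

result = pipeline(
    dataset="FB15k237",              # one of the four benchmark datasets
    dataset_kwargs=dict(
        create_inverse_triples=True, # explicit modeling of inverse relations
    ),
    model="DistMult",                # interaction model (architecture)
    loss="nssa",                     # loss function (self-adversarial sampling)
    training_loop="sLCWA",           # training approach (alternative: "LCWA")
    negative_sampler="basic",        # only used by the sLCWA training loop
    training_kwargs=dict(num_epochs=100, batch_size=512),
    random_seed=42,
)

print(result.metric_results.get_metric("hits@10"))
result.save_to_directory("distmult_fb15k237")  # persist model, metrics, config
```

Changing any one of model, loss, training_loop, or create_inverse_triples moves the configuration along exactly the dimensions the benchmarking study varies; PyKEEN's hpo_pipeline automates that search.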
Related papers
- A Collaborative Ensemble Framework for CTR Prediction [73.59868761656317]
We propose a novel framework, Collaborative Ensemble Training Network (CETNet), to leverage multiple distinct models.
Unlike naive model scaling, our approach emphasizes diversity and collaboration through collaborative learning.
We validate our framework on three public datasets and a large-scale industrial dataset from Meta.
arXiv Detail & Related papers (2024-11-20T20:38:56Z)
- Ensemble architecture in polyp segmentation [0.0]
This study explores semantic segmentation architectures and evaluates models that excel in polyp segmentation.
We present an integrated framework that harnesses the advantages of different models to attain an optimal outcome.
arXiv Detail & Related papers (2024-08-14T02:57:38Z)
- FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects [55.77542145604758]
FoundationPose is a unified foundation model for 6D object pose estimation and tracking.
Our approach can be applied instantly at test time to a novel object without fine-tuning.
arXiv Detail & Related papers (2023-12-13T18:28:09Z)
- The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute [66.84421705029624]
We introduce an experimental protocol that enables model comparisons based on equivalent compute, measured in accelerator hours.
We pre-process an existing large, diverse, and high-quality dataset of books that surpasses existing academic benchmarks in quality, diversity, and document length.
This work also provides two baseline models: a feed-forward model derived from the GPT-2 architecture and a recurrent model in the form of a novel LSTM with ten-fold throughput.
arXiv Detail & Related papers (2023-09-20T10:31:17Z)
- Challenging the Myth of Graph Collaborative Filtering: a Reasoned and Reproducibility-driven Analysis [50.972595036856035]
We present code that successfully replicates results from six popular and recent graph recommendation models.
We compare these graph models with traditional collaborative filtering models that historically performed well in offline evaluations.
By investigating the information flow from users' neighborhoods, we aim to identify which models are influenced by intrinsic features in the dataset structure.
arXiv Detail & Related papers (2023-08-01T09:31:44Z)
- Preserving Knowledge Invariance: Rethinking Robustness Evaluation of Open Information Extraction [50.62245481416744]
We present the first benchmark that simulates the evaluation of open information extraction models in the real world.
We design and annotate a large-scale testbed in which each example is a knowledge-invariant clique.
We further elaborate the robustness metric: a model is judged robust if its performance is consistently accurate across all the cliques.
arXiv Detail & Related papers (2023-05-23T12:05:09Z)
- Composing Ensembles of Pre-trained Models via Iterative Consensus [95.10641301155232]
We propose a unified framework for composing ensembles of different pre-trained models.
We use pre-trained models as "generators" or "scorers" and compose them via closed-loop iterative consensus optimization.
We demonstrate that consensus achieved by an ensemble of scorers outperforms the feedback of a single scorer.
arXiv Detail & Related papers (2022-10-20T18:46:31Z)
- A Simple and efficient deep Scanpath Prediction [6.294759639481189]
We explore the efficiency of using common deep learning architectures in a simple, fully convolutional, regressive manner.
We examine how well these models can predict scanpaths on two datasets.
We also compare the leveraged backbone architectures based on their performance in these experiments to determine which are most suitable for the task.
arXiv Detail & Related papers (2021-12-08T22:43:45Z)
- No One Representation to Rule Them All: Overlapping Features of Training Methods [12.58238785151714]
High-performing models tend to make similar predictions regardless of training methodology.
Recent work has made very different training techniques, such as large-scale contrastive learning, yield competitively high accuracy.
We show that these models specialize in how they generalize from the data, leading to higher ensemble performance.
arXiv Detail & Related papers (2021-10-20T21:29:49Z)
- NASE: Learning Knowledge Graph Embedding for Link Prediction via Neural Architecture Search [9.634626241415916]
Link prediction is the task of predicting missing connections between entities in a knowledge graph (KG).
Previous work has tried to use Automated Machine Learning (AutoML) to search for the best model for a given dataset.
We propose a novel Neural Architecture Search (NAS) framework for the link prediction task.
arXiv Detail & Related papers (2020-08-18T03:34:09Z)
- Tidying Deep Saliency Prediction Architectures [6.613005108411055]
In this paper, we identify four key components of saliency models, i.e., input features, multi-level integration, readout architecture, and loss functions.
We propose two novel end-to-end architectures called SimpleNet and MDNSal, which are neater, more minimal, and more interpretable, and which achieve state-of-the-art performance on public saliency benchmarks.
arXiv Detail & Related papers (2020-03-10T19:34:49Z)