CascadER: Cross-Modal Cascading for Knowledge Graph Link Prediction
- URL: http://arxiv.org/abs/2205.08012v1
- Date: Mon, 16 May 2022 22:55:45 GMT
- Title: CascadER: Cross-Modal Cascading for Knowledge Graph Link Prediction
- Authors: Tara Safavi, Doug Downey, Tom Hope
- Abstract summary: We propose a tiered ranking architecture CascadER to maintain the ranking accuracy of full ensembling while improving efficiency considerably.
CascadER uses LMs to rerank the outputs of more efficient base KGEs, relying on an adaptive subset selection scheme aimed at invoking the LMs minimally while maximizing accuracy gain over the KGE.
Our empirical analyses reveal that diversity of models across modalities and preservation of individual models' confidence signals help explain the effectiveness of CascadER.
- Score: 22.96768147978534
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Knowledge graph (KG) link prediction is a fundamental task in artificial
intelligence, with applications in natural language processing, information
retrieval, and biomedicine. Recently, promising results have been achieved by
leveraging cross-modal information in KGs, using ensembles that combine
knowledge graph embeddings (KGEs) and contextual language models (LMs).
However, existing ensembles are either (1) not consistently effective in terms
of ranking accuracy gains or (2) impractically inefficient on larger datasets
due to the combinatorial explosion problem of pairwise ranking with deep
language models. In this paper, we propose a novel tiered ranking architecture
CascadER to maintain the ranking accuracy of full ensembling while improving
efficiency considerably. CascadER uses LMs to rerank the outputs of more
efficient base KGEs, relying on an adaptive subset selection scheme aimed at
invoking the LMs minimally while maximizing accuracy gain over the KGE.
Extensive experiments demonstrate that CascadER improves MRR by up to 9 points
over KGE baselines, setting new state-of-the-art performance on four benchmarks
while improving efficiency by one or more orders of magnitude over competitive
cross-modal baselines. Our empirical analyses reveal that diversity of models
across modalities and preservation of individual models' confidence signals
help explain the effectiveness of CascadER, and suggest promising directions
for cross-modal cascaded architectures. Code and pretrained models are
available at https://github.com/tsafavi/cascader.
Related papers
- MERLOT: A Distilled LLM-based Mixture-of-Experts Framework for Scalable Encrypted Traffic Classification [19.476061046309052]
We present a scalable mixture-of-expert (MoE) based refinement of distilled large language model optimized for encrypted traffic classification.
Experiments on 10 datasets show superior or competitive performance over state-of-the-art models.
arXiv Detail & Related papers (2024-11-20T03:01:41Z) - Balancing Efficiency vs. Effectiveness and Providing Missing Label
Robustness in Multi-Label Stream Classification [3.97048491084787]
We propose a neural network-based approach to high-dimensional multi-label classification.
Our model uses a selective concept drift adaptation mechanism that makes it suitable for a non-stationary environment.
We adapt our model to an environment with missing labels using a simple yet effective imputation strategy.
arXiv Detail & Related papers (2023-10-01T13:23:37Z) - How to Turn Your Knowledge Graph Embeddings into Generative Models [10.466244652188777]
Some of the most successful knowledge graph embedding (KGE) models for link prediction can be interpreted as energy-based models.
This work re-interprets the score functions of these KGEs as circuits.
Our interpretation comes with little or no loss of performance for link prediction.
arXiv Detail & Related papers (2023-05-25T11:30:27Z) - Pretraining Without Attention [114.99187017618408]
This work explores pretraining without attention by using recent advances in sequence routing based on state-space models (SSMs)
BiGS is able to match BERT pretraining accuracy on GLUE and can be extended to long-form pretraining of 4096 tokens without approximation.
arXiv Detail & Related papers (2022-12-20T18:50:08Z) - Directed Acyclic Graph Factorization Machines for CTR Prediction via
Knowledge Distillation [65.62538699160085]
We propose a Directed Acyclic Graph Factorization Machine (KD-DAGFM) to learn the high-order feature interactions from existing complex interaction models for CTR prediction via Knowledge Distillation.
KD-DAGFM achieves the best performance with less than 21.5% FLOPs of the state-of-the-art method on both online and offline experiments.
arXiv Detail & Related papers (2022-11-21T03:09:42Z) - Explainable Sparse Knowledge Graph Completion via High-order Graph
Reasoning Network [111.67744771462873]
This paper proposes a novel explainable model for sparse Knowledge Graphs (KGs)
It combines high-order reasoning into a graph convolutional network, namely HoGRN.
It can not only improve the generalization ability to mitigate the information insufficiency issue but also provide interpretability.
arXiv Detail & Related papers (2022-07-14T10:16:56Z) - SPLADE v2: Sparse Lexical and Expansion Model for Information Retrieval [11.38022203865326]
SPLADE model provides highly sparse representations and competitive results with respect to state-of-the-art dense and sparse approaches.
We modify the pooling mechanism, benchmark a model solely based on document expansion, and introduce models trained with distillation.
Overall, SPLADE is considerably improved with more than $9$% gains on NDCG@10 on TREC DL 2019, leading to state-of-the-art results on the BEIR benchmark.
arXiv Detail & Related papers (2021-09-21T10:43:42Z) - AutoBERT-Zero: Evolving BERT Backbone from Scratch [94.89102524181986]
We propose an Operation-Priority Neural Architecture Search (OP-NAS) algorithm to automatically search for promising hybrid backbone architectures.
We optimize both the search algorithm and evaluation of candidate models to boost the efficiency of our proposed OP-NAS.
Experiments show that the searched architecture (named AutoBERT-Zero) significantly outperforms BERT and its variants of different model capacities in various downstream tasks.
arXiv Detail & Related papers (2021-07-15T16:46:01Z) - Deepened Graph Auto-Encoders Help Stabilize and Enhance Link Prediction [11.927046591097623]
Link prediction is a relatively under-studied graph learning task, with current state-of-the-art models based on one- or two-layers of shallow graph auto-encoder (GAE) architectures.
In this paper, we focus on addressing a limitation of current methods for link prediction, which can only use shallow GAEs and variational GAEs.
Our proposed methods innovatively incorporate standard auto-encoders (AEs) into the architectures of GAEs, where standard AEs are leveraged to learn essential, low-dimensional representations via seamlessly integrating the adjacency information and node features
arXiv Detail & Related papers (2021-03-21T14:43:10Z) - Cauchy-Schwarz Regularized Autoencoder [68.80569889599434]
Variational autoencoders (VAE) are a powerful and widely-used class of generative models.
We introduce a new constrained objective based on the Cauchy-Schwarz divergence, which can be computed analytically for GMMs.
Our objective improves upon variational auto-encoding models in density estimation, unsupervised clustering, semi-supervised learning, and face analysis.
arXiv Detail & Related papers (2021-01-06T17:36:26Z) - MetaDistiller: Network Self-Boosting via Meta-Learned Top-Down
Distillation [153.56211546576978]
In this work, we propose that better soft targets with higher compatibil-ity can be generated by using a label generator.
We can employ the meta-learning technique to optimize this label generator.
The experiments are conducted on two standard classificationbenchmarks, namely CIFAR-100 and ILSVRC2012.
arXiv Detail & Related papers (2020-08-27T13:04:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.