Related papers: MLMLM: Link Prediction with Mean Likelihood Masked Language Model

URL: http://arxiv.org/abs/2009.07058v1
Date: Tue, 15 Sep 2020 13:11:13 GMT
Title: MLMLM: Link Prediction with Mean Likelihood Masked Language Model
Authors: Louis Clouatre, Philippe Trempe, Amal Zouaq, Sarath Chandar
Abstract summary: Knowledge Bases (KBs) are easy to query, verifiable, and interpretable. Masked Language Models (MLMs), such as BERT, scale with computing power as well as unstructured raw text data. We introduce MLMLM, Mean Likelihood Masked Language Model, an approach comparing the mean likelihood of generating the different entities to perform link prediction.
Score: 14.672283581769774
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Knowledge Bases (KBs) are easy to query, verifiable, and interpretable. They however scale with man-hours and high-quality data. Masked Language Models (MLMs), such as BERT, scale with computing power as well as unstructured raw text data. The knowledge contained within those models is however not directly interpretable. We propose to perform link prediction with MLMs to address both the KBs scalability issues and the MLMs interpretability issues. To do that we introduce MLMLM, Mean Likelihood Masked Language Model, an approach comparing the mean likelihood of generating the different entities to perform link prediction in a tractable manner. We obtain State of the Art (SotA) results on the WN18RR dataset and the best non-entity-embedding based results on the FB15k-237 dataset. We also obtain convincing results on link prediction on previously unseen entities, making MLMLM a suitable approach to introducing new entities to a KB.
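As a rough illustration of the idea in the abstract, the sketch below ranks candidate entities by the mean log-likelihood of their tokens. It is not the authors' implementation: the function names (`mean_log_likelihood`, `rank_entities`) and the per-token probabilities are hypothetical stand-ins for what a masked language model would assign to each token of a candidate at the masked positions.

```python
import math


def mean_log_likelihood(token_probs):
    """Average log-probability over the tokens of one candidate entity."""
    if not token_probs:
        raise ValueError("an entity needs at least one token")
    return sum(math.log(p) for p in token_probs) / len(token_probs)


def rank_entities(candidates):
    """Return candidate names sorted by mean token log-likelihood, best first."""
    scores = {name: mean_log_likelihood(probs) for name, probs in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical per-token probabilities (made up for illustration, not model output).
toy_candidates = {
    "dog": [0.30],                    # one-token entity
    "golden retriever": [0.50, 0.50], # two-token entity
    "cat": [0.05],
}

ranking = rank_entities(toy_candidates)
print(ranking)
```

Averaging rather than summing keeps entities with different token counts comparable: with these toy numbers a summed log-likelihood would place "dog" above "golden retriever" simply because it has fewer tokens, while the mean does not penalize the longer name.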

Related papers

MoRE-LLM: Mixture of Rule Experts Guided by a Large Language Model [54.14155564592936]
We propose a Mixture of Rule Experts guided by a Large Language Model (MoRE-LLM). MoRE-LLM steers the discovery of local rule-based surrogates during training and their utilization for the classification task. The LLM is responsible for enhancing the domain knowledge alignment of the rules by correcting and contextualizing them.
arXiv Detail & Related papers (2025-03-26T11:09:21Z)
Context is Key: A Benchmark for Forecasting with Essential Textual Information [87.3175915185287]
"Context is Key" (CiK) is a forecasting benchmark that pairs numerical data with diverse types of carefully crafted textual context. We evaluate a range of approaches, including statistical models, time series foundation models, and LLM-based forecasters. We propose a simple yet effective LLM prompting method that outperforms all other tested methods on our benchmark.
arXiv Detail & Related papers (2024-10-24T17:56:08Z)
CaLM: Contrasting Large and Small Language Models to Verify Grounded Generation [76.31621715032558]
Grounded generation aims to equip language models (LMs) with the ability to produce more credible and accountable responses. We introduce CaLM, a novel verification framework. Our framework empowers smaller LMs, which rely less on parametric memory, to validate the output of larger LMs.
arXiv Detail & Related papers (2024-06-08T06:04:55Z)
Harnessing Large Language Models as Post-hoc Correctors [6.288056740658763]
We show that an LLM can work as a post-hoc corrector to propose corrections for the predictions of an arbitrary Machine Learning model. We form a contextual knowledge database by incorporating the dataset's label information and the ML model's predictions on the validation dataset. Our experimental results on text analysis and the challenging molecular predictions show that the approach improves the performance of a number of models by up to 39%.
arXiv Detail & Related papers (2024-02-20T22:50:41Z)
Which Syntactic Capabilities Are Statistically Learned by Masked Language Models for Code? [51.29970742152668]
We highlight that relying on accuracy-based measurements may lead to an overestimation of models' capabilities. To address these issues, we introduce SyntaxEval, a technique for evaluating the syntactic capabilities of these models.
arXiv Detail & Related papers (2024-01-03T02:44:02Z)
LLM-augmented Preference Learning from Natural Language [19.700169351688768]
Large Language Models (LLMs) are equipped to deal with larger context lengths. LLMs can consistently outperform the SotA when the target text is large. Few-shot learning yields better performance than zero-shot learning.
arXiv Detail & Related papers (2023-10-12T17:17:27Z)
MLLM-DataEngine: An Iterative Refinement Approach for MLLM [62.30753425449056]
We propose a novel closed-loop system that bridges data generation, model training, and evaluation. Within each loop, the MLLM-DataEngine first analyzes the weakness of the model based on the evaluation results. For targeting, we propose an Adaptive Bad-case Sampling module, which adjusts the ratio of different types of data. For quality, we resort to GPT-4 to generate high-quality data with each given data type.
arXiv Detail & Related papers (2023-08-25T01:41:04Z)
Evaluating and Explaining Large Language Models for Code Using Syntactic Structures [74.93762031957883]
This paper introduces ASTxplainer, an explainability method specific to Large Language Models for code. At its core, ASTxplainer provides an automated method for aligning token predictions with AST nodes. We perform an empirical evaluation on 12 popular LLMs for code using a curated dataset of the most popular GitHub projects.
arXiv Detail & Related papers (2023-08-07T18:50:57Z)
Mixture of Soft Prompts for Controllable Data Generation [21.84489422361048]
Mixture of Soft Prompts (MSP) is proposed as a tool for data augmentation rather than direct prediction. Our method achieves state-of-the-art results on three benchmarks when compared against strong baselines.
arXiv Detail & Related papers (2023-03-02T21:13:56Z)
Inconsistencies in Masked Language Models [20.320583166619528]
Masked language models (MLMs) can provide distributions of tokens in the masked positions in a sequence. The distributions corresponding to different masking patterns can demonstrate considerable inconsistencies. We propose an inference-time strategy for MLMs called Ensemble of Conditionals.
arXiv Detail & Related papers (2022-12-30T22:53:25Z)
Transcormer: Transformer for Sentence Scoring with Sliding Language Modeling [95.9542389945259]
Sentence scoring aims at measuring the likelihood of a sentence and is widely used in many natural language processing scenarios. We propose Transcormer, a Transformer model with a novel sliding language modeling (SLM) for sentence scoring.
arXiv Detail & Related papers (2022-05-25T18:00:09Z)
Encoder-Decoder Models Can Benefit from Pre-trained Masked Language Models in Grammatical Error Correction [54.569707226277735]
Previous methods have potential drawbacks when applied to an EncDec model. Our proposed method fine-tunes an MLM on a GEC corpus and then uses the output of the fine-tuned MLM as additional features in the GEC model. The best-performing model achieves state-of-the-art performance on the BEA 2019 and CoNLL-2014 benchmarks.
arXiv Detail & Related papers (2020-05-03T04:49:31Z)

This list is automatically generated from the titles and abstracts of the papers on this site.