MLMLM: Link Prediction with Mean Likelihood Masked Language Model
- URL: http://arxiv.org/abs/2009.07058v1
- Date: Tue, 15 Sep 2020 13:11:13 GMT
- Title: MLMLM: Link Prediction with Mean Likelihood Masked Language Model
- Authors: Louis Clouatre, Philippe Trempe, Amal Zouaq, Sarath Chandar
- Abstract summary: Knowledge Bases (KBs) are easy query, verifiable, and interpretable.
Masked Models (MLMs), such as BERT, scale with computing power as well as raw text data.
We introduce Mean Likelihood Masked Language Model, an approach comparing mean likelihood of generating different entities to perform link prediction.
- Score: 14.672283581769774
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge Bases (KBs) are easy to query, verifiable, and interpretable. They
however scale with man-hours and high-quality data. Masked Language Models
(MLMs), such as BERT, scale with computing power as well as unstructured raw
text data. The knowledge contained within those models is however not directly
interpretable. We propose to perform link prediction with MLMs to address both
the KBs scalability issues and the MLMs interpretability issues. To do that we
introduce MLMLM, Mean Likelihood Masked Language Model, an approach comparing
the mean likelihood of generating the different entities to perform link
prediction in a tractable manner. We obtain State of the Art (SotA) results on
the WN18RR dataset and the best non-entity-embedding based results on the
FB15k-237 dataset. We also obtain convincing results on link prediction on
previously unseen entities, making MLMLM a suitable approach to introducing new
entities to a KB.
Related papers
- CaLM: Contrasting Large and Small Language Models to Verify Grounded Generation [76.31621715032558]
Grounded generation aims to equip language models (LMs) with the ability to produce more credible and accountable responses.
We introduce CaLM, a novel verification framework.
Our framework empowers smaller LMs, which rely less on parametric memory, to validate the output of larger LMs.
arXiv Detail & Related papers (2024-06-08T06:04:55Z) - Harnessing Large Language Models as Post-hoc Correctors [6.288056740658763]
We show that an LLM can work as a post-hoc corrector to propose corrections for the predictions of an arbitrary Machine Learning model.
We form a contextual knowledge database by incorporating the dataset's label information and the ML model's predictions on the validation dataset.
Our experimental results on text analysis and the challenging molecular predictions show that model improves the performance of a number of models by up to 39%.
arXiv Detail & Related papers (2024-02-20T22:50:41Z) - Which Syntactic Capabilities Are Statistically Learned by Masked
Language Models for Code? [51.29970742152668]
We highlight relying on accuracy-based measurements may lead to an overestimation of models' capabilities.
To address these issues, we introduce a technique called SyntaxEval in Syntactic Capabilities.
arXiv Detail & Related papers (2024-01-03T02:44:02Z) - LLM-augmented Preference Learning from Natural Language [19.700169351688768]
Large Language Models (LLMs) are equipped to deal with larger context lengths.
LLMs can consistently outperform the SotA when the target text is large.
Few-shot learning yields better performance than zero-shot learning.
arXiv Detail & Related papers (2023-10-12T17:17:27Z) - MLLM-DataEngine: An Iterative Refinement Approach for MLLM [62.30753425449056]
We propose a novel closed-loop system that bridges data generation, model training, and evaluation.
Within each loop, the MLLM-DataEngine first analyze the weakness of the model based on the evaluation results.
For targeting, we propose an Adaptive Bad-case Sampling module, which adjusts the ratio of different types of data.
For quality, we resort to GPT-4 to generate high-quality data with each given data type.
arXiv Detail & Related papers (2023-08-25T01:41:04Z) - Evaluating and Explaining Large Language Models for Code Using Syntactic
Structures [74.93762031957883]
This paper introduces ASTxplainer, an explainability method specific to Large Language Models for code.
At its core, ASTxplainer provides an automated method for aligning token predictions with AST nodes.
We perform an empirical evaluation on 12 popular LLMs for code using a curated dataset of the most popular GitHub projects.
arXiv Detail & Related papers (2023-08-07T18:50:57Z) - Mixture of Soft Prompts for Controllable Data Generation [21.84489422361048]
Mixture of Soft Prompts (MSP) is proposed as a tool for data augmentation rather than direct prediction.
Our method achieves state-of-the-art results on three benchmarks when compared against strong baselines.
arXiv Detail & Related papers (2023-03-02T21:13:56Z) - Inconsistencies in Masked Language Models [20.320583166619528]
Masked language models (MLMs) can provide distributions of tokens in the masked positions in a sequence.
distributions corresponding to different masking patterns can demonstrate considerable inconsistencies.
We propose an inference-time strategy for fors called Ensemble of Conditionals.
arXiv Detail & Related papers (2022-12-30T22:53:25Z) - Transcormer: Transformer for Sentence Scoring with Sliding Language
Modeling [95.9542389945259]
Sentence scoring aims at measuring the likelihood of a sentence and is widely used in many natural language processing scenarios.
We propose textitTranscormer -- a Transformer model with a novel textitsliding language modeling (SLM) for sentence scoring.
arXiv Detail & Related papers (2022-05-25T18:00:09Z) - Encoder-Decoder Models Can Benefit from Pre-trained Masked Language
Models in Grammatical Error Correction [54.569707226277735]
Previous methods have potential drawbacks when applied to an EncDec model.
Our proposed method fine-tune a corpus and then use the output fine-tuned as additional features in the GEC model.
The best-performing model state-of-the-art performances on the BEA 2019 and CoNLL-2014 benchmarks.
arXiv Detail & Related papers (2020-05-03T04:49:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.