Artefact Retrieval: Overview of NLP Models with Knowledge Base Access
- URL: http://arxiv.org/abs/2201.09651v1
- Date: Mon, 24 Jan 2022 13:15:33 GMT
- Title: Artefact Retrieval: Overview of NLP Models with Knowledge Base Access
- Authors: Vilém Zouhar, Marius Mosbach, Debanjali Biswas, Dietrich Klakow
- Abstract summary: This paper systematically describes the typology of artefacts (items retrieved from a knowledge base), retrieval mechanisms and the way these artefacts are fused into the model.
Most of the focus is given to language models, though we also show how question answering, fact-checking and dialogue models fit into this system.
- Score: 18.098224374478598
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Many NLP models gain performance by having access to a knowledge base. A lot
of research has been devoted to devising and improving the way the knowledge
base is accessed and incorporated into the model, resulting in a number of
mechanisms and pipelines. Despite the diversity of proposed mechanisms, there
are patterns in the designs of such systems. In this paper, we systematically
describe the typology of artefacts (items retrieved from a knowledge base),
retrieval mechanisms and the way these artefacts are fused into the model. This
further allows us to uncover combinations of design decisions that had not yet
been tried. Most of the focus is given to language models, though we also show
how question answering, fact-checking and knowledgeable dialogue models fit into
this system. Having an abstract model that can describe the
architecture of specific models also helps with transferring these
architectures between multiple NLP tasks.
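The retrieve-and-fuse pattern the abstract describes can be illustrated with a minimal sketch. All names here are illustrative, not the paper's code: a textual artefact is retrieved from a small knowledge base by lexical overlap, then fused into the model input by simple concatenation (one of several fusion points the paper's typology covers).

```python
import re

def tokenize(text):
    """Lowercase and split into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query, knowledge_base):
    """Retrieval mechanism: score each passage by token overlap
    with the query and return the best-matching artefact."""
    q = tokenize(query)
    return max(knowledge_base, key=lambda passage: len(q & tokenize(passage)))

def fuse(query, artefact):
    """Input-level fusion: prepend the retrieved artefact to the
    query before it reaches the downstream model."""
    return f"context: {artefact} question: {query}"

kb = [
    "Paris is the capital of France.",
    "The Nile is a river in Africa.",
]
artefact = retrieve("What is the capital of France?", kb)
model_input = fuse("What is the capital of France?", artefact)
```

Real systems replace the overlap scorer with dense or sparse retrievers and may fuse artefacts at the embedding or hidden-state level rather than the input level; the pipeline shape, however, is the same.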
Related papers
- Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities [89.40778301238642]
Model merging is an efficient model-enhancement technique in the machine learning community.
There is a significant gap in the literature regarding a systematic and thorough review of these techniques.
arXiv Detail & Related papers (2024-08-14T16:58:48Z)
- Mining Frequent Structures in Conceptual Models [2.841785306638839]
We propose a general approach to the problem of discovering frequent structures in conceptual modeling languages.
We use the combination of a frequent subgraph mining algorithm and graph manipulation techniques.
The primary objective is to offer a support facility for language engineers.
arXiv Detail & Related papers (2024-06-11T10:24:02Z)
- LVLM-Interpret: An Interpretability Tool for Large Vision-Language Models [50.259006481656094]
We present a novel interactive application aimed at understanding the internal mechanisms of large vision-language models.
Our interface is designed to enhance the interpretability of the image patches, which are instrumental in generating an answer.
We present a case study of how our application can aid in understanding failure mechanisms in a popular large multi-modal model: LLaVA.
arXiv Detail & Related papers (2024-04-03T23:57:34Z)
- Learn From Model Beyond Fine-Tuning: A Survey [78.80920533793595]
Learn From Model (LFM) focuses on the research, modification, and design of foundation models (FM) based on the model interface.
The study of LFM techniques can be broadly categorized into five major areas: model tuning, model distillation, model reuse, meta learning and model editing.
This paper gives a comprehensive review of the current methods based on FM from the perspective of LFM.
arXiv Detail & Related papers (2023-10-12T10:20:36Z)
- Scaling Vision-Language Models with Sparse Mixture of Experts [128.0882767889029]
We show that mixture-of-experts (MoE) techniques can achieve state-of-the-art performance on a range of benchmarks over dense models of equivalent computational cost.
Our research offers valuable insights into stabilizing the training of MoE models, understanding the impact of MoE on model interpretability, and balancing the trade-offs between compute cost and performance when scaling vision-language models.
arXiv Detail & Related papers (2023-03-13T16:00:31Z)
- A model-agnostic approach for generating Saliency Maps to explain inferred decisions of Deep Learning Models [2.741266294612776]
We propose a model-agnostic method for generating saliency maps that has access only to the output of the model.
We use Differential Evolution to identify which image pixels are the most influential in a model's decision-making process.
arXiv Detail & Related papers (2022-09-19T10:28:37Z)
- Model LEGO: Creating Models Like Disassembling and Assembling Building Blocks [53.09649785009528]
In this paper, we explore a paradigm that does not require training to obtain new models.
Just as CNNs were inspired by receptive fields in the biological visual system, we propose Model Disassembling and Assembling.
For model assembling, we present the alignment padding strategy and parameter scaling strategy to construct a new model tailored for a specific task.
arXiv Detail & Related papers (2022-03-25T05:27:28Z)
- Beyond Explaining: Opportunities and Challenges of XAI-Based Model Improvement [75.00655434905417]
Explainable Artificial Intelligence (XAI) is an emerging research field bringing transparency to highly complex machine learning (ML) models.
This paper offers a comprehensive overview of techniques that apply XAI practically for improving various properties of ML models.
We show empirically through experiments on toy and realistic settings how explanations can help improve properties such as model generalization ability or reasoning.
arXiv Detail & Related papers (2022-03-15T15:44:28Z)
- Combining pre-trained language models and structured knowledge [9.521634184008574]
Transformer-based language models have achieved state-of-the-art performance in various NLP benchmarks.
It has proven challenging to integrate structured information, such as knowledge graphs, into these models.
We examine a variety of approaches to integrate structured knowledge into current language models and determine challenges, and possible opportunities to leverage both structured and unstructured information sources.
arXiv Detail & Related papers (2021-01-28T21:54:03Z)
- Incorporating prior knowledge about structural constraints in model identification [1.376408511310322]
We propose model identification techniques that could leverage such partial information to produce better estimates.
Specifically, we propose Structural Principal Component Analysis (SPCA), which improves upon existing methods such as PCA.
The efficacy of the proposed approach is demonstrated using synthetic and industrial case studies.
arXiv Detail & Related papers (2020-07-08T11:09:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.