Related papers: LLM2KB: Constructing Knowledge Bases using instruction tuned context aware Large Language Models

URL: http://arxiv.org/abs/2308.13207v1
Date: Fri, 25 Aug 2023 07:04:16 GMT
Title: LLM2KB: Constructing Knowledge Bases using instruction tuned context aware Large Language Models
Authors: Anmol Nayak and Hari Prasad Timmapathini
Abstract summary: Our paper proposes LLM2KB, a system for constructing knowledge bases using large language models. Our best performing model achieved an average F1 score of 0.6185 across 21 relations in the LM-KBC challenge held at the ISWC 2023 conference.
Score: 0.8702432681310401
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The advent of Large Language Models (LLMs) has revolutionized the field of natural language processing, enabling significant progress in various applications. One key area of interest is the construction of Knowledge Bases (KBs) using these powerful models. Knowledge bases serve as repositories of structured information, facilitating information retrieval and inference tasks. Our paper proposes LLM2KB, a system for constructing knowledge bases using large language models, with a focus on the Llama 2 architecture and the Wikipedia dataset. We perform parameter-efficient instruction tuning for Llama-2-13b-chat and StableBeluga-13B by training small injection models that have only 0.05% of the parameters of the base models using the Low Rank Adaptation (LoRA) technique. These injection models have been trained with prompts that are engineered to utilize Wikipedia page contexts of subject entities, fetched using a Dense Passage Retrieval (DPR) algorithm, to answer relevant object entities for a given subject entity and relation. Our best performing model achieved an average F1 score of 0.6185 across 21 relations in the LM-KBC challenge held at the ISWC 2023 conference.
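To give a sense of how a LoRA adapter can amount to a tiny fraction of a base model, here is a minimal sketch of the parameter arithmetic. The hidden size, layer count, number of adapted matrices, rank, and base parameter count below are all assumptions chosen for illustration; the abstract does not state the configuration the authors used.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA learns the weight update as B @ A, with B of shape (d_out, rank)
    and A of shape (rank, d_in), while the original weights stay frozen.
    """
    return rank * (d_in + d_out)


# Hypothetical configuration (assumed for illustration, not from the paper).
HIDDEN_SIZE = 5120             # assumed hidden width of a 13B Llama-style model
NUM_LAYERS = 40                # assumed number of transformer layers
ADAPTED_PER_LAYER = 2          # assumed: two square projections adapted per layer
RANK = 8                       # assumed LoRA rank
BASE_PARAMS = 13_000_000_000   # rough size of the base model

per_matrix = lora_param_count(HIDDEN_SIZE, HIDDEN_SIZE, RANK)
adapter_params = per_matrix * ADAPTED_PER_LAYER * NUM_LAYERS
fraction_percent = 100 * adapter_params / BASE_PARAMS

print(f"adapter parameters: {adapter_params:,}")
print(f"share of base model: {fraction_percent:.3f}%")
```

Under these assumed values the adapter comes to a few million parameters, which is on the order of the 0.05% reported in the abstract. A different rank or set of adapted matrices would shift the figure proportionally.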

Related papers

Harnessing Large Language Models for Knowledge Graph Question Answering via Adaptive Multi-Aspect Retrieval-Augmentation [81.18701211912779]
We introduce an Adaptive Multi-Aspect Retrieval-augmented over KGs (Amar) framework. This method retrieves knowledge including entities, relations, and subgraphs, and converts each piece of retrieved text into prompt embeddings. Our method has achieved state-of-the-art performance on two common datasets.
arXiv Detail & Related papers (2024-12-24T16:38:04Z)
KBLaM: Knowledge Base augmented Language Model [8.247901935078357]
We propose Knowledge Base augmented Language Model (KBLaM) for augmenting Large Language Models with external knowledge. KBLaM works with a knowledge base constructed from a corpus of documents, transforming each piece of knowledge in the KB into continuous key-value vector pairs. Experiments demonstrate KBLaM's effectiveness in various tasks, including question-answering and open-ended reasoning.
arXiv Detail & Related papers (2024-10-14T12:45:10Z)
Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities. In-Context Learning (ICL) and Parameter-Efficient Fine-Tuning (PEFT) are currently two mainstream methods for adapting LLMs to downstream tasks. We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z)
LBC: Language-Based-Classifier for Out-Of-Variable Generalization [14.033963471962823]
Large Language Models (LLMs) have had great success in natural language processing tasks such as response generation. We find that the pre-trained knowledge of LLMs enables them to interpret new variables that appear in a test without additional training. We propose a Language-Based-Classifier (LBC) to maximize the benefits of LLMs and outperform traditional machine learning models (TMLs) on Out-of-Variable tasks.
arXiv Detail & Related papers (2024-08-20T15:05:02Z)
Dual Process Learning: Controlling Use of In-Context vs. In-Weights Strategies with Weight Forgetting [15.69952375347308]
Language models have the ability to perform in-context learning (ICL). Despite their apparent ability to learn in-context, language models are known to struggle when faced with unseen or rarely seen tokens. We study structural in-context algorithms on both synthetic and naturalistic tasks using toy models, masked language models, and autoregressive language models.
arXiv Detail & Related papers (2024-05-28T21:38:20Z)
Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws [51.68385617116854]
Scaling laws describe the relationship between the size of language models and their capabilities. We focus on factual knowledge represented as tuples, such as (USA, capital, Washington D.C.) from a Wikipedia page. A 7B model can store 14B bits of knowledge, surpassing the English Wikipedia and textbooks combined.
arXiv Detail & Related papers (2024-04-08T11:11:31Z)
Expanding the Vocabulary of BERT for Knowledge Base Construction [6.412048788884728]
The "Knowledge Base Construction from Pretrained Language Models" challenge was held at the International Semantic Web Conference 2023. Our focus was on Track 1 of the challenge, where the parameters are constrained to a maximum of 1 billion. We present Vocabulary Expandable BERT for knowledge base construction, which expands the language model's vocabulary while preserving semantic embeddings.
arXiv Detail & Related papers (2023-10-12T12:52:46Z)
Language models are weak learners [71.33837923104808]
We show that prompt-based large language models can operate effectively as weak learners. We incorporate these models into a boosting approach, which can leverage the knowledge within the model to outperform traditional tree-based boosting. Results illustrate the potential for prompt-based LLMs to function not just as few-shot learners themselves, but as components of larger machine learning pipelines.
arXiv Detail & Related papers (2023-06-25T02:39:19Z)
Pre-Training to Learn in Context [138.0745138788142]
The ability of in-context learning is not fully exploited because language models are not explicitly trained to learn in context. We propose PICL (Pre-training for In-Context Learning), a framework to enhance the language models' in-context learning ability. Our experiments show that PICL is more effective and task-generalizable than a range of baselines, outperforming larger language models with nearly 4x parameters.
arXiv Detail & Related papers (2023-05-16T03:38:06Z)
CodeGen2: Lessons for Training LLMs on Programming and Natural Languages [116.74407069443895]
We unify encoder and decoder-based models into a single prefix-LM. For learning methods, we explore the claim of a "free lunch" hypothesis. For data distributions, the effect of a mixture distribution and multi-epoch training of programming and natural languages on model performance is explored.
arXiv Detail & Related papers (2023-05-03T17:55:25Z)
Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP). What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining. How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)
A Machine Learning Approach to Classifying Construction Cost Documents into the International Construction Measurement Standard [0.0]
We introduce the first automated models for classifying natural language descriptions provided in cost documents called "Bills of Quantities". We learn from a dataset of more than 50 thousand descriptions of items retrieved from 24 large infrastructure construction projects across the United Kingdom.
arXiv Detail & Related papers (2022-10-24T11:35:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.