LLM2KB: Constructing Knowledge Bases using instruction tuned context
aware Large Language Models
- URL: http://arxiv.org/abs/2308.13207v1
- Date: Fri, 25 Aug 2023 07:04:16 GMT
- Title: LLM2KB: Constructing Knowledge Bases using instruction tuned context
aware Large Language Models
- Authors: Anmol Nayak and Hari Prasad Timmapathini
- Abstract summary: Our paper proposes LLM2KB, a system for constructing knowledge bases using large language models.
Our best performing model achieved an average F1 score of 0.6185 across 21 relations in the LM-KBC challenge held at the ISWC 2023 conference.
- Score: 0.8702432681310401
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The advent of Large Language Models (LLM) has revolutionized the field of
natural language processing, enabling significant progress in various
applications. One key area of interest is the construction of Knowledge Bases
(KB) using these powerful models. Knowledge bases serve as repositories of
structured information, facilitating information retrieval and inference tasks.
Our paper proposes LLM2KB, a system for constructing knowledge bases using
large language models, with a focus on the Llama 2 architecture and the
Wikipedia dataset. We perform parameter efficient instruction tuning for
Llama-2-13b-chat and StableBeluga-13B by training small injection models that
have only 0.05 % of the parameters of the base models using the Low Rank
Adaptation (LoRA) technique. These injection models have been trained with
prompts that are engineered to utilize Wikipedia page contexts of subject
entities fetched using a Dense Passage Retrieval (DPR) algorithm, to answer
relevant object entities for a given subject entity and relation. Our best
performing model achieved an average F1 score of 0.6185 across 21 relations in
the LM-KBC challenge held at the ISWC 2023 conference.
Related papers
- Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws [51.68385617116854]
Scaling laws describe the relationship between the size of language models and their capabilities.
We focus on factual knowledge represented as domains, such as (USA, capital, Washington D.C.) from a Wikipedia page.
A 7B model can store 14B bits of knowledge, surpassing the English Wikipedia and textbooks combined.
arXiv Detail & Related papers (2024-04-08T11:11:31Z) - StructLM: Towards Building Generalist Models for Structured Knowledge Grounding [49.10029030628653]
Large language models' (LLMs) ability to process structured data lags behind state-of-the-art (SoTA) model by an average of 35%.
We train a series of models, referred to as StructLM, based on the Mistral and the CodeLlama model family, ranging from 7B to 34B parameters.
Our StructLM series surpasses task-specific models on 16 out of 18 evaluated datasets and establishes new SoTA performance on 8 SKG tasks.
arXiv Detail & Related papers (2024-02-26T15:47:01Z) - Expanding the Vocabulary of BERT for Knowledge Base Construction [6.412048788884728]
"Knowledge Base Construction from Pretrained Language Models" challenge was held at International Semantic Web Conference 2023.
Our focus was on Track 1 of the challenge, where the parameters are constrained to a maximum of 1 billion.
We present Vocabulary Expandable BERT for knowledge base construction, which expand the language model's vocabulary while preserving semantic embeddings.
arXiv Detail & Related papers (2023-10-12T12:52:46Z) - Language models are weak learners [71.33837923104808]
We show that prompt-based large language models can operate effectively as weak learners.
We incorporate these models into a boosting approach, which can leverage the knowledge within the model to outperform traditional tree-based boosting.
Results illustrate the potential for prompt-based LLMs to function not just as few-shot learners themselves, but as components of larger machine learning pipelines.
arXiv Detail & Related papers (2023-06-25T02:39:19Z) - Pre-Training to Learn in Context [138.0745138788142]
The ability of in-context learning is not fully exploited because language models are not explicitly trained to learn in context.
We propose PICL (Pre-training for In-Context Learning), a framework to enhance the language models' in-context learning ability.
Our experiments show that PICL is more effective and task-generalizable than a range of baselines, outperforming larger language models with nearly 4x parameters.
arXiv Detail & Related papers (2023-05-16T03:38:06Z) - CodeGen2: Lessons for Training LLMs on Programming and Natural Languages [116.74407069443895]
We unify encoder and decoder-based models into a single prefix-LM.
For learning methods, we explore the claim of a "free lunch" hypothesis.
For data distributions, the effect of a mixture distribution and multi-epoch training of programming and natural languages on model performance is explored.
arXiv Detail & Related papers (2023-05-03T17:55:25Z) - A Cohesive Distillation Architecture for Neural Language Models [0.0]
A recent trend in Natural Language Processing is the exponential growth in Language Model (LM) size.
This study investigates methods for Knowledge Distillation (KD) to provide efficient alternatives to large-scale models.
arXiv Detail & Related papers (2023-01-12T08:01:53Z) - Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP)
What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains under explored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z) - A Machine Learning Approach to Classifying Construction Cost Documents
into the International Construction Measurement Standard [0.0]
We introduce the first automated models for classifying natural language descriptions provided in cost documents called "Bills of Quantities"
We learn from a dataset of more than 50 thousand descriptions of items retrieved from 24 large infrastructure construction projects across the United Kingdom.
arXiv Detail & Related papers (2022-10-24T11:35:53Z) - Incorporating Linguistic Knowledge for Abstractive Multi-document
Summarization [20.572283625521784]
We develop a neural network based abstractive multi-document summarization (MDS) model.
We process the dependency information into the linguistic-guided attention mechanism.
With the help of linguistic signals, sentence-level relations can be correctly captured.
arXiv Detail & Related papers (2021-09-23T08:13:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.