Finetuning an LLM on Contextual Knowledge of Classics for Q&A
- URL: http://arxiv.org/abs/2312.07848v1
- Date: Wed, 13 Dec 2023 02:32:01 GMT
- Title: Finetuning an LLM on Contextual Knowledge of Classics for Q&A
- Authors: Shane Storm Strachan
- Abstract summary: This project is an attempt to merge the knowledge of Classics with the capabilities of artificial intelligence.
The goal of this project is to develop an LLM that not only reproduces contextual knowledge accurately but also exhibits a consistent "personality"
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The open-source publishing of large language models (LLMs) has created many
possibilities for how anyone who understands language and has access to a
computer can interact with significant tools of artificial intelligence,
particularly in the context of learning and knowledge dissemination. However,
the utility of these models in specialized fields like Classics is still
largely unexplored. This project is an attempt to merge the knowledge of
Classics with the capabilities of artificial intelligence by finetuning an LLM
to cater to the specific needs of learners and professionals. The goal of this
project is to develop an LLM that not only reproduces contextual knowledge
accurately but also exhibits a consistent "personality" - and, indeed, has
consistent propriety - to appeal to a diverse audience who possess differing
levels of knowledge. A significant portion of this project was dedicated to
refining the dataset, following the principle of "garbage in, garbage out," to
ensure the model generates relevant, useful, and creative responses when given
a prompt (a statement, question, or single word). After training and
evaluation, my model's ability to handle a vast array of different types of
inputs and prompting exceeded expectations for a 355M parameter model, though
its occasional hallucinations (especially when set with a high temperature),
particularly in its assertions about historical events or its own identity,
make it seem somewhat capricious and more work in the form of continuous
finetuning will be undertaken.
Related papers
- Large Language Models are Limited in Out-of-Context Knowledge Reasoning [65.72847298578071]
Large Language Models (LLMs) possess extensive knowledge and strong capabilities in performing in-context reasoning.
This paper focuses on a significant aspect of out-of-context reasoning: Out-of-Context Knowledge Reasoning (OCKR), which is to combine multiple knowledge to infer new knowledge.
arXiv Detail & Related papers (2024-06-11T15:58:59Z) - LLM Processes: Numerical Predictive Distributions Conditioned on Natural Language [35.84181171987974]
Our goal is to build a regression model that can process numerical data and make probabilistic predictions at arbitrary locations.
We start by exploring strategies for eliciting explicit, coherent numerical predictive distributions from Large Language Models.
We demonstrate the ability to usefully incorporate text into numerical predictions, improving predictive performance and giving quantitative structure that reflects qualitative descriptions.
arXiv Detail & Related papers (2024-05-21T15:13:12Z) - Prompt-Time Symbolic Knowledge Capture with Large Language Models [0.0]
Augmenting large language models (LLMs) with user-specific knowledge is crucial for real-world applications, such as personal AI assistants.
This paper investigates utilizing the existing LLM capabilities to enable prompt-driven knowledge capture.
arXiv Detail & Related papers (2024-02-01T08:15:28Z) - A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication.
This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches.
We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z) - Knowledge Plugins: Enhancing Large Language Models for Domain-Specific
Recommendations [50.81844184210381]
We propose a general paradigm that augments large language models with DOmain-specific KnowledgE to enhance their performance on practical applications, namely DOKE.
This paradigm relies on a domain knowledge extractor, working in three steps: 1) preparing effective knowledge for the task; 2) selecting the knowledge for each specific sample; and 3) expressing the knowledge in an LLM-understandable way.
arXiv Detail & Related papers (2023-11-16T07:09:38Z) - MechGPT, a language-based strategy for mechanics and materials modeling
that connects knowledge across scales, disciplines and modalities [0.0]
We use a Large Language Model (LLM) to distill question-answer pairs from raw sources followed by fine-tuning.
The resulting MechGPT LLM foundation model is used in a series of computational experiments to explore its capacity for knowledge retrieval, various language tasks, hypothesis generation, and connecting knowledge across disparate areas.
arXiv Detail & Related papers (2023-10-16T14:29:35Z) - Recognizing Unseen Objects via Multimodal Intensive Knowledge Graph
Propagation [68.13453771001522]
We propose a multimodal intensive ZSL framework that matches regions of images with corresponding semantic embeddings.
We conduct extensive experiments and evaluate our model on large-scale real-world data.
arXiv Detail & Related papers (2023-06-14T13:07:48Z) - Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP)
What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains under explored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z) - Generated Knowledge Prompting for Commonsense Reasoning [53.88983683513114]
We propose generating knowledge statements directly from a language model with a generic prompt format.
This approach improves performance of both off-the-shelf and finetuned language models on four commonsense reasoning tasks.
Notably, we find that a model's predictions can improve when using its own generated knowledge.
arXiv Detail & Related papers (2021-10-15T21:58:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.