Fine-tuning Large Enterprise Language Models via Ontological Reasoning
- URL: http://arxiv.org/abs/2306.10723v2
- Date: Mon, 18 Sep 2023 21:37:37 GMT
- Title: Fine-tuning Large Enterprise Language Models via Ontological Reasoning
- Authors: Teodoro Baldazzi, Luigi Bellomarini, Stefano Ceri, Andrea Colombo,
Andrea Gentili, Emanuel Sallinger
- Abstract summary: Large Language Models (LLMs) exploit fine-tuning as a technique to adapt to diverse goals, thanks to task-specific training data.
We propose a novel neurosymbolic architecture that leverages the power of ontological reasoning to build task- and domain-specific corpora for LLM fine-tuning.
- Score: 5.12835891233968
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) exploit fine-tuning as a technique to adapt to
diverse goals, thanks to task-specific training data. Task specificity should
go hand in hand with domain orientation, that is, the specialization of an LLM
to accurately address the tasks of a given realm of interest. However, models
are usually fine-tuned over publicly available data or, at most, over ground
data from databases, ignoring business-level definitions and domain experience.
On the other hand, Enterprise Knowledge Graphs (EKGs) are able to capture and
augment such domain knowledge via ontological reasoning. With the goal of
combining LLM flexibility with the domain orientation of EKGs, we propose a
novel neurosymbolic architecture that leverages the power of ontological
reasoning to build task- and domain-specific corpora for LLM fine-tuning.
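The core idea of the paper — materializing facts entailed by an ontology's rules and verbalizing them into training examples — can be sketched as below. The naive forward-chaining loop, the rule format, and all names are illustrative assumptions, not the authors' implementation:

```python
# Hypothetical sketch: materialize facts entailed by ontological rules
# (naive forward chaining), then verbalize them into prompt/completion
# pairs for LLM fine-tuning. Rule format and names are illustrative only.

def apply_rules(facts, rules):
    """Derive new triples: each rule maps a relation to an implied relation."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedent, consequent in rules:
            for subj, rel, obj in list(derived):
                new = (subj, consequent, obj)
                if rel == antecedent and new not in derived:
                    derived.add(new)
                    changed = True
    return derived

def verbalize(triple):
    """Turn a (subject, relation, object) triple into a training example."""
    subj, rel, obj = triple
    return {
        "prompt": f"Complete the fact: {subj} {rel.replace('_', ' ')} ...",
        "completion": obj,
    }

# Toy enterprise KG: one ground fact plus one ontological rule
# ("controls" implies "has_influence_over").
facts = {("BankA", "controls", "FundB")}
rules = [("controls", "has_influence_over")]

corpus = [verbalize(t) for t in apply_rules(facts, rules)]
```

Each inferred fact — not only the ground data — becomes a fine-tuning example, which is how the reasoning step injects business-level definitions that raw database records lack.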
Related papers
- Domain-Specific Retrieval-Augmented Generation Using Vector Stores, Knowledge Graphs, and Tensor Factorization [7.522493227357079]
Large Language Models (LLMs) are pre-trained on large-scale corpora.
LLMs suffer from hallucinations, knowledge cut-offs, and lack of knowledge attributions.
We introduce SMART-SLIC, a highly domain-specific LLM framework.
arXiv Detail & Related papers (2024-10-03T17:40:55Z)
- Exploring Language Model Generalization in Low-Resource Extractive QA [57.14068405860034]
We investigate Extractive Question Answering (EQA) with Large Language Models (LLMs) under domain drift.
We devise a series of experiments to empirically explain the performance gap.
arXiv Detail & Related papers (2024-09-27T05:06:43Z)
- A New Pipeline For Generating Instruction Dataset via RAG and Self Fine-Tuning [0.0]
This research proposes a pipeline to construct high-quality instruction datasets for fine-tuning on specific domains.
By ingesting domain-specific documents, the pipeline generates relevant and contextually appropriate instructions.
As a case study, we apply this approach to the domain of psychiatry, a field requiring specialized knowledge and sensitive handling of patient information.
arXiv Detail & Related papers (2024-08-12T03:52:11Z)
- ProgGen: Generating Named Entity Recognition Datasets Step-by-step with Self-Reflexive Large Language Models [25.68491572293656]
Large Language Models fall short in structured knowledge extraction tasks such as named entity recognition.
This paper explores an innovative, cost-efficient strategy to harness LLMs with modest NER capabilities for producing superior NER datasets.
arXiv Detail & Related papers (2024-03-17T06:12:43Z)
- PANDA: Preference Adaptation for Enhancing Domain-Specific Abilities of LLMs [49.32067576992511]
Large language models often fall short of the performance achieved by domain-specific state-of-the-art models.
One potential approach to enhance domain-specific capabilities of LLMs involves fine-tuning them using corresponding datasets.
We propose Preference Adaptation for Enhancing Domain-specific Abilities of LLMs (PANDA).
Our experimental results reveal that PANDA significantly enhances the domain-specific ability of LLMs on text classification and interactive decision tasks.
arXiv Detail & Related papers (2024-02-20T09:02:55Z)
- GLaM: Fine-Tuning Large Language Models for Domain Knowledge Graph Alignment via Neighborhood Partitioning and Generative Subgraph Encoding [39.67113788660731]
We introduce a framework for developing Graph-aligned LAnguage Models (GLaM).
We demonstrate that grounding the models in specific graph-based knowledge expands the models' capacity for structure-based reasoning.
arXiv Detail & Related papers (2024-02-09T19:53:29Z)
- EcomGPT-CT: Continual Pre-training of E-commerce Large Language Models with Semi-structured Data [67.8302955948861]
Large Language Models (LLMs) pre-trained on massive corpora have exhibited remarkable performance on various NLP tasks.
Applying these models to specific domains still poses significant challenges, such as lack of domain knowledge.
We focus on domain-specific continual pre-training of LLMs using E-commerce domain as an exemplar.
arXiv Detail & Related papers (2023-12-25T11:31:47Z)
- Knowledge Plugins: Enhancing Large Language Models for Domain-Specific Recommendations [50.81844184210381]
We propose a general paradigm that augments large language models with DOmain-specific KnowledgE to enhance their performance on practical applications, namely DOKE.
This paradigm relies on a domain knowledge extractor, working in three steps: 1) preparing effective knowledge for the task; 2) selecting the knowledge for each specific sample; and 3) expressing the knowledge in an LLM-understandable way.
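The three steps can be pictured with a toy sketch. The sample knowledge base, the word-overlap selector, and the prompt template below are assumptions for illustration, not the paper's actual components:

```python
import re

# Toy sketch of a DOKE-style three-step knowledge plugin for a
# recommendation task. All contents and heuristics are hypothetical.

KNOWLEDGE_BASE = [  # step 1: knowledge prepared for the task
    "Item A is frequently bought together with Item B.",
    "Item C belongs to the outdoor-gear category.",
]

def tokens(text):
    """Lowercased alphanumeric word set, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def select_knowledge(sample, knowledge, top_k=1):
    """Step 2: pick the facts sharing the most words with the sample."""
    return sorted(knowledge,
                  key=lambda fact: len(tokens(fact) & tokens(sample)),
                  reverse=True)[:top_k]

def build_prompt(sample):
    """Step 3: express the selected knowledge in an LLM-readable prompt."""
    facts = select_knowledge(sample, KNOWLEDGE_BASE)
    context = "\n".join(f"- {fact}" for fact in facts)
    return f"Known facts:\n{context}\n\nQuestion: {sample}"

prompt = build_prompt("Should we recommend Item B to buyers of Item A?")
```

A real extractor would replace the word-overlap heuristic with retrieval over a curated domain knowledge source, but the prepare/select/express division stays the same.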
arXiv Detail & Related papers (2023-11-16T07:09:38Z)
- Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey [100.24095818099522]
Large language models (LLMs) have significantly advanced the field of natural language processing (NLP).
They provide a highly useful, task-agnostic foundation for a wide range of applications.
However, directly applying LLMs to solve sophisticated problems in specific domains meets many hurdles.
arXiv Detail & Related papers (2023-05-30T03:00:30Z)
- KALA: Knowledge-Augmented Language Model Adaptation [65.92457495576141]
We propose a novel domain adaptation framework for pre-trained language models (PLMs).
Knowledge-Augmented Language model Adaptation (KALA) modulates the intermediate hidden representations of PLMs with domain knowledge.
Results show that, despite being computationally efficient, our KALA largely outperforms adaptive pre-training.
arXiv Detail & Related papers (2022-04-22T08:11:59Z)
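KALA's modulation of hidden representations can be pictured as a knowledge-conditioned affine transform of a hidden vector. The toy numbers and the affine form below are an illustrative assumption, not KALA's exact formulation:

```python
# Hypothetical sketch: scale and shift a PLM's intermediate hidden
# vector using parameters derived from domain knowledge (e.g. entity
# embeddings). Values below are toy numbers for illustration.

def modulate(hidden, gamma, beta):
    """Element-wise affine modulation: h' = gamma * h + beta."""
    return [g * h + b for h, g, b in zip(hidden, gamma, beta)]

hidden = [0.5, -1.0, 2.0]   # an intermediate representation from the PLM
gamma  = [2.0, 0.5, 1.0]    # scale derived from domain knowledge
beta   = [1.0, 0.0, -0.5]   # shift derived from domain knowledge

modulated = modulate(hidden, gamma, beta)
```

Because only the small modulation parameters are learned while the PLM itself stays frozen during adaptation, this style of approach is computationally cheaper than adaptive pre-training.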
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.