General LLMs as Instructors for Domain-Specific LLMs: A Sequential Fusion Method to Integrate Extraction and Editing
- URL: http://arxiv.org/abs/2403.15736v2
- Date: Wed, 13 Nov 2024 14:05:18 GMT
- Title: General LLMs as Instructors for Domain-Specific LLMs: A Sequential Fusion Method to Integrate Extraction and Editing
- Authors: Xin Zhang, Tianjie Ju, Huijia Liang, Ying Fu, Qin Zhang
- Abstract summary: We introduce a Sequential Fusion method to integrate knowledge from complex contexts into Large Language Models (LLMs).
Using our method, domain-specific LLMs achieved a 71.7% accuracy (an average gain of 39.1%) in question-answering tasks.
These findings underscore the effectiveness and flexibility of our approach in FDoR-UL across various domains.
- Score: 12.017822691367705
- Abstract: The substantial interest in updating Large Language Models (LLMs) without retraining from scratch is accompanied by several challenges. This is particularly true when updating LLMs with datasets that necessitate domain-expert reasoning across extensive texts, despite limited samples. We termed the scenario as the Few-Shot Domain-Expert Reasoning for Updating LLMs (FDoR-UL). Traditional methods such as Low-Rank Adaptation (LoRA) and Retrieval Augmented Generation (RAG) are inadequate for addressing this critical issue, particularly evident in our exploration of a specific medical dataset that epitomizes the distinct needs of FDoR-UL. To tackle this challenge, we introduce a Sequential Fusion method to integrate knowledge from complex contexts into LLMs. This method employs a two-stage framework: initially leveraging general LLMs to perform relation extraction for knowledge acquisition from complex texts, followed by updating domain-specific LLMs through Knowledge Editing (KE). Employing our method, domain-specific LLMs achieved a 71.7% accuracy (an average gain of 39.1%) in question-answering tasks. Furthermore, we expanded our evaluation to a novel economics-management dataset we developed, where our method achieved a 75.0% accuracy (an average gain of 45.0%). These findings underscore the effectiveness and flexibility of our approach in FDoR-UL across various domains.
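The two-stage framework described in the abstract can be sketched in a few lines. The code below is a minimal illustration only, not the authors' implementation: the prompt wording, the triple format, and the `editor.apply` knowledge-editing interface are all assumptions.

```python
# Minimal sketch of the two-stage Sequential Fusion pipeline described in the abstract.
# Stage 1: a general LLM performs relation extraction over a complex passage.
# Stage 2: each extracted fact is written into a domain-specific LLM via knowledge editing (KE).
# The prompt, the triple format, and the `editor.apply` interface are assumptions for illustration.
import json
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)

def extract_triples(general_llm: Callable[[str], str], passage: str) -> List[Triple]:
    """Stage 1: knowledge acquisition via relation extraction with a general LLM."""
    prompt = (
        "Extract knowledge triples (subject, relation, object) from the text below. "
        "Return a JSON list of 3-element lists.\n\n" + passage
    )
    return [tuple(t) for t in json.loads(general_llm(prompt))]

def update_domain_llm(editor, domain_llm, triples: List[Triple]):
    """Stage 2: inject each extracted fact into the domain-specific LLM via KE."""
    for subject, relation, obj in triples:
        # `editor.apply` stands in for a knowledge-editing backend (e.g., a ROME/MEMIT-style editor).
        domain_llm = editor.apply(domain_llm, subject=subject, relation=relation, target=obj)
    return domain_llm
```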
Related papers
- Beyond Binary: Towards Fine-Grained LLM-Generated Text Detection via Role Recognition and Involvement Measurement [51.601916604301685]
Large language models (LLMs) generate content that can undermine trust in online discourse.
Current methods often focus on binary classification, failing to address the complexities of real-world scenarios like human-AI collaboration.
To move beyond binary classification and address these challenges, we propose a new paradigm for detecting LLM-generated content.
arXiv Detail & Related papers (2024-10-18T08:14:10Z) - Efficient Reinforcement Learning with Large Language Model Priors [18.72288751305885]
Large language models (LLMs) have recently emerged as powerful general-purpose tools.
We propose treating LLMs as prior action distributions and integrating them into RL frameworks.
We show that incorporating LLM-based action priors significantly reduces exploration and optimization complexity.
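As a rough illustration of "treating LLMs as prior action distributions", the sketch below combines an LLM-supplied prior over discrete actions with learned Q-values; the specific combination rule (a prior-weighted softmax) is an assumption, not the paper's algorithm.

```python
# Hedged sketch: sample an action from a distribution proportional to
# prior(a) * exp(Q(a) / temperature), where the prior comes from an LLM.
import numpy as np

def select_action(q_values: np.ndarray, llm_prior: np.ndarray, temperature: float = 1.0) -> int:
    """Combine learned Q-values with an LLM-provided prior over discrete actions."""
    logits = np.log(llm_prior + 1e-8) + q_values / temperature
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return int(np.random.choice(len(q_values), p=probs))

# Example: four discrete actions, with the LLM prior strongly favoring action 2.
q = np.array([0.1, 0.4, 0.3, 0.0])
prior = np.array([0.05, 0.15, 0.7, 0.1])
print(select_action(q, prior))
```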
arXiv Detail & Related papers (2024-10-10T13:54:11Z) - Exploring Language Model Generalization in Low-Resource Extractive QA [57.14068405860034]
We investigate Extractive Question Answering (EQA) with Large Language Models (LLMs) under domain drift.
We devise a series of experiments to empirically explain the performance gap.
arXiv Detail & Related papers (2024-09-27T05:06:43Z) - Exploring Automatic Cryptographic API Misuse Detection in the Era of LLMs [60.32717556756674]
This paper introduces a systematic evaluation framework to assess Large Language Models in detecting cryptographic misuses.
Our in-depth analysis of 11,940 LLM-generated reports highlights that the inherent instabilities in LLMs can lead to over half of the reports being false positives.
The optimized approach achieves a remarkable detection rate of nearly 90%, surpassing traditional methods and uncovering previously unknown misuses in established benchmarks.
arXiv Detail & Related papers (2024-07-23T15:31:26Z) - A Practice-Friendly LLM-Enhanced Paradigm with Preference Parsing for Sequential Recommendation [15.153844486572932]
This paper proposes a practice-friendly LLM-enhanced paradigm with preference parsing (P2Rec) for sequential recommender systems (SRS).
Specifically, in the information reconstruction stage, we design a new user-level SFT task for collaborative information injection with the assistance of a pre-trained SRS model.
Our goal is to let LLM learn to reconstruct a corresponding prior preference distribution from each user's interaction sequence.
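A minimal sketch of the user-level SFT idea summarized above: the input is the user's interaction sequence and the supervision target is a preference distribution produced by a pre-trained SRS model. The prompt wording and the `preference_distribution` method are hypothetical.

```python
# Illustrative only: building one user-level SFT training pair in the spirit of the summary.
# The prompt template and the SRS model's `preference_distribution` method are assumptions.
from typing import Dict, List

def build_sft_example(user_history: List[str], srs_model) -> Dict[str, object]:
    prompt = (
        "Given the user's interaction history, reconstruct their preference profile:\n- "
        + "\n- ".join(user_history)
    )
    # The pre-trained sequential recommender supplies the supervision signal
    # (a prior preference distribution) that the LLM learns to reconstruct.
    target = srs_model.preference_distribution(user_history)
    return {"input": prompt, "target": target}
```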
arXiv Detail & Related papers (2024-06-01T07:18:56Z) - BLADE: Enhancing Black-box Large Language Models with Small Domain-Specific Models [56.89958793648104]
Large Language Models (LLMs) are versatile and capable of addressing a diverse range of tasks.
Previous approaches either conduct continuous pre-training with domain-specific data or employ retrieval augmentation to support general LLMs.
We present a novel framework named BLADE, which enhances Black-box LArge language models with small Domain-spEcific models.
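The general pattern, a small tunable domain model supplying knowledge that conditions a frozen black-box LLM, can be sketched as follows; the interface is hypothetical and does not reflect BLADE's implementation details.

```python
# Sketch of the black-box enhancement pattern the summary describes: a small domain-specific
# model drafts query-relevant domain knowledge, which is injected into the prompt of a frozen
# black-box LLM. The function interface is hypothetical, not the BLADE API.
from typing import Callable

def answer_with_domain_support(
    query: str,
    small_domain_model: Callable[[str], str],
    blackbox_llm: Callable[[str], str],
) -> str:
    # 1) The small domain model produces background knowledge for the query.
    domain_hint = small_domain_model(f"Provide domain knowledge relevant to: {query}")
    # 2) The black-box general LLM answers, conditioned on that knowledge.
    prompt = f"Background:\n{domain_hint}\n\nQuestion: {query}\nAnswer:"
    return blackbox_llm(prompt)
```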
arXiv Detail & Related papers (2024-03-27T08:57:21Z) - LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement [79.31084387589968]
Pretrained large language models (LLMs) are currently state-of-the-art for solving the vast majority of natural language processing tasks.
We propose LLM2LLM, a data augmentation strategy that uses a teacher LLM to enhance a small seed dataset.
We achieve improvements up to 24.2% on the GSM8K dataset, 32.6% on CaseHOLD, 32.0% on SNIPS, 52.6% on TREC and 39.8% on SST-2 over regular fine-tuning in the low-data regime.
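A hedged sketch of the iterative teacher-student loop the summary describes: fine-tune the student on the current data, find examples it still gets wrong, and let the teacher synthesize similar examples to add back. The callables `train_fn`, `eval_fn`, and `augment_fn` are placeholders, not the paper's code.

```python
# Sketch of an iterative teacher-student data augmentation loop in the spirit of the summary.
# `train_fn`, `eval_fn`, and `augment_fn` are placeholder callables supplied by the caller.
from typing import Callable, List, Tuple

def iterative_augmentation(
    seed_data: List[dict],
    student,
    train_fn: Callable,    # fine-tunes the student on a dataset, returns the updated student
    eval_fn: Callable,     # returns True if the student answers an example correctly
    augment_fn: Callable,  # teacher LLM generates a new example similar to a failed one
    rounds: int = 3,
) -> Tuple[object, List[dict]]:
    data = list(seed_data)
    for _ in range(rounds):
        student = train_fn(student, data)
        failures = [ex for ex in data if not eval_fn(student, ex)]
        # The teacher targets the student's current weaknesses with fresh synthetic examples.
        data.extend(augment_fn(ex) for ex in failures)
    return student, data
```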
arXiv Detail & Related papers (2024-03-22T08:57:07Z) - FederatedScope-LLM: A Comprehensive Package for Fine-tuning Large Language Models in Federated Learning [70.38817963253034]
This paper first discusses the challenges of federated fine-tuning of LLMs, and then introduces our package FS-LLM as a main contribution.
We provide comprehensive federated parameter-efficient fine-tuning algorithm implementations and versatile programming interfaces for future extension in FL scenarios.
We conduct extensive experiments to validate the effectiveness of FS-LLM and benchmark advanced LLMs with state-of-the-art parameter-efficient fine-tuning algorithms in FL settings.
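To make "federated parameter-efficient fine-tuning" concrete, here is a generic FedAvg step over client adapter (e.g., LoRA) weights; this is a toy illustration, not the FS-LLM package interface.

```python
# Toy illustration of federated averaging over parameter-efficient adapter weights
# (e.g., LoRA matrices), weighted by each client's local dataset size.
# This is a generic FedAvg step, not the FS-LLM API.
from typing import List
import numpy as np

def fedavg_adapters(client_adapters: List[np.ndarray], client_sizes: List[int]) -> np.ndarray:
    """Return the size-weighted average of the clients' adapter weight matrices."""
    total = float(sum(client_sizes))
    return sum(w * (n / total) for w, n in zip(client_adapters, client_sizes))

# Example: three clients holding 100, 300, and 600 local samples.
adapters = [np.random.randn(8, 8) for _ in range(3)]
global_adapter = fedavg_adapters(adapters, [100, 300, 600])
print(global_adapter.shape)  # (8, 8)
```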
arXiv Detail & Related papers (2023-09-01T09:40:36Z) - Large Language Models as Data Preprocessors [9.99065004972981]
Large Language Models (LLMs) have marked a significant advancement in artificial intelligence.
This study explores their potential in data preprocessing, a critical stage in data mining and analytics applications.
We propose an LLM-based framework for data preprocessing, which integrates cutting-edge prompt engineering techniques.
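A minimal sketch of the kind of LLM-based preprocessing step this summary alludes to, here normalizing a tabular record through a prompt; the prompt template and the `llm` callable are assumptions, not the paper's framework.

```python
# Minimal sketch of an LLM-assisted data-preprocessing step: prompt a model to normalize
# formats and fill obviously missing fields in a tabular record. The prompt template and
# the `llm` callable are assumptions, not the paper's framework.
import json
from typing import Callable, Dict

def clean_record(llm: Callable[[str], str], record: Dict[str, str], schema: Dict[str, str]) -> Dict[str, str]:
    prompt = (
        "You are a data-cleaning assistant. Fix typos, normalize formats, and fill "
        "obviously missing fields.\n"
        f"Schema: {json.dumps(schema)}\n"
        f"Record: {json.dumps(record)}\n"
        "Return only the corrected record as JSON."
    )
    return json.loads(llm(prompt))
```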
arXiv Detail & Related papers (2023-08-30T23:28:43Z)