Exploring Language Model Generalization in Low-Resource Extractive QA
- URL: http://arxiv.org/abs/2409.18446v1
- Date: Fri, 27 Sep 2024 05:06:43 GMT
- Title: Exploring Language Model Generalization in Low-Resource Extractive QA
- Authors: Saptarshi Sengupta, Wenpeng Yin, Preslav Nakov, Shreya Ghosh, Suhang Wang
- Abstract summary: We investigate Extractive Question Answering (EQA) with Large Language Models (LLMs) under domain drift.
We devise a series of experiments to empirically explain the performance gap.
- Score: 57.14068405860034
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we investigate Extractive Question Answering (EQA) with Large Language Models (LLMs) under domain drift, i.e., can LLMs generalize well to closed domains that require specific knowledge, such as medicine and law, in a zero-shot fashion without additional in-domain training? To this end, we devise a series of experiments to empirically explain the performance gap. Our findings suggest that: a) LLMs struggle with the demands of closed-domain datasets, such as retrieving long answer spans; b) certain LLMs, despite strong overall performance, show weaknesses in meeting basic requirements, such as discriminating between domain-specific senses of words, which we link to pre-processing decisions; c) scaling model parameters is not always effective for cross-domain generalization; and d) closed-domain datasets are quantitatively very different from open-domain EQA datasets, and current LLMs struggle to deal with them. Our findings point to important directions for improving existing LLMs.
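Extractive QA under domain drift is typically scored with SQuAD-style Exact Match and token-level F1 between the predicted and gold answer spans. A minimal sketch of those two metrics follows; the example answer strings are invented for illustration, not taken from the paper.

```python
# Minimal sketch of the SQuAD-style metrics standard for extractive QA:
# Exact Match (EM) and token-level F1 between predicted and gold spans.
from collections import Counter
import re
import string


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> int:
    return int(normalize(prediction) == normalize(gold))


def f1_score(prediction: str, gold: str) -> float:
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


# A truncated prediction against a long gold span keeps partial F1 credit
# but scores zero EM, which is one way the long-span weakness shows up.
print(exact_match("mitral valve", "the mitral valve"))    # 1 (articles stripped)
print(f1_score("mitral valve", "mitral valve prolapse"))  # 0.8
```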
Related papers
- Do LLMs Understand Ambiguity in Text? A Case Study in Open-world Question Answering [15.342415325821063]
Ambiguity in natural language poses significant challenges to Large Language Models (LLMs) used for open-domain question answering.
We compare off-the-shelf and few-shot LLM performance, focusing on measuring the impact of explicit disambiguation strategies.
We demonstrate how simple, training-free, token-level disambiguation methods can be used effectively to improve LLM performance on ambiguous question answering tasks (one plausible variant is sketched below).
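The summary does not specify the disambiguation method, so the following is only a hypothetical sketch of a training-free, token-level approach: annotate ambiguous question tokens with WordNet glosses before querying the model. `ask_llm` is a made-up stand-in for whatever completion API is in use.

```python
# Hypothetical sketch of one training-free, token-level disambiguation step:
# annotate ambiguous question tokens with WordNet glosses before querying the
# LLM. The paper's actual method may differ from this illustration.
from nltk.corpus import wordnet  # requires a one-time nltk.download("wordnet")


def annotate_ambiguous_tokens(question: str, max_senses: int = 2) -> str:
    notes = []
    for token in question.split():
        synsets = wordnet.synsets(token)
        if len(synsets) > 1:  # more than one sense -> token is ambiguous
            glosses = "; ".join(s.definition() for s in synsets[:max_senses])
            notes.append(f"'{token}' may mean: {glosses}")
    if not notes:
        return question
    return question + "\n(Possible senses: " + " | ".join(notes) + ")"


prompt = annotate_ambiguous_tokens("Where is the nearest bank?")
# answer = ask_llm(prompt)  # ask_llm is a stand-in for any completion API
```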
arXiv Detail & Related papers (2024-11-19T10:27:26Z) - BLADE: Enhancing Black-box Large Language Models with Small Domain-Specific Models [56.89958793648104]
Large Language Models (LLMs) are versatile and capable of addressing a diverse range of tasks.
Previous approaches either conduct continuous pre-training with domain-specific data or employ retrieval augmentation to support general LLMs.
We present a novel framework named BLADE, which enhances Black-box LArge language models with small Domain-spEcific models.
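As summarized, BLADE pairs a small, tunable domain-specific model with a frozen black-box LLM. The sketch below shows one plausible reading of that interface, with both model calls as hypothetical stubs; the paper's actual training procedure and prompt format may differ.

```python
# Illustrative reading of the BLADE setup described above: a small tunable
# domain model drafts background knowledge that is spliced into the prompt
# of a frozen, API-only LLM. Both calls are hypothetical stubs.

def small_domain_model(query: str) -> str:
    """Stand-in for a small model adapted to the target domain."""
    return "Domain background relevant to: " + query  # placeholder output


def blackbox_llm(prompt: str) -> str:
    """Stand-in for a black-box LLM reachable only through an API."""
    return "answer"  # placeholder output


def blade_answer(query: str) -> str:
    knowledge = small_domain_model(query)          # small model supplies knowledge
    prompt = (
        f"Background knowledge:\n{knowledge}\n\n"  # knowledge injected into prompt
        f"Question: {query}\nAnswer:"
    )
    return blackbox_llm(prompt)                    # frozen LLM produces the answer


print(blade_answer("Which statute governs data retention?"))
```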
arXiv Detail & Related papers (2024-03-27T08:57:21Z) - General LLMs as Instructors for Domain-Specific LLMs: A Sequential Fusion Method to Integrate Extraction and Editing [12.017822691367705]
We introduce a Sequential Fusion method to integrate knowledge from complex contexts into Large Language Models (LLMs).
Using our method, domain-specific LLMs achieved a 71.7% accuracy (an average gain of 39.1%) in question-answering tasks.
These findings underscore the effectiveness and flexibility of our approach in FDoR-UL across various domains.
arXiv Detail & Related papers (2024-03-23T06:03:36Z) - PANDA: Preference Adaptation for Enhancing Domain-Specific Abilities of LLMs [49.32067576992511]
Large language models often fall short of the performance achieved by domain-specific state-of-the-art models.
One potential approach to enhancing the domain-specific capabilities of LLMs is fine-tuning them on corresponding datasets.
We propose Preference Adaptation for Enhancing Domain-specific Abilities of LLMs (PANDA).
Our experimental results reveal that PANDA significantly enhances the domain-specific ability of LLMs on text classification and interactive decision tasks.
arXiv Detail & Related papers (2024-02-20T09:02:55Z) - On the Out-Of-Distribution Generalization of Multimodal Large Language Models [24.431960338495184]
We investigate the generalization boundaries of current Multimodal Large Language Models (MLLMs).
We evaluate their zero-shot generalization across synthetic images, real-world distributional shifts, and specialized datasets like medical and molecular imagery.
We show that in-context learning can significantly enhance MLLMs' generalization, opening new avenues for overcoming generalization barriers.
arXiv Detail & Related papers (2024-02-09T18:21:51Z) - Adapt in Contexts: Retrieval-Augmented Domain Adaptation via In-Context Learning [48.22913073217633]
Large language models (LLMs) have showcased a remarkable capability for few-shot inference known as in-context learning.
In this paper, we study the unsupervised domain adaptation (UDA) problem under an in-context learning setting, adapting language models from the source domain to the target domain without any target labels.
We devise different prompting and training strategies, accounting for different LM architectures, to learn the target distribution via language modeling.
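The summary suggests demonstrations are drawn from the labeled source domain to condition the model on unlabeled target inputs. A simplified sketch of that retrieval-augmented in-context setup follows; TF-IDF retrieval and the toy sentiment data are assumptions, not the paper's actual retriever or datasets.

```python
# Simplified sketch of retrieval-augmented in-context adaptation: for each
# unlabeled target-domain input, retrieve the most similar labeled
# source-domain examples and use them as in-context demonstrations.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

source_examples = [  # labeled source-domain data (invented for illustration)
    ("The movie was fantastic.", "positive"),
    ("Terrible pacing and a weak plot.", "negative"),
    ("A heartfelt, memorable film.", "positive"),
]


def build_icl_prompt(target_text: str, k: int = 2) -> str:
    texts = [t for t, _ in source_examples]
    vectorizer = TfidfVectorizer().fit(texts + [target_text])
    sims = cosine_similarity(
        vectorizer.transform([target_text]), vectorizer.transform(texts)
    )[0]
    top = sims.argsort()[::-1][:k]  # indices of the k nearest source examples
    demos = "\n".join(
        f"Text: {texts[i]}\nLabel: {source_examples[i][1]}" for i in top
    )
    return f"{demos}\nText: {target_text}\nLabel:"


print(build_icl_prompt("The device feels unreliable and slow."))
```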
arXiv Detail & Related papers (2023-11-20T06:06:20Z) - Knowledge Plugins: Enhancing Large Language Models for Domain-Specific Recommendations [50.81844184210381]
We propose DOKE, a general paradigm that augments large language models with DOmain-specific KnowledgE to enhance their performance on practical applications.
This paradigm relies on a domain knowledge extractor, working in three steps: 1) preparing effective knowledge for the task; 2) selecting the knowledge for each specific sample; and 3) expressing the knowledge in an LLM-understandable way.
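The three steps named above map naturally onto a small pipeline. The sketch below is schematic only: every function is a hypothetical stub, and the naive token-overlap selector in step 2 merely stands in for whatever selection method the paper defines.

```python
# Schematic sketch of the three DOKE steps named above. Every function is a
# hypothetical stub; the paper defines the concrete extractor per task.

def prepare_knowledge(task: str) -> list[str]:
    """Step 1: gather candidate domain knowledge for the task (stub)."""
    return ["fact A about the domain", "fact B about the domain"]


def select_knowledge(sample: str, pool: list[str], k: int = 1) -> list[str]:
    """Step 2: pick the pieces relevant to this sample. Naive token-overlap
    scoring stands in for whatever selector the paper actually uses."""
    overlap = lambda fact: len(set(fact.split()) & set(sample.split()))
    return sorted(pool, key=overlap, reverse=True)[:k]


def express_knowledge(sample: str, facts: list[str]) -> str:
    """Step 3: verbalize the knowledge so a frozen LLM can consume it."""
    bullets = "\n".join(f"- {fact}" for fact in facts)
    return f"Known facts:\n{bullets}\n\nTask input: {sample}"


pool = prepare_knowledge("recommendation")
sample = "user asks about the domain of jazz records"
print(express_knowledge(sample, select_knowledge(sample, pool)))
```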
arXiv Detail & Related papers (2023-11-16T07:09:38Z) - Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey [100.24095818099522]
Large language models (LLMs) have significantly advanced the field of natural language processing (NLP).
They provide a highly useful, task-agnostic foundation for a wide range of applications.
However, directly applying LLMs to solve sophisticated problems in specific domains faces many hurdles.
arXiv Detail & Related papers (2023-05-30T03:00:30Z) - Empower Large Language Model to Perform Better on Industrial Domain-Specific Question Answering [36.31193273252256]
Large Language Models (LLMs) have gained popularity and achieved remarkable results in open-domain tasks.
However, their performance in real industrial domain-specific scenarios is only average due to their lack of specific domain knowledge.
We provide a benchmark Question Answering (QA) dataset named MSQA, centered around Microsoft products and IT technical problems encountered by customers.
arXiv Detail & Related papers (2023-05-19T09:23:25Z)