Quality > Quantity: Synthetic Corpora from Foundation Models for
Closed-Domain Extractive Question Answering
- URL: http://arxiv.org/abs/2310.16995v1
- Date: Wed, 25 Oct 2023 20:48:16 GMT
- Title: Quality > Quantity: Synthetic Corpora from Foundation Models for
Closed-Domain Extractive Question Answering
- Authors: Saptarshi Sengupta, Connor Heaton, Shreya Ghosh, Preslav Nakov,
Prasenjit Mitra
- Abstract summary: We study extractive question answering within closed domains and introduce the concept of targeted pre-training.
Our proposed framework uses Galactica to generate synthetic, ``targeted'' corpora that align with specific writing styles and topics.
- Score: 35.38140071573828
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Domain adaptation, the process of training a model in one domain and applying
it to another, has been extensively explored in machine learning. While
training a domain-specific foundation model (FM) from scratch is an option,
recent methods have focused on adapting pre-trained FMs for domain-specific
tasks. However, our experiments reveal that neither approach consistently
achieves state-of-the-art (SOTA) results in the target domain. In
this work, we study extractive question answering within closed domains and
introduce the concept of targeted pre-training. This involves determining and
generating relevant data to further pre-train our models, as opposed to the
conventional philosophy of utilizing domain-specific FMs trained on a wide
range of data. Our proposed framework uses Galactica to generate synthetic,
``targeted'' corpora that align with specific writing styles and topics, such
as research papers and radiology reports. This process can be viewed as a form
of knowledge distillation. We apply our method to two biomedical extractive
question answering datasets, COVID-QA and RadQA, achieving a new benchmark on
the former and demonstrating overall improvements on the latter. Code available
at https://github.com/saptarshi059/CDQA-v1-Targetted-PreTraining/tree/main.
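For intuition, here is a minimal sketch of the targeted pre-training pipeline described in the abstract, written against the HuggingFace `transformers` API. The seed prompts, model sizes, and hyperparameters are illustrative assumptions, not the paper's actual settings; see the linked repository for the authors' implementation.

```python
# Hypothetical sketch of "targeted pre-training": (1) prompt Galactica to
# synthesize in-domain passages, (2) continue masked-language-model (MLM)
# pre-training on them before the usual extractive-QA fine-tuning.
import torch
from transformers import (
    AutoTokenizer,
    AutoModelForMaskedLM,
    DataCollatorForLanguageModeling,
    OPTForCausalLM,
    Trainer,
    TrainingArguments,
)

# --- Step 1: generate a synthetic, targeted corpus with Galactica ----------
gen_name = "facebook/galactica-1.3b"  # small variant, for illustration only
gen_tok = AutoTokenizer.from_pretrained(gen_name)
gen_model = OPTForCausalLM.from_pretrained(gen_name)

# Hypothetical seed prompts steering style/topic (e.g., radiology reports).
prompts = [
    "Title: Radiographic findings in COVID-19 pneumonia\n\nAbstract:",
    "FINDINGS: Chest radiograph demonstrates",
]
synthetic_corpus = []
for prompt in prompts:
    input_ids = gen_tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        out = gen_model.generate(
            input_ids, max_new_tokens=256, do_sample=True, top_p=0.9
        )
    synthetic_corpus.append(gen_tok.decode(out[0], skip_special_tokens=True))

# --- Step 2: further pre-train a reader model on the synthetic corpus ------
reader_name = "bert-base-uncased"  # stand-in for the QA encoder
reader_tok = AutoTokenizer.from_pretrained(reader_name)
reader = AutoModelForMaskedLM.from_pretrained(reader_name)

encodings = reader_tok(
    synthetic_corpus, truncation=True, max_length=512, padding="max_length"
)

class CorpusDataset(torch.utils.data.Dataset):
    """Wraps the tokenized synthetic passages for the MLM Trainer."""
    def __init__(self, enc):
        self.enc = enc
    def __len__(self):
        return len(self.enc["input_ids"])
    def __getitem__(self, i):
        return {k: torch.tensor(v[i]) for k, v in self.enc.items()}

trainer = Trainer(
    model=reader,
    args=TrainingArguments(output_dir="targeted-mlm", num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=CorpusDataset(encodings),
    data_collator=DataCollatorForLanguageModeling(reader_tok,
                                                  mlm_probability=0.15),
)
trainer.train()
```

The MLM step adapts the reader to the vocabulary and style of the synthetic corpus; the resulting checkpoint would then be fine-tuned for span-extraction QA on COVID-QA or RadQA in the standard way.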
Related papers
- Adapting to Distribution Shift by Visual Domain Prompt Generation [34.19066857066073]
We adapt a model at test time using a few unlabeled samples to address distribution shifts.
We build a knowledge bank to learn the transferable knowledge from source domains.
The proposed method outperforms previous work on 5 large-scale benchmarks including WILDS and DomainNet.
arXiv Detail & Related papers (2024-05-05T02:44:04Z)
- DG-TTA: Out-of-domain medical image segmentation through Domain Generalization and Test-Time Adaptation [43.842694540544194]
We propose to combine domain generalization and test-time adaptation to create a highly effective approach for reusing pre-trained models in unseen target domains.
We demonstrate that our method, combined with pre-trained whole-body CT models, can effectively segment MR images with high accuracy.
arXiv Detail & Related papers (2023-12-11T10:26:21Z)
- AdAM: Few-Shot Image Generation via Adaptation-Aware Kernel Modulation [71.58154388819887]
Few-shot image generation (FSIG) aims to generate new and diverse images given a few (e.g., 10) training samples.
Recent work has addressed FSIG by leveraging a GAN pre-trained on a large-scale source domain and adapting it to the target domain with few target samples.
We propose Adaptation-Aware kernel Modulation (AdAM) for general FSIG across different degrees of source-target domain proximity.
arXiv Detail & Related papers (2023-07-04T03:56:43Z)
- Improving Domain Generalization with Domain Relations [77.63345406973097]
This paper focuses on domain shifts, which occur when the model is applied to new domains that are different from the ones it was trained on.
We propose a new approach called D$^3$G to learn domain-specific models.
Our results show that D$3$G consistently outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-02-06T08:11:16Z)
- Few-shot Image Generation via Adaptation-Aware Kernel Modulation [33.191479192580275]
Few-shot image generation (FSIG) aims to generate new and diverse samples given an extremely limited number of samples from a domain.
Recent work has addressed the problem using a transfer-learning approach, leveraging a GAN pretrained on a large-scale source-domain dataset.
We propose Adaptation-Aware kernel Modulation (AdAM) to address general FSIG across different degrees of source-target domain proximity.
arXiv Detail & Related papers (2022-10-29T10:26:40Z)
- Meta-DMoE: Adapting to Domain Shift by Meta-Distillation from Mixture-of-Experts [33.21435044949033]
Most existing methods perform training on multiple source domains using a single model.
We propose a novel framework for unsupervised test-time adaptation, which is formulated as a knowledge distillation process.
arXiv Detail & Related papers (2022-10-08T02:28:10Z)
- Prior Knowledge Guided Unsupervised Domain Adaptation [82.9977759320565]
We propose a Knowledge-guided Unsupervised Domain Adaptation (KUDA) setting where prior knowledge about the target class distribution is available.
In particular, we consider two specific types of prior knowledge about the class distribution in the target domain: Unary Bound and Binary Relationship.
We propose a rectification module that uses such prior knowledge to refine model-generated pseudo labels (a simplified sketch of this idea appears after this list).
arXiv Detail & Related papers (2022-07-18T18:41:36Z)
- Cross Domain Few-Shot Learning via Meta Adversarial Training [34.383449283927014]
Few-shot relation classification (RC) is a critical problem in machine learning.
We present a novel model that takes the aforementioned cross-domain situation into consideration.
A meta-based adversarial training framework is proposed to fine-tune the trained networks for adapting to data from the target domain.
arXiv Detail & Related papers (2022-02-11T15:52:29Z)
- Unified Instance and Knowledge Alignment Pretraining for Aspect-based Sentiment Analysis [96.53859361560505]
Aspect-based Sentiment Analysis (ABSA) aims to determine the sentiment polarity towards an aspect.
A severe domain shift always exists between the pretraining and downstream ABSA datasets.
We introduce a unified alignment pretraining framework into the vanilla pretrain-finetune pipeline.
arXiv Detail & Related papers (2021-10-26T04:03:45Z)
- Meta-FDMixup: Cross-Domain Few-Shot Learning Guided by Labeled Target Data [95.47859525676246]
A recent study finds that existing few-shot learning methods, trained on the source domain, fail to generalize to the novel target domain when a domain gap is observed.
In this paper, we realize that the labeled target data in Cross-Domain Few-Shot Learning has not been leveraged in any way to help the learning process.
arXiv Detail & Related papers (2021-07-26T06:15:45Z)
- Source-Free Open Compound Domain Adaptation in Semantic Segmentation [99.82890571842603]
In SF-OCDA, only the source pre-trained model and the target data are available to learn the target model.
We propose the Cross-Patch Style Swap (CPSS) to diversify samples with various patch styles at the feature level.
Our method produces state-of-the-art results on the C-Driving dataset.
arXiv Detail & Related papers (2021-06-07T08:38:41Z)
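To make one of these ideas concrete: below is a minimal, hypothetical sketch of unary-bound pseudo-label rectification in the spirit of the KUDA entry above. KUDA formulates rectification as a constrained optimization over pseudo-label assignments; the greedy demotion used here is a rough simplification for illustration, and all names and numbers are assumptions.

```python
# Simplified sketch of KUDA-style pseudo-label rectification under a
# "unary bound": at most max_frac[k] of target samples may belong to
# class k. Greedy approximation, for illustration only.
import numpy as np

def rectify_pseudo_labels(probs: np.ndarray, max_frac: dict) -> np.ndarray:
    """probs: (n_samples, n_classes) softmax outputs on unlabeled target data.
    max_frac: {class_index: upper bound on the fraction of samples}.
    Returns rectified hard pseudo labels."""
    labels = probs.argmax(axis=1)
    n = len(labels)
    for k, frac in max_frac.items():
        budget = int(frac * n)
        idx = np.where(labels == k)[0]
        if len(idx) <= budget:
            continue
        # Keep the `budget` most confident predictions for class k;
        # reassign the rest to their best non-k class.
        order = idx[np.argsort(-probs[idx, k])]
        for i in order[budget:]:
            p = probs[i].copy()
            p[k] = -np.inf
            labels[i] = int(p.argmax())
    return labels

# Example: no more than 30% of samples may be labeled class 0.
probs = np.array([[0.9, 0.1], [0.8, 0.2], [0.6, 0.4], [0.55, 0.45]])
print(rectify_pseudo_labels(probs, {0: 0.3}))  # -> [0 1 1 1]
```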