Enhancing Biomedical Text Summarization and Question-Answering: On the
Utility of Domain-Specific Pre-Training
- URL: http://arxiv.org/abs/2307.04412v1
- Date: Mon, 10 Jul 2023 08:32:45 GMT
- Title: Enhancing Biomedical Text Summarization and Question-Answering: On the
Utility of Domain-Specific Pre-Training
- Authors: Dima Galat, Marian-Andrei Rizoiu
- Abstract summary: We identify a suitable model architecture and use it to show the benefit of general-domain pre-training followed by task-specific fine-tuning.
Our results indicate that a Large Language Model without domain-specific pre-training can have a significant edge in some domain-specific biomedical text generation tasks.
- Score: 10.267057557137665
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Biomedical summarization requires large datasets for training text
generation models. We show that while transfer learning offers a viable option
for addressing this challenge, in-domain pre-training does not always offer
advantages on the BioASQ summarization task. We identify a suitable model
architecture and use it to show the benefit of general-domain pre-training
followed by task-specific fine-tuning on BioASQ summarization, leading to a
novel three-step fine-tuning approach that works with only a thousand in-domain
examples. Our results indicate that a Large Language Model without
domain-specific pre-training can have a significant edge in some
domain-specific biomedical text generation tasks.
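A minimal sketch of what such a staged fine-tuning pipeline could look like is given below, assuming the Hugging Face transformers and datasets libraries; the general-domain checkpoint, the intermediate summarization corpus, and all hyperparameters are illustrative assumptions rather than the paper's exact setup.
```python
# Minimal sketch of staged fine-tuning for abstractive BioASQ-style summarization:
# start from a general-domain seq2seq checkpoint, fine-tune on a large general
# summarization corpus, then fine-tune on ~1,000 in-domain examples.
# Model, dataset, and hyperparameter choices are illustrative assumptions.
from datasets import load_dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

model_name = "facebook/bart-large"  # general-domain pre-trained seq2seq model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

def preprocess(batch, src_key, tgt_key):
    # Tokenize source documents and target summaries.
    inputs = tokenizer(batch[src_key], max_length=1024, truncation=True)
    labels = tokenizer(text_target=batch[tgt_key], max_length=256, truncation=True)
    inputs["labels"] = labels["input_ids"]
    return inputs

def finetune(dataset, src_key, tgt_key, output_dir, epochs):
    tokenized = dataset.map(lambda b: preprocess(b, src_key, tgt_key), batched=True)
    args = Seq2SeqTrainingArguments(output_dir=output_dir,
                                    num_train_epochs=epochs,
                                    per_device_train_batch_size=4,
                                    learning_rate=3e-5,
                                    predict_with_generate=True)
    trainer = Seq2SeqTrainer(model=model, args=args, train_dataset=tokenized,
                             data_collator=DataCollatorForSeq2Seq(tokenizer, model=model))
    trainer.train()

# Step 2: intermediate fine-tuning on a general-domain summarization corpus.
general = load_dataset("cnn_dailymail", "3.0.0", split="train[:2%]")
finetune(general, "article", "highlights", "out/general", epochs=1)

# Step 3: task-specific fine-tuning on ~1,000 in-domain BioASQ examples
# (loading and formatting of the BioASQ data, with hypothetical field names,
# is left out here).
# finetune(bioasq_train, "question_and_snippets", "ideal_answer", "out/bioasq", epochs=3)
```
The same helper is simply called twice, once per stage, so the second call continues training from the weights produced by the first.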
Related papers
- Probabilistic Domain Adaptation for Biomedical Image Segmentation [2.5382095320488665]
We introduce a probabilistic domain adaptation method, building on self-training approaches and the Probabilistic UNet.
We study joint and separate source-target training strategies and evaluate our method on three challenging domain adaptation tasks for biomedical segmentation.
arXiv Detail & Related papers (2023-03-21T12:17:21Z)
- BioBART: Pretraining and Evaluation of A Biomedical Generative Language Model [1.1764594853212893]
In this work, we introduce the generative language model BioBART that adapts BART to the biomedical domain.
We collate various biomedical language generation tasks including dialogue, summarization, entity linking, and named entity recognition.
BioBART, pretrained on PubMed abstracts, achieves enhanced performance compared to BART and sets strong baselines on several tasks.
arXiv Detail & Related papers (2022-04-08T08:07:42Z)
- Slot Filling for Biomedical Information Extraction [0.5330240017302619]
We present a slot filling approach to the task of biomedical IE.
We follow the proposed paradigm of coupling a Transformer-based bi-encoder, Dense Passage Retrieval, with a Transformer-based reader model.
arXiv Detail & Related papers (2021-09-17T14:16:00Z)
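As a rough illustration of the retrieve-then-read paradigm mentioned in this entry, the sketch below wires together the DPR bi-encoder and reader models available in Hugging Face Transformers; the general-domain NQ checkpoints and the toy passages are stand-ins for the biomedical setup described in the paper.
```python
# Rough sketch of a DPR bi-encoder + reader pipeline (retrieve, then read).
# The NQ-trained checkpoints and toy passages are stand-ins, not the paper's setup.
import torch
from transformers import (DPRContextEncoder, DPRContextEncoderTokenizer,
                          DPRQuestionEncoder, DPRQuestionEncoderTokenizer,
                          DPRReader, DPRReaderTokenizer)

q_tok = DPRQuestionEncoderTokenizer.from_pretrained("facebook/dpr-question_encoder-single-nq-base")
q_enc = DPRQuestionEncoder.from_pretrained("facebook/dpr-question_encoder-single-nq-base")
c_tok = DPRContextEncoderTokenizer.from_pretrained("facebook/dpr-ctx_encoder-single-nq-base")
c_enc = DPRContextEncoder.from_pretrained("facebook/dpr-ctx_encoder-single-nq-base")
r_tok = DPRReaderTokenizer.from_pretrained("facebook/dpr-reader-single-nq-base")
reader = DPRReader.from_pretrained("facebook/dpr-reader-single-nq-base")

question = "Which gene is mutated in cystic fibrosis?"
passages = [
    ("CFTR", "Cystic fibrosis is caused by mutations in the CFTR gene."),
    ("Aspirin", "Aspirin is used to reduce pain, fever, or inflammation."),
]

with torch.no_grad():
    # Bi-encoder retrieval: embed question and passages, score by dot product.
    q_emb = q_enc(**q_tok(question, return_tensors="pt")).pooler_output
    c_emb = c_enc(**c_tok([text for _, text in passages], padding=True,
                          return_tensors="pt")).pooler_output
    best = int((q_emb @ c_emb.T).squeeze(0).argmax())

    # Reader: extract an answer span from the best-scoring passage
    # (simplified span selection via independent start/end argmax).
    inputs = r_tok(questions=[question], titles=[passages[best][0]],
                   texts=[passages[best][1]], return_tensors="pt")
    out = reader(**inputs)
    start, end = int(out.start_logits.argmax()), int(out.end_logits.argmax())
    print(r_tok.decode(inputs["input_ids"][0][start:end + 1]))
```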
- SciFive: a text-to-text transformer model for biomedical literature [0.9482369543628087]
We introduce SciFive, a domain-specific T5 model that has been pre-trained on large biomedical corpora.
Our results support the exploration of more difficult text generation tasks and the development of new methods in this area.
arXiv Detail & Related papers (2021-05-28T06:09:23Z)
- FDMT: A Benchmark Dataset for Fine-grained Domain Adaptation in Machine Translation [53.87731008029645]
We present a real-world fine-grained domain adaptation task in machine translation (FDMT).
The FDMT dataset consists of four sub-domains of information technology: autonomous vehicles, AI education, real-time networks, and smartphones.
We conduct quantitative experiments and in-depth analyses in this new setting, benchmarking the fine-grained domain adaptation task.
arXiv Detail & Related papers (2020-12-31T17:15:09Z)
- Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing [73.37262264915739]
We show that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains.
Our experiments show that domain-specific pretraining serves as a solid foundation for a wide range of biomedical NLP tasks.
arXiv Detail & Related papers (2020-07-31T00:04:15Z)
- Pre-training via Paraphrasing [96.79972492585112]
We introduce MARGE, a pre-trained sequence-to-sequence model learned with an unsupervised multi-lingual paraphrasing objective.
We show it is possible to jointly learn to do retrieval and reconstruction, given only a random initialization.
For example, with no additional task-specific training we achieve BLEU scores of up to 35.8 for document translation.
arXiv Detail & Related papers (2020-06-26T14:43:43Z)
- Enabling Language Models to Fill in the Blanks [81.59381915581892]
We present a simple approach for text infilling, the task of predicting missing spans of text at any position in a document.
We train (or fine-tune) off-the-shelf language models on sequences containing the concatenation of artificially-masked text and the text which was masked.
We show that this approach, which we call infilling by language modeling, can enable LMs to infill entire sentences effectively on three different domains: short stories, scientific abstracts, and lyrics.
arXiv Detail & Related papers (2020-05-11T18:00:03Z)
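As a rough sketch of the infilling-by-language-modeling recipe described in this entry, the snippet below masks a span and concatenates the masked text with the masked-out answer so that an ordinary left-to-right language model can be fine-tuned on the result; the special tokens and the single-span masking scheme are simplifications, not the paper's exact format.
```python
# Rough sketch of building infilling training examples: concatenate the
# artificially-masked text with the text that was masked out, so an ordinary
# left-to-right language model can be fine-tuned to predict the missing span.
# The special tokens and span-sampling scheme here are illustrative only.
import random

BLANK, SEP, ANSWER = "[blank]", "[sep]", "[answer]"

def make_infilling_example(text: str, rng: random.Random) -> str:
    words = text.split()
    # Sample one contiguous span to mask (the original approach masks
    # multiple spans at several granularities).
    start = rng.randrange(len(words))
    end = min(len(words), start + rng.randint(1, 3))
    masked = words[:start] + [BLANK] + words[end:]
    answer = " ".join(words[start:end])
    return f"{' '.join(masked)} {SEP} {answer} {ANSWER}"

rng = random.Random(0)
print(make_infilling_example("the patient was treated with antibiotics for ten days", rng))
# e.g. "the patient was [blank] for ten days [sep] treated with antibiotics [answer]"
# Such strings can then be fed to an off-the-shelf LM (e.g. GPT-2) with a
# standard language-modeling objective.
```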
- Don't Stop Pretraining: Adapt Language Models to Domains and Tasks [81.99843216550306]
We present a study across four domains (biomedical and computer science publications, news, and reviews) and eight classification tasks.
A second phase of pretraining in-domain (domain-adaptive pretraining) leads to performance gains.
Adapting to the task's unlabeled data (task-adaptive pretraining) improves performance even after domain-adaptive pretraining.
arXiv Detail & Related papers (2020-04-23T04:21:19Z)
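A minimal sketch of the two-phase recipe in this entry (domain-adaptive, then task-adaptive masked-language-model pretraining) using Hugging Face Transformers is shown below; the base model, corpus file names, and hyperparameters are placeholders.
```python
# Minimal sketch of domain-adaptive then task-adaptive pretraining (DAPT -> TAPT):
# continue masked-language-model training on unlabeled domain text, then on the
# task's own unlabeled text, before the usual supervised fine-tuning.
# Model and file names are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

model_name = "roberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

def continue_pretraining(text_files, output_dir, epochs=1):
    ds = load_dataset("text", data_files=text_files, split="train")
    ds = ds.map(lambda b: tokenizer(b["text"], truncation=True, max_length=512),
                batched=True, remove_columns=["text"])
    collator = DataCollatorForLanguageModeling(tokenizer, mlm=True, mlm_probability=0.15)
    args = TrainingArguments(output_dir=output_dir, num_train_epochs=epochs,
                             per_device_train_batch_size=8, learning_rate=5e-5)
    Trainer(model=model, args=args, train_dataset=ds, data_collator=collator).train()

# Phase 1 (DAPT): a large unlabeled in-domain corpus, e.g. biomedical abstracts.
continue_pretraining("biomed_corpus.txt", "out/dapt")
# Phase 2 (TAPT): the (unlabeled) text of the downstream task's training set.
continue_pretraining("task_unlabeled.txt", "out/tapt")
# The adapted checkpoint is then fine-tuned on the labeled task data as usual.
```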
- Pre-training Text Representations as Meta Learning [113.3361289756749]
We introduce a learning algorithm that directly optimizes the model's ability to learn text representations for effective learning of downstream tasks.
We show that there is an intrinsic connection between multi-task pre-training and model-agnostic meta-learning with a sequence of meta-train steps.
arXiv Detail & Related papers (2020-04-12T09:05:47Z)