Leveraging External Knowledge Resources to Enable Domain-Specific
Comprehension
- URL: http://arxiv.org/abs/2401.07977v1
- Date: Mon, 15 Jan 2024 21:43:46 GMT
- Title: Leveraging External Knowledge Resources to Enable Domain-Specific
Comprehension
- Authors: Saptarshi Sengupta, Connor Heaton, Prasenjit Mitra, Soumalya Sarkar
- Abstract summary: Machine Reading Comprehension (MRC) has been a long-standing problem in NLP.
When BERT variants trained on general text corpora are applied to domain-specific text, their performance degrades due to domain shift.
We introduce a method using Multi-Layer Perceptrons (MLPs) for aligning and integrating embeddings extracted from knowledge graphs with the embedding spaces of pre-trained language models.
- Score: 4.3905207721537804
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine Reading Comprehension (MRC) has been a long-standing problem in NLP
and, with the recent introduction of the BERT family of transformer-based language
models, it has come a long way toward being solved. However, when BERT variants
trained on general text corpora are applied to domain-specific text, their
performance inevitably degrades on account of domain shift, i.e., the
genre/subject-matter discrepancy between the training and downstream application
data. Knowledge graphs act as reservoirs of open- or closed-domain information,
and prior studies have shown that they can be used to improve the performance of
general-purpose transformers in domain-specific applications. Building on existing
work, we introduce a method that uses Multi-Layer Perceptrons (MLPs) to align and
integrate embeddings extracted from knowledge graphs with the embedding spaces of
pre-trained language models (LMs). We fuse the aligned embeddings with the
open-domain LMs BERT and RoBERTa and fine-tune them for two MRC tasks, namely span
detection (COVID-QA) and multiple-choice question answering (PubMedQA). On the
COVID-QA dataset, our approach allows these models to perform on par with their
domain-specific counterparts, Bio/Sci-BERT, as evidenced by the Exact Match (EM)
metric. On PubMedQA, we observe an overall improvement in accuracy, while F1
remains roughly on par with the domain-specific models.
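As a rough illustration of the mechanism the abstract describes (an MLP aligning knowledge-graph embeddings with an LM's embedding space before fusion), the following PyTorch sketch may help. It is not the authors' code: the names KGAlignmentMLP and fuse_embeddings, the additive fusion, the 200-dimensional KG vectors, and the entity-linking mask are illustrative assumptions; the paper's exact alignment training, fusion operator, and KG embedding source may differ.

```python
# Minimal, illustrative sketch of MLP-based KG/LM embedding alignment and fusion.
# NOTE: KGAlignmentMLP, fuse_embeddings, the additive fusion, the 200-dim KG
# vectors, and the entity mask are assumptions for illustration only.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class KGAlignmentMLP(nn.Module):
    """MLP that projects knowledge-graph entity embeddings into the LM embedding space."""

    def __init__(self, kg_dim: int, lm_dim: int, hidden_dim: int = 512):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(kg_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, lm_dim),
        )

    def forward(self, kg_embeddings: torch.Tensor) -> torch.Tensor:
        # (batch, seq_len, kg_dim) -> (batch, seq_len, lm_dim)
        return self.mlp(kg_embeddings)


def fuse_embeddings(token_embeds, aligned_kg_embeds, entity_mask):
    """Add aligned KG embeddings to token embeddings at entity-linked positions.

    entity_mask is 1 where a token is linked to a KG entity, 0 elsewhere (assumed
    to come from an external entity linker, which is not shown here).
    """
    return token_embeds + entity_mask.unsqueeze(-1) * aligned_kg_embeds


# Fuse before handing inputs_embeds to an open-domain LM such as BERT.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
lm = AutoModel.from_pretrained("bert-base-uncased")
aligner = KGAlignmentMLP(kg_dim=200, lm_dim=lm.config.hidden_size)

batch = tokenizer(["What receptor does SARS-CoV-2 bind to?"], return_tensors="pt")
token_embeds = lm.embeddings.word_embeddings(batch["input_ids"])
seq_len = token_embeds.size(1)

kg_vectors = torch.randn(1, seq_len, 200)   # placeholder per-token KG embeddings
entity_mask = torch.zeros(1, seq_len)       # placeholder entity-linking mask

fused = fuse_embeddings(token_embeds, aligner(kg_vectors), entity_mask)
outputs = lm(inputs_embeds=fused, attention_mask=batch["attention_mask"])
```

In the paper, the fused models are then fine-tuned end to end for span detection (COVID-QA) and multiple-choice QA (PubMedQA); the sketch above only covers the alignment-and-fusion step.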
Related papers
- To Generate or to Retrieve? On the Effectiveness of Artificial Contexts for Medical Open-Domain Question Answering [18.226545754007972]
This paper presents MedGENIE, the first generate-then-read framework for multiple-choice question answering in medicine.
We conduct extensive experiments on MedQA-USMLE, MedMCQA, and MMLU, incorporating a practical perspective by assuming a maximum of 24GB VRAM.
Our findings reveal that generated passages are more effective than retrieved ones in attaining higher accuracy.
arXiv Detail & Related papers (2024-03-04T10:41:52Z)
- DG-TTA: Out-of-domain medical image segmentation through Domain Generalization and Test-Time Adaptation [43.842694540544194]
We propose to combine domain generalization and test-time adaptation to create a highly effective approach for reusing pre-trained models in unseen target domains.
We demonstrate that our method, combined with pre-trained whole-body CT models, can effectively segment MR images with high accuracy.
arXiv Detail & Related papers (2023-12-11T10:26:21Z)
- Enhancing Medical Specialty Assignment to Patients using NLP Techniques [0.0]
We propose an alternative approach that achieves superior performance while being computationally efficient.
Specifically, we utilize keywords to train a deep learning architecture that outperforms a language model pretrained on a large corpus of text.
Our results demonstrate that utilizing keywords for text classification significantly improves classification performance.
arXiv Detail & Related papers (2023-12-09T14:13:45Z)
- A Self-enhancement Approach for Domain-specific Chatbot Training via Knowledge Mining and Digest [62.63606958140248]
Large Language Models (LLMs) often encounter challenges when dealing with intricate and knowledge-demanding queries in specific domains.
This paper introduces a novel approach to enhance LLMs by effectively extracting the relevant knowledge from domain-specific textual sources.
We train a knowledge miner, namely LLMiner, which autonomously extracts Question-Answer pairs from relevant documents.
arXiv Detail & Related papers (2023-11-17T16:09:10Z)
- Quality > Quantity: Synthetic Corpora from Foundation Models for Closed-Domain Extractive Question Answering [35.38140071573828]
We study extractive question answering within closed domains and introduce the concept of targeted pre-training.
Our proposed framework uses Galactica to generate synthetic "targeted" corpora that align with specific writing styles and topics.
arXiv Detail & Related papers (2023-10-25T20:48:16Z)
- Open-Ended Medical Visual Question Answering Through Prefix Tuning of Language Models [42.360431316298204]
We focus on open-ended VQA and, motivated by recent advances in language models, treat it as a generative task.
To properly communicate the medical images to the language model, we develop a network that maps the extracted visual features to a set of learnable tokens.
We evaluate our approach on the prime medical VQA benchmarks, namely, Slake, OVQA and PathVQA.
arXiv Detail & Related papers (2023-03-10T15:17:22Z)
- Language Models sounds the Death Knell of Knowledge Graphs [0.0]
Deep learning-based NLP, especially Large Language Models (LLMs), has found broad acceptance and is used extensively in many applications.
BioBERT and Med-BERT are language models pre-trained for the healthcare domain.
This paper argues that using Knowledge Graphs is not the best solution for solving problems in this domain.
arXiv Detail & Related papers (2023-01-10T14:20:15Z)
- HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression [53.90578309960526]
Large pre-trained language models (PLMs) have shown overwhelming performance compared with traditional neural network methods.
We propose a hierarchical relational knowledge distillation (HRKD) method to capture both hierarchical and domain relational information.
arXiv Detail & Related papers (2021-10-16T11:23:02Z)
- Open Domain Question Answering over Virtual Documents: A Unified Approach for Data and Text [62.489652395307914]
We use the data-to-text method as a means of encoding structured knowledge for knowledge-intensive applications, i.e., open-domain question answering (QA).
Specifically, we propose a verbalizer-retriever-reader framework for open-domain QA over data and text where verbalized tables from Wikipedia and triples from Wikidata are used as augmented knowledge sources.
We show that our Unified Data and Text QA, UDT-QA, can effectively benefit from the expanded knowledge index, leading to large gains over text-only baselines.
arXiv Detail & Related papers (2021-10-16T00:11:21Z)
- CMT in TREC-COVID Round 2: Mitigating the Generalization Gaps from Web to Special Domain Search [89.48123965553098]
This paper presents a search system to alleviate the special-domain adaptation problem.
The system utilizes domain-adaptive pretraining and few-shot learning to help neural rankers mitigate the domain discrepancy.
Our system performs the best among the non-manual runs in Round 2 of the TREC-COVID task.
arXiv Detail & Related papers (2020-11-03T09:10:48Z)
- Low-Resource Domain Adaptation for Compositional Task-Oriented Semantic Parsing [85.35582118010608]
Task-oriented semantic parsing is a critical component of virtual assistants.
Recent advances in deep learning have enabled several approaches to successfully parse more complex queries.
We propose a novel method that outperforms a supervised neural model at a 10-fold data reduction.
arXiv Detail & Related papers (2020-10-07T17:47:53Z)
- Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing [73.37262264915739]
We show that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains.
Our experiments show that domain-specific pretraining serves as a solid foundation for a wide range of biomedical NLP tasks.
arXiv Detail & Related papers (2020-07-31T00:04:15Z)
- Learning Contextualized Document Representations for Healthcare Answer Retrieval [68.02029435111193]
Contextual Discourse Vectors (CDV) is a distributed document representation for efficient answer retrieval from long documents.
Our model leverages a dual encoder architecture with hierarchical LSTM layers and multi-task training to encode the position of clinical entities and aspects alongside the document discourse.
We show that our generalized model significantly outperforms several state-of-the-art baselines for healthcare passage ranking.
arXiv Detail & Related papers (2020-02-03T15:47:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.