Related papers: Leveraging External Knowledge Resources to Enable Domain-Specific Comprehension

URL: http://arxiv.org/abs/2401.07977v1
Date: Mon, 15 Jan 2024 21:43:46 GMT
Title: Leveraging External Knowledge Resources to Enable Domain-Specific Comprehension
Authors: Saptarshi Sengupta, Connor Heaton, Prasenjit Mitra, Soumalya Sarkar
Abstract summary: Machine Reading Comprehension (MRC) has been a long-standing problem in NLP. When BERT variants trained on general text corpora are applied to domain-specific text, their performance degrades. We introduce a method using Multi-Layer Perceptrons (MLPs) for aligning and integrating embeddings extracted from knowledge graphs with the embedding spaces of pre-trained language models.
Score: 4.3905207721537804
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Machine Reading Comprehension (MRC) has been a long-standing problem in NLP and, with the recent introduction of the BERT family of transformer-based language models, it has come a long way toward being solved. Unfortunately, however, when BERT variants trained on general text corpora are applied to domain-specific text, their performance inevitably degrades on account of the domain shift, i.e., the genre/subject matter discrepancy between the training and downstream application data. Knowledge graphs act as reservoirs for either open or closed domain information, and prior studies have shown that they can be used to improve the performance of general-purpose transformers in domain-specific applications. Building on existing work, we introduce a method using Multi-Layer Perceptrons (MLPs) for aligning and integrating embeddings extracted from knowledge graphs with the embedding spaces of pre-trained language models (LMs). We fuse the aligned embeddings with the open-domain LMs BERT and RoBERTa, and fine-tune them for two MRC tasks, namely span detection (COVID-QA) and multiple-choice questions (PubMedQA). On the COVID-QA dataset, we see that our approach allows these models to perform similarly to their domain-specific counterparts, Bio/Sci-BERT, as evidenced by the Exact Match (EM) metric. With regard to PubMedQA, we observe an overall improvement in accuracy while the F1 stays relatively the same over the domain-specific models.
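
Illustration (not from the paper): the abstract says an MLP aligns knowledge-graph embeddings with the LM embedding space and that the aligned embeddings are then fused with the LM. The abstract does not give the architecture, training objective, or fusion operator, so the following is only a minimal sketch under stated assumptions: a two-layer MLP with ReLU, fixed stand-in weights instead of learned ones, element-wise addition as the fusion step, and toy dimensions. All names (AlignmentMLP, fuse, KG_DIM, HIDDEN_DIM, LM_DIM) are hypothetical.

```python
def linear(vec, weights, bias):
    """Compute weights @ vec + bias using plain Python lists."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def relu(vec):
    """Element-wise ReLU."""
    return [max(0.0, x) for x in vec]


def stand_in_weights(rows, cols):
    """Fixed, deterministic weights standing in for learned parameters."""
    return [[0.01 * (((i * cols + j) % 7) - 3) for j in range(cols)]
            for i in range(rows)]


class AlignmentMLP:
    """Hypothetical two-layer MLP mapping KG-embedding space to LM-embedding space."""

    def __init__(self, kg_dim, hidden_dim, lm_dim):
        self.w1 = stand_in_weights(hidden_dim, kg_dim)
        self.b1 = [0.0] * hidden_dim
        self.w2 = stand_in_weights(lm_dim, hidden_dim)
        self.b2 = [0.0] * lm_dim

    def __call__(self, kg_vec):
        hidden = relu(linear(kg_vec, self.w1, self.b1))
        return linear(hidden, self.w2, self.b2)


def fuse(token_embeddings, kg_embeddings, mlp):
    """Assumed fusion: add the projected KG vector to each token that has a
    linked entity. kg_embeddings[i] is None when token i has no entity link."""
    fused = []
    for tok, kg in zip(token_embeddings, kg_embeddings):
        if kg is None:
            fused.append(list(tok))  # unlinked tokens pass through unchanged
        else:
            fused.append([t + a for t, a in zip(tok, mlp(kg))])
    return fused


# Toy dimensions; real LM embeddings are far larger (e.g. 768 for BERT-base).
KG_DIM, HIDDEN_DIM, LM_DIM = 4, 8, 6
mlp = AlignmentMLP(KG_DIM, HIDDEN_DIM, LM_DIM)

token_embeddings = [[0.1] * LM_DIM, [0.2] * LM_DIM, [0.3] * LM_DIM]
kg_embeddings = [None, [1.0, 0.5, -0.5, 2.0], None]  # only token 1 is linked

fused = fuse(token_embeddings, kg_embeddings, mlp)
```

In this sketch, fused has the same shape as token_embeddings, so it could in principle be passed to a transformer encoder in place of the original input embeddings. How and where the paper actually injects the aligned vectors is not stated in the abstract.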

Related papers

To Generate or to Retrieve? On the Effectiveness of Artificial Contexts for Medical Open-Domain Question Answering [18.226545754007972]
This paper presents MedGENIE, the first generate-then-read framework for multiple-choice question answering in medicine. We conduct extensive experiments on MedQA-USMLE, MedMCQA, and MMLU, incorporating a practical perspective by assuming a maximum of 24GB VRAM. Our findings reveal that generated passages are more effective than retrieved ones in attaining higher accuracy.
arXiv Detail & Related papers (2024-03-04T10:41:52Z)
DG-TTA: Out-of-domain medical image segmentation through Domain Generalization and Test-Time Adaptation [43.842694540544194]
We propose to combine domain generalization and test-time adaptation to create a highly effective approach for reusing pre-trained models in unseen target domains. We demonstrate that our method, combined with pre-trained whole-body CT models, can effectively segment MR images with high accuracy.
arXiv Detail & Related papers (2023-12-11T10:26:21Z)
Enhancing Medical Specialty Assignment to Patients using NLP Techniques [0.0]
We propose an alternative approach that achieves superior performance while being computationally efficient. Specifically, we utilize keywords to train a deep learning architecture that outperforms a language model pretrained on a large corpus of text. Our results demonstrate that utilizing keywords for text classification significantly improves classification performance.
arXiv Detail & Related papers (2023-12-09T14:13:45Z)
A Self-enhancement Approach for Domain-specific Chatbot Training via Knowledge Mining and Digest [62.63606958140248]
Large Language Models (LLMs) often encounter challenges when dealing with intricate and knowledge-demanding queries in specific domains. This paper introduces a novel approach to enhance LLMs by effectively extracting the relevant knowledge from domain-specific textual sources. We train a knowledge miner, namely LLMiner, which autonomously extracts Question-Answer pairs from relevant documents.
arXiv Detail & Related papers (2023-11-17T16:09:10Z)
Quality > Quantity: Synthetic Corpora from Foundation Models for Closed-Domain Extractive Question Answering [35.38140071573828]
We study extractive question answering within closed domains and introduce the concept of targeted pre-training. Our proposed framework uses Galactica to generate synthetic, "targeted" corpora that align with specific writing styles and topics.
arXiv Detail & Related papers (2023-10-25T20:48:16Z)
Open-Ended Medical Visual Question Answering Through Prefix Tuning of Language Models [42.360431316298204]
We focus on open-ended VQA and, motivated by the recent advances in language models, consider it as a generative task. To properly communicate the medical images to the language model, we develop a network that maps the extracted visual features to a set of learnable tokens. We evaluate our approach on the prime medical VQA benchmarks, namely, Slake, OVQA and PathVQA.
arXiv Detail & Related papers (2023-03-10T15:17:22Z)
Language Models sounds the Death Knell of Knowledge Graphs [0.0]
Deep Learning-based NLP, especially Large Language Models (LLMs), has found broad acceptance and is used extensively for many applications. BioBERT and Med-BERT are language models pre-trained for the healthcare domain. This paper argues that using Knowledge Graphs is not the best solution for solving problems in this domain.
arXiv Detail & Related papers (2023-01-10T14:20:15Z)
HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression [53.90578309960526]
Large pre-trained language models (PLMs) have shown overwhelming performances compared with traditional neural network methods. We propose a hierarchical relational knowledge distillation (HRKD) method to capture both hierarchical and domain relational information.
arXiv Detail & Related papers (2021-10-16T11:23:02Z)
Open Domain Question Answering over Virtual Documents: A Unified Approach for Data and Text [62.489652395307914]
We use the data-to-text method as a means for encoding structured knowledge for knowledge-intensive applications, i.e., open-domain question answering (QA). Specifically, we propose a verbalizer-retriever-reader framework for open-domain QA over data and text where verbalized tables from Wikipedia and triples from Wikidata are used as augmented knowledge sources. We show that our Unified Data and Text QA, UDT-QA, can effectively benefit from the expanded knowledge index, leading to large gains over text-only baselines.
arXiv Detail & Related papers (2021-10-16T00:11:21Z)
CMT in TREC-COVID Round 2: Mitigating the Generalization Gaps from Web to Special Domain Search [89.48123965553098]
This paper presents a search system to alleviate the special domain adaption problem. The system utilizes the domain-adaptive pretraining and few-shot learning technologies to help neural rankers mitigate the domain discrepancy. Our system performs the best among the non-manual runs in Round 2 of the TREC-COVID task.
arXiv Detail & Related papers (2020-11-03T09:10:48Z)
Low-Resource Domain Adaptation for Compositional Task-Oriented Semantic Parsing [85.35582118010608]
Task-oriented semantic parsing is a critical component of virtual assistants. Recent advances in deep learning have enabled several approaches to successfully parse more complex queries. We propose a novel method that outperforms a supervised neural model at a 10-fold data reduction.
arXiv Detail & Related papers (2020-10-07T17:47:53Z)
Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing [73.37262264915739]
We show that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains. Our experiments show that domain-specific pretraining serves as a solid foundation for a wide range of biomedical NLP tasks.
arXiv Detail & Related papers (2020-07-31T00:04:15Z)
Learning Contextualized Document Representations for Healthcare Answer Retrieval [68.02029435111193]
Contextual Discourse Vectors (CDV) is a distributed document representation for efficient answer retrieval from long documents. Our model leverages a dual encoder architecture with hierarchical LSTM layers and multi-task training to encode the position of clinical entities and aspects alongside the document discourse. We show that our generalized model significantly outperforms several state-of-the-art baselines for healthcare passage ranking.
arXiv Detail & Related papers (2020-02-03T15:47:19Z)

This list is automatically generated from the titles and abstracts of the papers on this site.