Towards Efficient Methods in Medical Question Answering using Knowledge Graph Embeddings
- URL: http://arxiv.org/abs/2401.07977v3
- Date: Fri, 13 Dec 2024 05:19:47 GMT
- Title: Towards Efficient Methods in Medical Question Answering using Knowledge Graph Embeddings
- Authors: Saptarshi Sengupta, Connor Heaton, Suhan Cui, Soumalya Sarkar, Prasenjit Mitra,
- Abstract summary: In Natural Language Processing (NLP), Machine Reading Comprehension (MRC) is the task of answering a question based on a given context.
To handle questions in the medical domain, modern language models such as BioBERT, SciBERT and even ChatGPT are trained on vast amounts of in-domain medical corpora.
We propose a resource-efficient approach for injecting domain knowledge into a model without relying on such domain-specific pre-training.
- Score: 3.944219308229571
- Abstract: In Natural Language Processing (NLP), Machine Reading Comprehension (MRC) is the task of answering a question based on a given context. To handle questions in the medical domain, modern language models such as BioBERT, SciBERT and even ChatGPT are trained on vast amounts of in-domain medical corpora. However, in-domain pre-training is expensive in terms of time and resources. In this paper, we propose a resource-efficient approach for injecting domain knowledge into a model without relying on such domain-specific pre-training. Knowledge graphs are powerful resources for accessing medical information. Building on existing work, we introduce a method using Multi-Layer Perceptrons (MLPs) for aligning and integrating embeddings extracted from medical knowledge graphs with the embedding spaces of pre-trained language models (LMs). The aligned embeddings are fused with open-domain LMs BERT and RoBERTa that are fine-tuned for two MRC tasks, span detection (COVID-QA) and multiple-choice questions (PubMedQA). We compare our method to prior techniques that rely on a vocabulary overlap for embedding alignment and show how our method circumvents this requirement to deliver better performance. On both datasets, our method allows BERT/RoBERTa to either perform on par (occasionally exceeding) with stronger domain-specific models or show improvements in general over prior techniques. With the proposed approach, we signal an alternative method to in-domain pre-training to achieve domain proficiency. Our code is available here.
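To make the approach concrete, below is a minimal sketch (in PyTorch) of the kind of pipeline the abstract describes: an MLP aligns knowledge-graph entity embeddings to a pre-trained LM's hidden space, and the aligned vectors are fused with the LM's token embeddings before encoding. The dimensions, the additive fusion rule, the entity-token positions, and names such as KGAlignmentMLP are illustrative assumptions, not the authors' released implementation.
```python
# Hedged sketch: align KG entity embeddings to an LM's embedding space with an
# MLP, then fuse them into the token embeddings. Dims and fusion are assumed.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class KGAlignmentMLP(nn.Module):
    """Maps KG entity embeddings (e.g., 200-d) into the LM's hidden space."""
    def __init__(self, kg_dim: int = 200, lm_dim: int = 768, hidden: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(kg_dim, hidden),
            nn.GELU(),
            nn.Linear(hidden, lm_dim),
        )

    def forward(self, kg_emb: torch.Tensor) -> torch.Tensor:
        return self.net(kg_emb)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
lm = AutoModel.from_pretrained("bert-base-uncased")
align = KGAlignmentMLP()

question = "What receptor does SARS-CoV-2 use to enter human cells?"
enc = tokenizer(question, return_tensors="pt")
token_emb = lm.embeddings.word_embeddings(enc["input_ids"])  # (1, seq_len, 768)

# Suppose an entity linker matched tokens 2..4 to a KG concept; fuse its
# aligned embedding into those positions by addition (fusion rule assumed).
kg_vec = torch.randn(200)        # placeholder for a real KG entity embedding
fused = token_emb.clone()
fused[0, 2:5] += align(kg_vec)   # broadcast the 768-d aligned vector

out = lm(inputs_embeds=fused, attention_mask=enc["attention_mask"])
print(out.last_hidden_state.shape)  # torch.Size([1, seq_len, 768])
```
In the paper's setup, such a fused encoder would then be fine-tuned for span detection (COVID-QA) or multiple-choice QA (PubMedQA).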
Related papers
- To Generate or to Retrieve? On the Effectiveness of Artificial Contexts for Medical Open-Domain Question Answering [18.226545754007972]
This paper presents MedGENIE, the first generate-then-read framework for multiple-choice question answering in medicine.
We conduct extensive experiments on MedQA-USMLE, MedMCQA, and MMLU, incorporating a practical perspective by assuming a maximum of 24GB VRAM.
Our findings reveal that generated passages are more effective than retrieved ones in attaining higher accuracy.
arXiv Detail & Related papers (2024-03-04T10:41:52Z)
- DG-TTA: Out-of-domain medical image segmentation through Domain Generalization and Test-Time Adaptation [43.842694540544194]
We propose to combine domain generalization and test-time adaptation to create a highly effective approach for reusing pre-trained models in unseen target domains.
We demonstrate that our method, combined with pre-trained whole-body CT models, can effectively segment MR images with high accuracy.
arXiv Detail & Related papers (2023-12-11T10:26:21Z)
- Enhancing Medical Specialty Assignment to Patients using NLP Techniques [0.0]
We propose an alternative approach that achieves superior performance while being computationally efficient.
Specifically, we utilize keywords to train a deep learning architecture that outperforms a language model pretrained on a large corpus of text.
Our results demonstrate that utilizing keywords for text classification significantly improves classification performance.
arXiv Detail & Related papers (2023-12-09T14:13:45Z)
- A Self-enhancement Approach for Domain-specific Chatbot Training via Knowledge Mining and Digest [62.63606958140248]
Large Language Models (LLMs) often encounter challenges when dealing with intricate and knowledge-demanding queries in specific domains.
This paper introduces a novel approach to enhance LLMs by effectively extracting the relevant knowledge from domain-specific textual sources.
We train a knowledge miner, namely LLMiner, which autonomously extracts Question-Answer pairs from relevant documents.
arXiv Detail & Related papers (2023-11-17T16:09:10Z)
- Open-Ended Medical Visual Question Answering Through Prefix Tuning of Language Models [42.360431316298204]
We focus on open-ended VQA and, motivated by recent advances in language models, consider it a generative task.
To properly communicate the medical images to the language model, we develop a network that maps the extracted visual features to a set of learnable tokens (a sketch of this idea appears after this list).
We evaluate our approach on the prime medical VQA benchmarks, namely Slake, OVQA and PathVQA.
arXiv Detail & Related papers (2023-03-10T15:17:22Z)
- Language Models sounds the Death Knell of Knowledge Graphs [0.0]
Deep Learning based NLP, especially Large Language Models (LLMs), has found broad acceptance and is used extensively in many applications.
BioBERT and Med-BERT are language models pre-trained for the healthcare domain.
This paper argues that using Knowledge Graphs is not the best solution for solving problems in this domain.
arXiv Detail & Related papers (2023-01-10T14:20:15Z)
- HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression [53.90578309960526]
Large pre-trained language models (PLMs) have shown overwhelming performance compared with traditional neural network methods.
We propose a hierarchical relational knowledge distillation (HRKD) method to capture both hierarchical and domain relational information.
arXiv Detail & Related papers (2021-10-16T11:23:02Z)
- Open Domain Question Answering over Virtual Documents: A Unified Approach for Data and Text [62.489652395307914]
We use the data-to-text method as a means of encoding structured knowledge for knowledge-intensive applications, i.e., open-domain question answering (QA).
Specifically, we propose a verbalizer-retriever-reader framework for open-domain QA over data and text where verbalized tables from Wikipedia and triples from Wikidata are used as augmented knowledge sources.
We show that our Unified Data and Text QA, UDT-QA, can effectively benefit from the expanded knowledge index, leading to large gains over text-only baselines.
arXiv Detail & Related papers (2021-10-16T00:11:21Z)
- CMT in TREC-COVID Round 2: Mitigating the Generalization Gaps from Web to Special Domain Search [89.48123965553098]
This paper presents a search system to alleviate the special-domain adaptation problem.
The system utilizes domain-adaptive pretraining and few-shot learning to help neural rankers mitigate the domain discrepancy.
Our system performs the best among the non-manual runs in Round 2 of the TREC-COVID task.
arXiv Detail & Related papers (2020-11-03T09:10:48Z)
- Low-Resource Domain Adaptation for Compositional Task-Oriented Semantic Parsing [85.35582118010608]
Task-oriented semantic parsing is a critical component of virtual assistants.
Recent advances in deep learning have enabled several approaches to successfully parse more complex queries.
We propose a novel method that outperforms a supervised neural model while using 10 times less training data.
arXiv Detail & Related papers (2020-10-07T17:47:53Z)
- Learning Contextualized Document Representations for Healthcare Answer Retrieval [68.02029435111193]
Contextual Discourse Vectors (CDV) is a distributed document representation for efficient answer retrieval from long documents.
Our model leverages a dual encoder architecture with hierarchical LSTM layers and multi-task training to encode the position of clinical entities and aspects alongside the document discourse.
We show that our generalized model significantly outperforms several state-of-the-art baselines for healthcare passage ranking.
arXiv Detail & Related papers (2020-02-03T15:47:19Z)
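As referenced in the open-ended medical VQA entry above, the sketch below illustrates the prefix-tuning idea: a small mapping network projects an extracted image feature into a handful of learnable prefix tokens that condition a frozen language model. The choice of GPT-2, the feature dimension, the token count, and the name VisualPrefixMapper are assumptions for illustration, not details taken from that paper.
```python
# Hedged sketch of prefix tuning for VQA: only the mapper is trainable; the
# language model stays frozen and is conditioned via prepended embeddings.
import torch
import torch.nn as nn
from transformers import GPT2LMHeadModel, GPT2Tokenizer

class VisualPrefixMapper(nn.Module):
    """Projects one image feature vector into k prefix-token embeddings."""
    def __init__(self, feat_dim: int = 512, lm_dim: int = 768, k: int = 8):
        super().__init__()
        self.k, self.lm_dim = k, lm_dim
        self.proj = nn.Sequential(nn.Linear(feat_dim, k * lm_dim), nn.Tanh())

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        return self.proj(feats).view(-1, self.k, self.lm_dim)

tok = GPT2Tokenizer.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2")
for p in lm.parameters():            # freeze the LM; only the mapper trains
    p.requires_grad = False

mapper = VisualPrefixMapper()
image_feat = torch.randn(1, 512)     # stand-in for a CNN/CLIP image feature
prefix = mapper(image_feat)          # (1, 8, 768)

q = tok("Question: What does the scan show? Answer:", return_tensors="pt")
q_emb = lm.transformer.wte(q["input_ids"])    # (1, seq_len, 768)
inputs = torch.cat([prefix, q_emb], dim=1)    # prefix conditions generation
out = lm(inputs_embeds=inputs)
print(out.logits.shape)              # (1, 8 + seq_len, vocab_size)
```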