Medical Literature Mining and Retrieval in a Conversational Setting
- URL: http://arxiv.org/abs/2108.01436v1
- Date: Fri, 23 Jul 2021 23:02:59 GMT
- Title: Medical Literature Mining and Retrieval in a Conversational Setting
- Authors: Souvik Das, Sougata Saha, and Rohini K. Srihari
- Abstract summary: The Covid-19 pandemic has caused a surge in the medical research literature.
There is a need for robust text mining tools which can process, extract and present answers from the literature in a concise and consumable way.
We present a conversational system that retrieves answers to coronavirus-related queries from the rich medical literature and presents them to the user in a conversational setting.
- Score: 3.37411253119822
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Covid-19 pandemic has caused a surge in the medical research literature.
With new research advances in understanding the virus, there is a need for
robust text mining tools which can process, extract and present answers from
the literature in a concise and consumable way. In this paper we present a
conversational system that combines a DialoGPT-based multi-turn conversation
generation module with an ensemble information retrieval module built on BM25
and neural embeddings. The system retrieves answers to coronavirus-related
queries from the rich medical literature and presents them to the user in a
conversational setting. We further perform experiments to compare neural
embedding-based document retrieval and the traditional BM25 retrieval algorithm
and report the results.
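The abstract names an ensemble of BM25 and neural-embedding retrieval but does not say how the two are combined. The following is a minimal sketch of one plausible arrangement, not the authors' implementation. It assumes a weighted sum of min-max-normalized scores, and it uses a bag-of-words vector (`toy_embed`, a hypothetical stand-in) where a real system would call a neural sentence encoder. All function names and the `alpha` weight are illustrative.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on any non-alphanumeric character.
    return "".join(c.lower() if c.isalnum() else " " for c in text).split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    # Okapi BM25 with the non-negative (Lucene-style) IDF variant.
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n
    df = Counter(term for d in tokenized for term in set(d))
    query_terms = tokenize(query)
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def toy_embed(text):
    # ASSUMPTION: stand-in for a neural encoder; returns a sparse count vector.
    return Counter(tokenize(text))


def cosine(u, v):
    dot = sum(u[t] * v[t] for t in u if t in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def min_max(xs):
    # Rescale scores to [0, 1] so the two retrievers are comparable.
    lo, hi = min(xs), max(xs)
    return [0.0] * len(xs) if hi == lo else [(x - lo) / (hi - lo) for x in xs]


def ensemble_rank(query, docs, alpha=0.5, embed=toy_embed):
    # ASSUMPTION: fusion by weighted sum; alpha weights the BM25 side.
    lexical = min_max(bm25_scores(query, docs))
    q_vec = embed(query)
    dense = min_max([cosine(q_vec, embed(d)) for d in docs])
    fused = [alpha * l + (1 - alpha) * s for l, s in zip(lexical, dense)]
    return sorted(range(len(docs)), key=lambda i: fused[i], reverse=True)
```

Calling `ensemble_rank(query, docs)` returns document indices ordered from most to least relevant. Passing a different `embed` callable is enough to swap in a real encoder, provided its output works with the similarity function used here.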
Related papers
- Explainable Biomedical Hypothesis Generation via Retrieval Augmented Generation enabled Large Language Models [46.05020842978823]
Large Language Models (LLMs) have emerged as powerful tools to navigate this complex data landscape.
RAGGED is a comprehensive workflow designed to support investigators with knowledge integration and hypothesis generation.
arXiv Detail & Related papers (2024-07-17T07:44:18Z)
- Radiology Report Generation Using Transformers Conditioned with Non-imaging Data [55.17268696112258]
This paper proposes a novel multi-modal transformer network that integrates chest x-ray (CXR) images and associated patient demographic information.
The proposed network uses a convolutional neural network to extract visual features from CXRs and a transformer-based encoder-decoder network that combines the visual features with semantic text embeddings of patient demographic information.
arXiv Detail & Related papers (2023-11-18T14:52:26Z)
- Descriptive Knowledge Graph in Biomedical Domain [26.91431888505873]
We present a novel system that automatically extracts and generates informative and descriptive sentences from the biomedical corpus.
Unlike previous search engines or exploration systems that retrieve unconnected passages, our system organizes descriptive sentences as a graph.
We spotlight the application of our system in COVID-19 research, illustrating its utility in areas such as drug repurposing and literature curation.
arXiv Detail & Related papers (2023-10-18T03:10:25Z)
- XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models [60.437091462613544]
We introduce XrayGPT, a novel conversational medical vision-language model.
It can analyze and answer open-ended questions about chest radiographs.
We generate 217k interactive and high-quality summaries from free-text radiology reports.
arXiv Detail & Related papers (2023-06-13T17:59:59Z)
- An Iterative Optimizing Framework for Radiology Report Summarization with ChatGPT [80.33783969507458]
The 'Impression' section of a radiology report is a critical basis for communication between radiologists and other physicians.
Recent studies have achieved promising results in automatic impression generation using large-scale medical text data.
These models often require substantial amounts of medical text data and have poor generalization performance.
arXiv Detail & Related papers (2023-04-17T17:13:42Z)
- Prioritization of COVID-19-related literature via unsupervised keyphrase extraction and document representation learning [1.8374319565577157]
The COVID-19 pandemic triggered a wave of novel scientific literature that is impossible to inspect and study manually in a reasonable time frame.
Current machine learning methods can project such a body of literature into a vector space, where similar documents are located close to each other.
In our system, the current body of COVID-19-related literature is annotated using unsupervised keyphrase extraction.
The solution is accessible through a web server capable of interactive search, term ranking, and exploration of potentially interesting literature.
arXiv Detail & Related papers (2021-10-17T17:35:09Z)
- Impact of detecting clinical trial elements in exploration of COVID-19 literature [29.027162080682643]
We compare the results retrieved by a standard search engine with those filtered using clinically-relevant concepts and their relations.
We find that the relational concept selection filters the original retrieved collection in a way that decreases the proportion of unjudged documents.
arXiv Detail & Related papers (2021-05-25T23:41:24Z)
- Text Mining to Identify and Extract Novel Disease Treatments From Unstructured Datasets [56.38623317907416]
We use Google Cloud to transcribe podcast episodes of an NPR radio show.
We then build a pipeline for systematically pre-processing the text.
Our model successfully identified that Omeprazole can help treat heartburn.
arXiv Detail & Related papers (2020-10-22T19:52:49Z)
- MedDG: An Entity-Centric Medical Consultation Dataset for Entity-Aware Medical Dialogue Generation [86.38736781043109]
We build and release MedDG, a large-scale, high-quality medical dialogue dataset covering 12 types of common gastrointestinal diseases.
We propose two kinds of medical dialogue tasks based on the MedDG dataset: next entity prediction and doctor response generation.
Experimental results show that pre-trained language models and other baselines perform poorly on both tasks on our dataset.
arXiv Detail & Related papers (2020-10-15T03:34:33Z)
- COVID-19 Literature Topic-Based Search via Hierarchical NMF [29.04869940568828]
A dataset of COVID-19-related scientific literature is compiled.
Hierarchical nonnegative matrix factorization is used to organize literature related to the novel coronavirus into a tree structure.
arXiv Detail & Related papers (2020-09-07T05:45:03Z)
- Automatic Text Summarization of COVID-19 Medical Research Articles using BERT and GPT-2 [8.223517872575712]
We take advantage of the recent advances in pre-trained NLP models, BERT and OpenAI GPT-2.
Our model provides abstractive and comprehensive information based on keywords extracted from the original articles.
Our work can help the medical community by providing succinct summaries of articles for which abstracts are not already available.
arXiv Detail & Related papers (2020-06-03T00:54:44Z)