Contributions to the Improvement of Question Answering Systems in the
Biomedical Domain
- URL: http://arxiv.org/abs/2307.13631v1
- Date: Tue, 25 Jul 2023 16:31:20 GMT
- Title: Contributions to the Improvement of Question Answering Systems in the
Biomedical Domain
- Authors: Mourad Sarrouti
- Abstract summary: This thesis work falls within the framework of question answering (QA) in the biomedical domain.
We propose four contributions to improve the performance of QA in the biomedical domain.
We develop a fully automated semantic biomedical QA system called SemBioNLQA.
- Score: 0.951828574518325
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This thesis work falls within the framework of question answering (QA) in the
biomedical domain where several specific challenges are addressed, such as
specialized lexicons and terminologies, the types of treated questions, and the
characteristics of targeted documents. We are particularly interested in
studying and improving methods that aim at finding accurate and short answers
to biomedical natural language questions from a large collection of biomedical
textual documents in English. QA aims at providing inquirers with direct, short
and precise answers to their natural language questions. In this Ph.D. thesis,
we propose four contributions to improve the performance of QA in the
biomedical domain. In our first contribution, we propose a machine
learning-based method for question type classification to determine the types
of given questions, which enables a biomedical QA system to use the
appropriate answer extraction method. We also propose another machine
learning-based method to assign one or more topics (e.g., pharmacological,
test, treatment) to given questions in order to determine the semantic
types of the expected answers, which are very useful in generating specific
answer retrieval strategies. In the second contribution, we first propose a
document retrieval method to retrieve a set of relevant documents that are
likely to contain the answers to biomedical questions from the MEDLINE
database. We then present a passage retrieval method to retrieve a set of
relevant passages to questions. In the third contribution, we propose specific
answer extraction methods to generate both exact and ideal answers. Finally, in
the fourth contribution, we develop a fully automated semantic biomedical QA
system called SemBioNLQA which is able to deal with a variety of natural
language questions and to generate appropriate answers by providing both exact
and ideal answers.
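The staged flow described in the abstract (question classification, document retrieval, passage retrieval, answer selection) can be illustrated with a minimal sketch. This is not SemBioNLQA and not the thesis's method: the abstract describes machine learning-based classifiers and retrieval from MEDLINE, whereas the sketch below swaps in keyword rules and plain token overlap. The question type labels, all function names, and the toy corpus are assumptions made for illustration only.

```python
import re

def classify_question(question):
    """Toy rule-based question typing (assumed labels; the thesis uses ML)."""
    q = question.strip().lower()
    if re.match(r"^(is|are|does|do|can|was|were|has|have)\b", q):
        return "yes/no"
    if re.match(r"^(list|which|what are)\b", q):
        return "list"
    return "factoid"

def tokens(text):
    """Lowercased alphanumeric tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def rank_by_overlap(question, texts, k):
    """Rank texts by the number of tokens shared with the question."""
    q = tokens(question)
    ranked = sorted(texts, key=lambda t: len(q & tokens(t)), reverse=True)
    return ranked[:k]

def split_passages(document):
    """Split a document into sentence-level passages."""
    parts = re.split(r"(?<=[.!?])\s+", document)
    return [p.strip() for p in parts if p.strip()]

def answer(question, corpus, k_docs=2, k_passages=1):
    """Run the stages in order: type, documents, passages."""
    qtype = classify_question(question)
    docs = rank_by_overlap(question, corpus, k_docs)
    passages = [p for d in docs for p in split_passages(d)]
    top = rank_by_overlap(question, passages, k_passages)
    return {"type": qtype, "passages": top}

# Hypothetical two-document corpus standing in for MEDLINE abstracts.
corpus = [
    "Aspirin inhibits cyclooxygenase. It is used to reduce fever.",
    "Metformin is a first-line treatment for type 2 diabetes.",
]
result = answer("Is metformin a treatment for type 2 diabetes?", corpus)
print(result)
```

In a real system each stage would be replaced by a trained model or a proper retrieval engine; the sketch only shows how the output of one stage narrows the input of the next.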
Related papers
- How to Engage Your Readers? Generating Guiding Questions to Promote Active Reading [60.19226384241482]
We introduce GuidingQ, a dataset of 10K in-text questions from textbooks and scientific articles.
We explore various approaches to generate such questions using language models.
We conduct a human study to understand the implication of such questions on reading comprehension.
arXiv Detail & Related papers (2024-07-19T13:42:56Z)
- Developing ChatGPT for Biology and Medicine: A Complete Review of Biomedical Question Answering [25.569980942498347]
ChatGPT explores a strategic blueprint of question answering (QA) in delivering medical diagnosis, treatment recommendations, and other healthcare support.
This is achieved through the increasing incorporation of medical domain data via natural language processing (NLP) and multimodal paradigms.
arXiv Detail & Related papers (2024-01-15T07:21:16Z)
- Generating Explanations in Medical Question-Answering by Expectation Maximization Inference over Evidence [33.018873142559286]
We propose a novel approach for generating natural language explanations for answers predicted by medical QA systems.
Our system extracts knowledge from medical textbooks to enhance the quality of explanations during the explanation generation process.
arXiv Detail & Related papers (2023-10-02T16:00:37Z)
- Top K Relevant Passage Retrieval for Biomedical Question Answering [1.0636004442689055]
Question answering is a task that answers factoid questions using a large collection of documents.
The existing Dense Passage Retrieval (DPR) model has been trained on a Wikipedia dump from Dec. 20, 2018, as the source documents for answering questions.
In this work, we adapt the existing DPR framework to the biomedical domain and retrieve answers from PubMed articles, which are a reliable source for answering medical questions.
arXiv Detail & Related papers (2023-08-08T04:06:11Z)
- LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day [85.19963303642427]
We propose a cost-efficient approach for training a vision-language conversational assistant that can answer open-ended research questions about biomedical images.
The model first learns to align biomedical vocabulary using the figure-caption pairs as is, then learns to master open-ended conversational semantics.
This enables us to train a Large Language and Vision Assistant for BioMedicine in less than 15 hours (with eight A100s).
arXiv Detail & Related papers (2023-06-01T16:50:07Z)
- Medical Question Understanding and Answering with Knowledge Grounding and Semantic Self-Supervision [53.692793122749414]
We introduce a medical question understanding and answering system with knowledge grounding and semantic self-supervision.
Our system is a pipeline that first summarizes a long, medical, user-written question, using a supervised summarization loss.
The system first matches the summarized user question with an FAQ from a trusted medical knowledge base, and then retrieves a fixed number of relevant sentences from the corresponding answer document.
arXiv Detail & Related papers (2022-09-30T08:20:32Z)
- Medical Visual Question Answering: A Survey [55.53205317089564]
Medical Visual Question Answering (VQA) is a combination of medical artificial intelligence and popular VQA challenges.
Given a medical image and a clinically relevant question in natural language, the medical VQA system is expected to predict a plausible and convincing answer.
arXiv Detail & Related papers (2021-11-19T05:55:15Z)
- Recent Advances in Automated Question Answering In Biomedical Domain [0.06922389632860546]
In the past few decades there has been a proliferation of acquisition of knowledge and consequently there has been an exponential growth in new scientific articles in the field of biomedicine.
It has become difficult to keep track of all the information in the domain, even for domain experts.
With the improvements in commercial search engines, users can type in their queries and get a small set of documents most relevant for answering their query.
This has necessitated the development of efficient QA systems which aim to find exact and precise answers to user provided natural language questions.
arXiv Detail & Related papers (2021-11-10T20:51:29Z)
- A Dataset of Information-Seeking Questions and Answers Anchored in Research Papers [66.11048565324468]
We present a dataset of 5,049 questions over 1,585 Natural Language Processing papers.
Each question is written by an NLP practitioner who read only the title and abstract of the corresponding paper, and the question seeks information present in the full text.
We find that existing models that do well on other QA tasks do not perform well on answering these questions, underperforming humans by at least 27 F1 points when answering them from entire papers.
arXiv Detail & Related papers (2021-05-07T00:12:34Z)
- Biomedical Question Answering: A Comprehensive Review [19.38459023509541]
Question Answering (QA) is a benchmark Natural Language Processing (NLP) task where models predict the answer for a given question using related documents, images, knowledge bases and question-answer pairs.
For specific domains like biomedicine, QA systems are still rarely used in real-life settings.
Biomedical QA (BQA), as an emerging QA task, enables innovative applications to effectively perceive, access and understand complex biomedical knowledge.
arXiv Detail & Related papers (2021-02-10T06:16:35Z)
- Interpretable Multi-Step Reasoning with Knowledge Extraction on Complex Healthcare Question Answering [89.76059961309453]
The HeadQA dataset contains multiple-choice questions authorized for the public healthcare specialization exam.
These questions are the most challenging for current QA systems.
We present a Multi-step reasoning with Knowledge extraction framework (MurKe).
We are striving to make full use of off-the-shelf pre-trained models.
arXiv Detail & Related papers (2020-08-06T02:47:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.