Query-Focused Extractive Summarisation for Finding Ideal Answers to
Biomedical and COVID-19 Questions
- URL: http://arxiv.org/abs/2108.12189v2
- Date: Tue, 31 Aug 2021 01:31:39 GMT
- Title: Query-Focused Extractive Summarisation for Finding Ideal Answers to
Biomedical and COVID-19 Questions
- Authors: Diego Mollá, Urvashi Khanna, Dima Galat, Vincent Nguyen, Maciej Rybinski
- Abstract summary: Macquarie University participated in the BioASQ Synergy Task and BioASQ9b Phase B.
We used a query-focused summarisation system that was trained with the BioASQ8b training data set.
Despite the poor quality of the documents and snippets retrieved by our system, the answers returned were of reasonably good quality.
- Score: 7.6997148655751895
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents Macquarie University's participation in the BioASQ
Synergy Task and BioASQ9b Phase B. In each of these tasks, our participation
focused on the use of query-focused extractive summarisation to obtain the
ideal answers to medical questions. The Synergy Task is an end-to-end question
answering task on COVID-19 where systems are required to return relevant
documents, snippets, and answers to a given question. Given the absence of
training data, we used a query-focused summarisation system that was trained
with the BioASQ8b training data set and we experimented with methods to
retrieve the documents and snippets. Despite the poor quality of the documents
and snippets retrieved by our system, the answers returned were of reasonably
good quality. For Phase B of the BioASQ9b task, the relevant
documents and snippets were already included in the test data. Our system split
the snippets into candidate sentences and used BERT variants under a sentence
classification setup. The system used the question and candidate sentence as
input and was trained to predict the likelihood of the candidate sentence being
part of the ideal answer. The runs obtained either the best or second-best
ROUGE-F1 results among all participants in all batches of BioASQ9b. This shows
that using BERT in a classification setup is a very strong baseline for the
identification of ideal answers.
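The sentence-classification setup described in the abstract can be illustrated with a short sketch: the question and each candidate sentence are encoded as a sentence pair, and the classifier predicts how likely the sentence is to belong to the ideal answer. The checkpoint name, selection rule, and example data below are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of the question/candidate-sentence classification setup.
# Checkpoint, selection rule, and example data are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-uncased"  # any BERT variant (e.g. a biomedical one) could be swapped in
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model.eval()

def score_candidates(question, candidate_sentences):
    """Probability, per candidate sentence, of being part of the ideal answer."""
    enc = tokenizer(
        [question] * len(candidate_sentences),  # sentence A: the question
        candidate_sentences,                    # sentence B: the candidate sentence
        padding=True, truncation=True, return_tensors="pt",
    )
    with torch.no_grad():
        logits = model(**enc).logits
    return torch.softmax(logits, dim=-1)[:, 1].tolist()

# Usage: rank the snippet sentences and keep the top-scoring ones as the answer.
question = "What is the main mode of transmission of SARS-CoV-2?"
candidates = [
    "SARS-CoV-2 is transmitted primarily through respiratory droplets.",
    "The study enrolled 120 patients between March and June 2020.",
]
ranked = sorted(zip(candidates, score_candidates(question, candidates)),
                key=lambda pair: pair[1], reverse=True)
ideal_answer = " ".join(sentence for sentence, _ in ranked[:1])
```

In the submitted runs the classifier is fine-tuned on BioASQ question/ideal-answer pairs before scoring; an off-the-shelf checkpoint, as above, would only produce untrained scores.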
Related papers
- RAG-ConfusionQA: A Benchmark for Evaluating LLMs on Confusing Questions [52.33835101586687]
Conversational AI agents use Retrieval Augmented Generation (RAG) to provide verifiable document-grounded responses to user inquiries.
This paper presents a novel synthetic data generation method to efficiently create a diverse set of context-grounded confusing questions from a given document corpus.
arXiv Detail & Related papers (2024-10-18T16:11:29Z) - Using Pretrained Large Language Model with Prompt Engineering to Answer Biomedical Questions [1.0742675209112622]
We propose a two-level information retrieval and question-answering system based on pre-trained large language models (LLMs).
We construct prompts with in-context few-shot examples and utilize post-processing techniques like resampling and malformed response detection.
Our best-performing system achieved 0.14 MAP score on document retrieval, 0.05 MAP score on snippet retrieval, 0.96 F1 score for yes/no questions, 0.38 MRR score for factoid questions and 0.50 F1 score for list questions in Task 12b.
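A rough sketch of the few-shot prompting and resampling described above might look as follows; the prompt template, example pool, and validation rule are assumptions for illustration rather than the paper's exact pipeline, and `llm_call` stands in for whatever LLM API is used.

```python
# Illustrative few-shot prompt construction with resampling on malformed output.
# Template, examples, and validation rule are assumptions, not the paper's setup;
# `llm_call` is a placeholder for any text-completion API.
FEW_SHOT_EXAMPLES = [
    {"question": "Is aspirin an anticoagulant?", "answer": "no"},
    {"question": "Is metformin used to treat type 2 diabetes?", "answer": "yes"},
]

def build_prompt(question, snippets):
    shots = "\n".join(f"Q: {ex['question']}\nA: {ex['answer']}" for ex in FEW_SHOT_EXAMPLES)
    context = "\n".join(snippets)
    return f"Answer the biomedical yes/no question.\n{shots}\nContext:\n{context}\nQ: {question}\nA:"

def answer_yes_no(llm_call, question, snippets, max_retries=3):
    """Query the LLM, resampling when the response is malformed (not 'yes'/'no')."""
    prompt = build_prompt(question, snippets)
    for _ in range(max_retries):
        response = llm_call(prompt).strip().lower()
        if response in {"yes", "no"}:  # well-formed response: accept it
            return response
    return "no"  # fallback after repeated malformed responses
```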
arXiv Detail & Related papers (2024-07-09T11:48:49Z) - Selecting Query-bag as Pseudo Relevance Feedback for Information-seeking Conversations [76.70349332096693]
Information-seeking dialogue systems are widely used in e-commerce.
We propose a Query-bag based Pseudo Relevance Feedback framework (QB-PRF).
It constructs a query-bag with related queries to serve as pseudo signals to guide information-seeking conversations.
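The query-bag idea can be sketched very simply: select the historical queries most similar to the current one and treat them as pseudo relevance signals. TF-IDF cosine similarity below is an illustrative stand-in for QB-PRF's learned selection and fusion components.

```python
# Sketch of building a query-bag: select the k historical queries most similar
# to the user query and use them as pseudo relevance feedback.
# TF-IDF similarity is an illustrative stand-in for QB-PRF's learned components.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def build_query_bag(user_query, historical_queries, k=5):
    vectoriser = TfidfVectorizer().fit(historical_queries + [user_query])
    similarities = cosine_similarity(
        vectoriser.transform([user_query]),
        vectoriser.transform(historical_queries),
    )[0]
    top_indices = similarities.argsort()[::-1][:k]
    return [historical_queries[i] for i in top_indices]
```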
arXiv Detail & Related papers (2024-03-22T08:10:32Z) - Generating Natural Language Queries for More Effective Systematic Review
Screening Prioritisation [53.77226503675752]
The current state of the art uses the final title of the review as a query to rank the documents using BERT-based neural rankers.
In this paper, we explore alternative sources of queries for prioritising screening, such as the Boolean query used to retrieve the documents to be screened and queries generated by instruction-based large-scale language models such as ChatGPT and Alpaca.
Our best approach is not only viable given the information available at the time of screening, but also achieves effectiveness similar to that of the final title.
arXiv Detail & Related papers (2023-09-11T05:12:14Z) - Contributions to the Improvement of Question Answering Systems in the
Biomedical Domain [0.951828574518325]
This thesis work falls within the framework of question answering (QA) in the biomedical domain.
We propose four contributions to improve the performance of QA in the biomedical domain.
We develop a fully automated semantic biomedical QA system called SemBioNLQA.
arXiv Detail & Related papers (2023-07-25T16:31:20Z) - Query-focused Extractive Summarisation for Biomedical and COVID-19
Complex Question Answering [0.0]
This paper presents Macquarie University's participation in the two most recent BioASQ Synergy Tasks.
We apply query-focused extractive summarisation techniques to generate complex answers to biomedical questions.
For the Synergy task, we selected the candidate sentences following two phases: document retrieval and snippet retrieval.
We observed an improvement of results when the system was trained on the second half of the BioASQ10b training data.
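The two-phase candidate selection mentioned above (document retrieval, then snippet retrieval) could be sketched as below; TF-IDF cosine similarity and the cut-offs are illustrative stand-ins for the retrieval models actually used in the submitted system.

```python
# Sketch of two-phase candidate sentence selection: rank documents against the
# question, then rank sentences drawn from the top documents.
# TF-IDF similarity and the cut-offs are illustrative, not the submitted system.
import re
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def select_candidate_sentences(question, documents, top_docs=10, top_sents=20):
    # Phase 1: document retrieval
    doc_vec = TfidfVectorizer().fit(documents + [question])
    doc_sims = cosine_similarity(doc_vec.transform([question]),
                                 doc_vec.transform(documents))[0]
    best_docs = [documents[i] for i in doc_sims.argsort()[::-1][:top_docs]]

    # Phase 2: snippet retrieval -- split retrieved documents into sentences and re-rank
    sentences = [s for doc in best_docs for s in re.split(r"(?<=[.!?])\s+", doc) if s]
    sent_vec = TfidfVectorizer().fit(sentences + [question])
    sent_sims = cosine_similarity(sent_vec.transform([question]),
                                  sent_vec.transform(sentences))[0]
    return [sentences[i] for i in sent_sims.argsort()[::-1][:top_sents]]
```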
arXiv Detail & Related papers (2022-09-05T07:56:44Z) - Questions Are All You Need to Train a Dense Passage Retriever [123.13872383489172]
ART is a new corpus-level autoencoding approach for training dense retrieval models that does not require any labeled training data.
It uses a new document-retrieval autoencoding scheme, where (1) an input question is used to retrieve a set of evidence documents, and (2) the documents are then used to compute the probability of reconstructing the original question.
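The question-reconstruction step described above can be sketched with any seq2seq language model: a retrieved passage is scored by the log-likelihood of regenerating the original question from it. The T5 checkpoint below is an illustrative choice, and ART's retriever-training loop, which turns these scores into soft relevance labels, is omitted.

```python
# Sketch of the question-reconstruction signal: score a retrieved passage by how
# likely a seq2seq LM finds the original question given that passage.
# "t5-small" is an illustrative checkpoint; the retriever training loop is omitted.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
lm = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
lm.eval()

def reconstruction_score(question, passage):
    """Higher score = the question is more plausible given the passage."""
    inputs = tokenizer(passage, return_tensors="pt", truncation=True)
    labels = tokenizer(question, return_tensors="pt", truncation=True).input_ids
    with torch.no_grad():
        loss = lm(**inputs, labels=labels).loss  # mean token cross-entropy
    return -loss.item()
```

The scores over a set of retrieved passages can then be normalised into soft relevance labels for training the dense retriever without human annotations.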
arXiv Detail & Related papers (2022-06-21T18:16:31Z) - Transferability of Natural Language Inference to Biomedical Question
Answering [17.38537039378825]
We focus on applying BioBERT to transfer the knowledge of natural language inference (NLI) to biomedical question answering (QA).
We observe that BioBERT trained on the NLI dataset obtains better performance on Yes/No (+5.59%), Factoid (+0.53%), and List-type (+13.58%) questions.
We present a sequential transfer learning method that performed strongly in the 8th BioASQ Challenge (Phase B).
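The sequential transfer described above (general-domain NLI first, biomedical QA second) might be sketched as follows; the checkpoint names, hyperparameters, and dataset handling are illustrative assumptions, and the datasets passed in are expected to be pre-tokenised.

```python
# Sketch of sequential transfer learning: fine-tune BioBERT on NLI, then continue
# fine-tuning the NLI-adapted encoder on biomedical QA.
# Checkpoint names, hyperparameters, and dataset handling are illustrative assumptions.
from transformers import (AutoModelForSequenceClassification, Trainer,
                          TrainingArguments)

def fine_tune(model, train_dataset, output_dir):
    args = TrainingArguments(output_dir=output_dir, num_train_epochs=3,
                             per_device_train_batch_size=16)
    Trainer(model=model, args=args, train_dataset=train_dataset).train()
    model.save_pretrained(output_dir)
    return output_dir

def sequential_transfer(nli_dataset, qa_dataset):
    # Stage 1: three-way NLI classification (entailment / neutral / contradiction).
    nli_model = AutoModelForSequenceClassification.from_pretrained(
        "dmis-lab/biobert-base-cased-v1.1", num_labels=3)
    nli_checkpoint = fine_tune(nli_model, nli_dataset, "biobert-nli")

    # Stage 2: reuse the NLI-adapted encoder with a fresh two-way head for yes/no QA.
    qa_model = AutoModelForSequenceClassification.from_pretrained(
        nli_checkpoint, num_labels=2, ignore_mismatched_sizes=True)
    return fine_tune(qa_model, qa_dataset, "biobert-nli-qa")
```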
arXiv Detail & Related papers (2020-07-01T04:05:48Z) - A Study on Efficiency, Accuracy and Document Structure for Answer
Sentence Selection [112.0514737686492]
In this paper, we argue that by exploiting the intrinsic structure of the original rank together with an effective word-relatedness encoder, we can achieve competitive results.
Our model takes 9.5 seconds to train on the WikiQA dataset, i.e., very fast in comparison with the $\sim 18$ minutes required by a standard BERT-base fine-tuning.
arXiv Detail & Related papers (2020-03-04T22:12:18Z) - Pre-training Tasks for Embedding-based Large-scale Retrieval [68.01167604281578]
We consider the large-scale query-document retrieval problem.
Given a query (e.g., a question), return the set of relevant documents from a large document corpus.
We show that the key ingredient of learning a strong embedding-based Transformer model is the set of pre-training tasks.
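One of the pre-training tasks used for such two-tower retrievers is the Inverse Cloze Task (ICT), sketched below: a sentence sampled from a passage acts as a pseudo-query and the remaining passage as its positive document. The encoders, checkpoint, and dot-product score are illustrative, and the paper's other tasks and the training loop with in-batch negatives are omitted.

```python
# Sketch of the Inverse Cloze Task (ICT) for a two-tower retriever: a sentence
# sampled from a passage is the pseudo-query, the rest of the passage is its
# positive document. Encoders and dot-product score are illustrative; the
# in-batch-negative training loop is omitted.
import random
import re
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
query_encoder = AutoModel.from_pretrained("bert-base-uncased")
doc_encoder = AutoModel.from_pretrained("bert-base-uncased")

def ict_pair(passage):
    """Split a passage into a (pseudo-query, pseudo-document) training pair."""
    sentences = re.split(r"(?<=[.!?])\s+", passage)
    i = random.randrange(len(sentences))
    return sentences[i], " ".join(sentences[:i] + sentences[i + 1:])

def embed(encoder, text):
    enc = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return encoder(**enc).last_hidden_state[:, 0]  # [CLS] token embedding

query, document = ict_pair(
    "BERT is a Transformer encoder. It is pre-trained with masked language modelling. "
    "Fine-tuning adapts it to downstream tasks."
)
relevance = (embed(query_encoder, query) * embed(doc_encoder, document)).sum()
```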
arXiv Detail & Related papers (2020-02-10T16:44:00Z) - UNCC Biomedical Semantic Question Answering Systems. BioASQ: Task-7B,
Phase-B [1.976652238476722]
We present our approach for Task-7b, Phase B, Exact Answering Task.
These Question Answering (QA) tasks include Factoid, Yes/No, List Type Question answering.
Our system is based on a contextual word embedding model.
arXiv Detail & Related papers (2020-02-05T20:43:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.