UPV at TREC Health Misinformation Track 2021: Ranking with SBERT and Quality Estimators
- URL: http://arxiv.org/abs/2112.06080v1
- Date: Sat, 11 Dec 2021 21:57:57 GMT
- Title: UPV at TREC Health Misinformation Track 2021: Ranking with SBERT and Quality Estimators
- Authors: Ipek Baris Schlicht and Angel Felipe Magnossão de Paula and Paolo Rosso
- Abstract summary: We use BM25 and a domain-specific semantic search engine to retrieve initial documents.
We examine a health news schema for quality assessment and apply it to re-rank documents.
- Score: 6.167830237917659
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Health misinformation on search engines is a significant problem that could
negatively affect individuals or public health. To mitigate the problem, TREC
organizes a health misinformation track. This paper presents our submissions to
this track. We use BM25 and a domain-specific semantic search engine for
retrieving initial documents. Later, we examine a health news schema for
quality assessment and apply it to re-rank documents. We merge the scores from
the different components by using reciprocal rank fusion. Finally, we discuss
the results and conclude with future work.
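The pipeline sketched in the abstract has three stages: lexical retrieval with BM25, semantic retrieval with an SBERT bi-encoder, and reciprocal rank fusion (RRF) of the resulting rankings. The following is a minimal sketch of that idea under stated assumptions: the rank_bm25 and sentence-transformers libraries, the all-MiniLM-L6-v2 model, and the toy documents and query are illustrative choices rather than the authors' actual setup, and the health news schema quality re-ranker is omitted.

```python
# Illustrative sketch only: library choices, model name, and data are assumptions,
# and the paper's health-news-schema quality re-ranker is not reproduced here.
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer, util

documents = [
    "Vitamin C supplements cure the common cold.",
    "Regular handwashing reduces the spread of respiratory infections.",
    "Drinking bleach is a safe treatment for viral illness.",
]
query = "does vitamin c help with colds"

# 1) Lexical retrieval with BM25 over whitespace-tokenized documents.
bm25 = BM25Okapi([doc.lower().split() for doc in documents])
bm25_scores = bm25.get_scores(query.lower().split())
bm25_ranking = sorted(range(len(documents)), key=lambda i: -bm25_scores[i])

# 2) Semantic retrieval with an SBERT bi-encoder (assumed model choice).
model = SentenceTransformer("all-MiniLM-L6-v2")
doc_emb = model.encode(documents, convert_to_tensor=True)
query_emb = model.encode(query, convert_to_tensor=True)
cos_scores = util.cos_sim(query_emb, doc_emb)[0]
sbert_ranking = sorted(range(len(documents)), key=lambda i: -float(cos_scores[i]))

# 3) Reciprocal rank fusion: score(d) = sum over runs r of 1 / (k + rank_r(d)),
#    with k = 60 as the commonly used constant.
def reciprocal_rank_fusion(rankings, k=60):
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)

final_ranking = reciprocal_rank_fusion([bm25_ranking, sbert_ranking])
print([documents[i] for i in final_ranking])
```

RRF needs only rank positions, not scores on a comparable scale, which makes it a convenient way to merge lexical, semantic, and quality-based runs; in the paper's setting, a ranking produced by the quality estimator would presumably be passed to the fusion step as an additional run.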
Related papers
- Improving Health Question Answering with Reliable and Time-Aware Evidence Retrieval [5.69361786082969]
Our study focuses on the open-domain QA setting, where the key challenge is to first uncover relevant evidence in large knowledge bases.
By utilizing the common retrieve-then-read QA pipeline and PubMed as a trustworthy collection of medical research documents, we answer health questions from three diverse datasets.
Our results reveal that cutting down on the number of retrieved documents and favoring more recent and highly cited documents can improve the final macro F1 score by up to 10%.
arXiv Detail & Related papers (2024-04-12T09:56:12Z) - Machine Learning for Health symposium 2023 -- Findings track [16.654806183414976]
ML4H 2023 invited high-quality submissions on relevant problems in a variety of health-related disciplines.
Papers were targeted at mature work with strong technical sophistication and high impact on health.
The Findings track looked for new ideas that could spark insightful discussion, serve as valuable resources for the community, or could enable new collaborations.
arXiv Detail & Related papers (2023-12-01T15:30:43Z) - Generating Natural Language Queries for More Effective Systematic Review
Screening Prioritisation [53.77226503675752]
The current state of the art uses the final title of the review as a query to rank the documents using BERT-based neural rankers.
In this paper, we explore alternative sources of queries for prioritising screening, such as the Boolean query used to retrieve the documents to be screened and queries generated by instruction-based large-scale language models such as ChatGPT and Alpaca.
Our best approach is not only viable based on the information available at the time of screening, but also has similar effectiveness to the final title.
arXiv Detail & Related papers (2023-09-11T05:12:14Z) - Incorporating Emotions into Health Mention Classification Task on Social
Media [70.23889100356091]
We present a framework for health mention classification that incorporates affective features.
We evaluate our approach on 5 HMC-related datasets from different social media platforms.
Our results indicate that HMC models infused with emotional knowledge are an effective alternative.
arXiv Detail & Related papers (2022-12-09T18:38:41Z) - RedHOT: A Corpus of Annotated Medical Questions, Experiences, and Claims
on Social Media [1.5293427903448022]
We present Reddit Health Online Talk (RedHOT), a corpus of 22,000 richly annotated social media posts from Reddit spanning 24 health conditions.
We mark snippets that describe patient Populations, Interventions, and Outcomes (PIO elements) within these claims.
We propose a new method to automatically derive (noisy) supervision for this task which we use to train a dense retrieval model.
arXiv Detail & Related papers (2022-10-12T15:50:32Z) - HealthE: Classifying Entities in Online Textual Health Advice [0.0]
We release a new annotated dataset, HealthE, consisting of 6,756 health advice statements.
HealthE has a more granular label space compared to existing medical NER corpora.
We introduce a new health entity classification model, EP S-BERT, which leverages textual context patterns in the classification of entity classes.
arXiv Detail & Related papers (2022-10-06T23:18:24Z) - Medical Question Understanding and Answering with Knowledge Grounding
and Semantic Self-Supervision [53.692793122749414]
We introduce a medical question understanding and answering system with knowledge grounding and semantic self-supervision.
Our system is a pipeline that first summarizes a long, medical, user-written question using a supervised summarization loss.
It then matches the summarized question with an FAQ from a trusted medical knowledge base and retrieves a fixed number of relevant sentences from the corresponding answer document.
arXiv Detail & Related papers (2022-09-30T08:20:32Z) - ITTC @ TREC 2021 Clinical Trials Track [54.141379782822206]
The task focuses on the problem of matching eligible clinical trials to topics constituting a summary of a patient's admission notes.
We explore different ways of representing trials and topics using NLP techniques, and then use a common retrieval model to generate the ranked list of relevant trials for each topic.
The results from all our submitted runs are well above the median scores for all topics, but there is still plenty of scope for improvement.
arXiv Detail & Related papers (2022-02-16T04:56:47Z) - Mirror Matching: Document Matching Approach in Seed-driven Document
Ranking for Medical Systematic Reviews [31.3220495275256]
Document ranking assists researchers by ranking relevant documents above irrelevant ones.
We propose a document matching measure named Mirror Matching, which calculates matching scores between medical abstract texts by incorporating common writing patterns.
arXiv Detail & Related papers (2021-12-28T22:27:52Z) - An Analysis of a BERT Deep Learning Strategy on a Technology Assisted
Review Task [91.3755431537592]
Document screening is a central task within Evidence-Based Medicine.
I propose a DL document classification approach with BERT or PubMedBERT embeddings and a DL similarity search path.
I test and evaluate the retrieval effectiveness of my DL strategy on the 2017 and 2018 CLEF eHealth collections.
arXiv Detail & Related papers (2021-04-16T19:45:27Z) - MedDG: An Entity-Centric Medical Consultation Dataset for Entity-Aware
Medical Dialogue Generation [86.38736781043109]
We build and release MedDG, a large-scale, high-quality medical dialogue dataset covering 12 types of common gastrointestinal diseases.
We propose two kinds of medical dialogue tasks based on the MedDG dataset: next entity prediction and doctor response generation.
Experimental results show that pre-trained language models and other baselines struggle on both tasks, performing poorly on our dataset.
arXiv Detail & Related papers (2020-10-15T03:34:33Z)