BAND: Biomedical Alert News Dataset
- URL: http://arxiv.org/abs/2305.14480v2
- Date: Sun, 15 Oct 2023 15:09:24 GMT
- Title: BAND: Biomedical Alert News Dataset
- Authors: Zihao Fu, Meiru Zhang, Zaiqiao Meng, Yannan Shen, David Buckeridge, Nigel Collier
- Abstract summary: We introduce the Biomedical Alert News dataset (BAND), which includes 1,508 samples from existing reported news articles, open emails, and alerts, as well as 30 epidemiology-related questions.
The BAND dataset brings new challenges to the NLP world, requiring better disguise capability of the content and the ability to infer important information.
To the best of our knowledge, the BAND corpus is the largest corpus of well-annotated biomedical outbreak alert news with elaborately designed questions.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Infectious disease outbreaks continue to pose a significant threat to human
health and well-being. To improve disease surveillance and understanding of
disease spread, several surveillance systems have been developed to monitor
daily news alerts and social media. However, existing systems lack thorough
epidemiological analysis in relation to corresponding alerts or news, largely
due to the scarcity of well-annotated reports data. To address this gap, we
introduce the Biomedical Alert News Dataset (BAND), which includes 1,508
samples from existing reported news articles, open emails, and alerts, as well
as 30 epidemiology-related questions. These questions necessitate the model's
expert reasoning abilities, thereby offering valuable insights into the
outbreak of the disease. The BAND dataset brings new challenges to the NLP
world, requiring better disguise capability of the content and the ability to
infer important information. We provide several benchmark tasks, including
Named Entity Recognition (NER), Question Answering (QA), and Event Extraction
(EE), to show how existing models are capable of handling these tasks in the
epidemiology domain. To the best of our knowledge, the BAND corpus is the
largest corpus of well-annotated biomedical outbreak alert news with
elaborately designed questions, making it a valuable resource for
epidemiologists and NLP researchers alike.
Related papers
- Disease Outbreak Detection and Forecasting: A Review of Methods and Data Sources [3.64584397341127]
Early detection and tracking of infectious disease outbreaks have the potential to reduce the mortality impact.
Many countries have implemented infectious disease surveillance systems, with the detection of epidemics being a primary objective.
The Internet and social media have become significant platforms where users share information about their preferences and relationships.
This article provides a review of the existing standard methods developed by researchers for detecting outbreaks using time series data.
arXiv Detail & Related papers (2024-10-21T16:20:06Z)
- Event Detection from Social Media for Epidemic Prediction [76.90779562626541]
We develop a framework to extract and analyze epidemic-related events from social media posts.
Experimentation reveals how ED models trained on COVID-based SPEED can effectively detect epidemic events for three unseen epidemics.
We show that reporting sharp increases in the extracted events by our framework can provide warnings 4-9 weeks earlier than the WHO epidemic declaration for Monkeypox.
arXiv Detail & Related papers (2024-04-02T06:31:17Z)
- Progress and Opportunities of Foundation Models in Bioinformatics [77.74411726471439]
Foundation models (FMs) have ushered in a new era in computational biology, especially in the realm of deep learning.
Central to our focus is the application of FMs to specific biological problems, aiming to guide the research community in choosing appropriate FMs for their research needs.
The review analyzes challenges and limitations faced by FMs in biology, such as data noise, model explainability, and potential biases.
arXiv Detail & Related papers (2024-02-06T02:29:17Z)
- PHEE: A Dataset for Pharmacovigilance Event Extraction from Text [42.365919892504415]
PHEE is a novel dataset for pharmacovigilance comprising over 5000 annotated events from medical case reports and biomedical literature.
We describe the hierarchical event schema designed to provide coarse and fine-grained information about patients' demographics, treatments and (side) effects.
arXiv Detail & Related papers (2022-10-22T21:57:42Z)
- When Infodemic Meets Epidemic: a Systematic Literature Review [3.3454373538792543]
Social media offer significant amounts of data that can be leveraged for bio-surveillance.
This systematic literature review provides a methodical overview of the integration of social media in different epidemic-related contexts.
arXiv Detail & Related papers (2022-10-03T21:04:30Z)
- EBOCA: Evidences for BiOmedical Concepts Association Ontology [55.41644538483948]
This paper proposes EBOCA, an ontology that describes (i) biomedical domain concepts and associations between them, and (ii) evidences supporting these associations.
Test data from a subset of DISNET and automatic association extractions from texts have been transformed to create a Knowledge Graph that can be used in real scenarios.
arXiv Detail & Related papers (2022-08-01T18:47:03Z)
- Data-Centric Epidemic Forecasting: A Survey [56.99209141838794]
This survey delves into various data-driven methodological and practical advancements.
We enumerate the large number of epidemiological datasets and novel data streams that are relevant to epidemic forecasting.
We also discuss experiences and challenges that arise in real-world deployment of these forecasting systems.
arXiv Detail & Related papers (2022-07-19T16:15:11Z)
- Digital Epidemiology: A review [0.0]
Epidemiology has recently witnessed great advances based on computational models.
Big data, together with apps, enables models to be validated and refined with real-world data at scale.
Epidemics have to be approached from the lens of complexity as they require systemic solutions.
arXiv Detail & Related papers (2021-04-08T08:45:20Z)
- Infusing Disease Knowledge into BERT for Health Question Answering, Medical Inference and Disease Name Recognition [29.71396592575746]
We propose a new disease knowledge infusion training procedure and evaluate it on a suite of BERT models.
Experiments over the three tasks show that these models can be enhanced in nearly all cases.
arXiv Detail & Related papers (2020-10-08T03:14:38Z)
- COVI White Paper [67.04578448931741]
Contact tracing is an essential tool to change the course of the Covid-19 pandemic.
We present an overview of the rationale, design, ethical considerations and privacy strategy of COVI, a Covid-19 public peer-to-peer contact tracing and risk awareness mobile application developed in Canada.
arXiv Detail & Related papers (2020-05-18T07:40:49Z)