A Large-Scale Dataset of Search Interests Related to Disease X
Originating from Different Geographic Regions
- URL: http://arxiv.org/abs/2312.11885v1
- Date: Tue, 19 Dec 2023 06:20:27 GMT
- Title: A Large-Scale Dataset of Search Interests Related to Disease X
Originating from Different Geographic Regions
- Authors: Nirmalya Thakur, Shuqi Cui, Kesha A. Patel, Isabella Hall, and Yuvraj
Nihal Duggal
- Abstract summary: This paper presents a dataset of web behavior related to Disease X between February 2018 and August 2023.
The dataset was developed by collecting data using Google Trends.
The relevant search interests for each of the 94 geographic regions covered, for each month in this time range, are available in this dataset.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The World Health Organization added Disease X to their shortlist of blueprint
priority diseases to represent a hypothetical, unknown pathogen that could
cause a future epidemic. During different virus outbreaks of the past, such as
COVID-19, Influenza, Lyme Disease, and Zika virus, researchers from various
disciplines utilized Google Trends to mine multimodal components of web
behavior to study, investigate, and analyze the global awareness, preparedness,
and response associated with these respective virus outbreaks. As the world
prepares for Disease X, a dataset on web behavior related to Disease X would be
crucial to contribute towards the timely advancement of research in this field.
Furthermore, none of the prior works in this field have focused on the
development of a dataset to compile relevant web behavior data, which would
help to prepare for Disease X. To address these research challenges, this work
presents a dataset of web behavior related to Disease X, which emerged from
different geographic regions of the world, between February 2018 and August
2023. Specifically, this dataset presents the search interests related to
Disease X from 94 geographic regions. The dataset was developed by collecting
data using Google Trends. The relevant search interests for all these regions
for each month in this time range are available in this dataset. This paper
also discusses the compliance of this dataset with the FAIR principles of
scientific data management. Finally, an analysis of this dataset is presented
to uphold the applicability, relevance, and usefulness of this dataset for the
investigation of different research questions in the interrelated fields of Big
Data, Data Mining, Healthcare, Epidemiology, and Data Analysis with a specific
focus on Disease X.
Related papers
- Data-Centric Epidemic Forecasting: A Survey [56.99209141838794]
This survey delves into various data-driven methodological and practical advancements.
We enumerate the large number of epidemiological datasets and novel data streams that are relevant to epidemic forecasting.
We also discuss experiences and challenges that arise in real-world deployment of these forecasting systems.
arXiv Detail & Related papers (2022-07-19T16:15:11Z)
- Modern Machine-Learning Predictive Models for Diagnosing Infectious Diseases [0.0]
This paper reviewed research articles for recent machine-learning (ML) algorithms applied to infectious disease diagnosis.
We found that most of the articles used small datasets, and few of them used real-time data.
Our results demonstrated that a suitable ML technique depends on the nature of the dataset and the desired goal.
arXiv Detail & Related papers (2022-06-15T08:19:16Z)
- Investigating the Relationship Between World Development Indicators and the Occurrence of Disease Outbreaks in the 21st Century: A Case Study [0.0]
The timely identification of socio-economic sectors vulnerable to a disease outbreak presents an important challenge to the civic authorities.
We leverage data-driven models to determine the relationship between the trends of World Development Indicators and the occurrence of disease outbreaks.
arXiv Detail & Related papers (2021-09-20T06:31:03Z)
- STOPPAGE: Spatio-temporal Data Driven Cloud-Fog-Edge Computing Framework for Pandemic Monitoring and Management [28.205715426050105]
An analytics framework is needed to deliver insights for improving administrative policy and enhancing preparedness to combat the pandemic.
This paper proposes a spatio-temporal knowledge mining framework, named STOPPAGE, to model the impact of human mobility and contextual information over large geographic areas at different temporal scales.
The framework has two modules: (i) a spatio-temporal data and computing infrastructure using a fog/edge-based architecture; and (ii) a spatio-temporal data analytics module to efficiently extract knowledge from heterogeneous data sources.
arXiv Detail & Related papers (2021-04-04T12:29:31Z)
- Challenges in biomarker discovery and biorepository for Gulf-war-disease studies: a novel data platform solution [48.7576911714538]
We introduce a novel data platform, named ROSALIND, to overcome the challenges, foster healthy and vital collaborations and advance scientific inquiries.
We follow the principles etched in the platform name - ROSALIND stands for resource organisms with self-governed accessibility, linkability, integrability, neutrality, and dependability.
The deployment of ROSALIND in our GWI study over the past 12 months has accelerated the pace of data experimentation and analysis, removed numerous error sources, and increased research quality and productivity.
arXiv Detail & Related papers (2021-02-04T20:38:30Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta-learning techniques to develop a new model that can extract common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, which is a simple yet effective meta-learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
- Data Mining Approach to Analyze Covid19 Dataset of Brazilian Patients [0.0]
The pandemic was caused by the coronavirus disease (COVID-19), a name coined by the World Health Organization in early 2020.
Almost all countries have reported COVID-19 positive cases, and governments are choosing different health policies to stop the infection.
Brazil is one of the countries with the most infections; as of August 11, it had a total of 3,112,393 cases.
arXiv Detail & Related papers (2020-08-26T02:21:56Z)
- Understanding the temporal evolution of COVID-19 research through machine learning and natural language processing [66.63200823918429]
The outbreak of the novel coronavirus disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has been continuously affecting human lives and communities around the world.
We used multiple data sources, i.e., PubMed and ArXiv, and built several machine learning models to characterize the landscape of current COVID-19 research.
Our findings confirm the types of research available in PubMed and ArXiv differ significantly, with the former exhibiting greater diversity in terms of COVID-19 related issues.
arXiv Detail & Related papers (2020-07-22T18:02:39Z)
- Cross-lingual Transfer Learning for COVID-19 Outbreak Alignment [90.12602012910465]
We train on Italy's early COVID-19 outbreak through Twitter and transfer to several other countries.
Our experiments show strong results with up to 0.85 Spearman correlation in cross-country predictions.
arXiv Detail & Related papers (2020-06-05T02:04:25Z)
- A County-level Dataset for Informing the United States' Response to COVID-19 [5.682299443164938]
We present a dataset that aggregates relevant data from governmental, journalistic, and academic sources on the U.S. county level.
Our dataset contains more than 300 variables that summarize population estimates, demographics, ethnicity, housing, education, employment and income, climate, transit, scores, and healthcare system-related metrics.
arXiv Detail & Related papers (2020-04-01T05:07:27Z)
- Mapping the Landscape of Artificial Intelligence Applications against COVID-19 [59.30734371401316]
COVID-19, the disease caused by the SARS-CoV-2 virus, has been declared a pandemic by the World Health Organization.
We present an overview of recent studies using Machine Learning and, more broadly, Artificial Intelligence to tackle many aspects of the COVID-19 crisis.
arXiv Detail & Related papers (2020-03-25T12:30:33Z)