Statistical Analytics and Regional Representation Learning for COVID-19
Pandemic Understanding
- URL: http://arxiv.org/abs/2008.07342v1
- Date: Sat, 8 Aug 2020 03:35:16 GMT
- Title: Statistical Analytics and Regional Representation Learning for COVID-19
Pandemic Understanding
- Authors: Shayan Fazeli, Babak Moatamed, Majid Sarrafzadeh
- Abstract summary: The rapid spread of the novel coronavirus (COVID-19) has severely impacted almost all countries around the world.
This paper combines and processes an extensive collection of publicly available datasets to provide a unified information source.
A specific RNN-based inference pipeline called DoubleWindowLSTM-CP is proposed in this work for predictive event modeling.
- Score: 4.731074162093199
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The rapid spread of the novel coronavirus (COVID-19) has severely impacted
almost all countries around the world. It has not only placed a tremendous
burden on health-care providers, but it has also severely affected
the economy and social life. The presence of reliable data and the results
of in-depth statistical analyses provide researchers and policymakers with
invaluable information to understand this pandemic and its growth pattern more
clearly. This paper combines and processes an extensive collection of publicly
available datasets to provide a unified information source for representing
geographical regions with regard to their pandemic-related behavior. The
features are grouped into various categories to account for their impact based
on the higher-level concepts associated with them. This work uses several
correlation analysis techniques to observe value and order relationships
between features, feature groups, and COVID-19 occurrences. Dimensionality
reduction techniques and projection methodologies are used to elaborate on
individual and group importance of these representative features. A specific
RNN-based inference pipeline called DoubleWindowLSTM-CP is proposed in this
work for predictive event modeling. It utilizes sequential patterns and enables
concise record representation while using only a minimal amount of historical
data. The quantitative results of our statistical analytics indicated critical
patterns reflecting many of the expected collective behaviors and their
associated outcomes. Predictive modeling with a DoubleWindowLSTM-CP instance
exhibits efficient performance in quantitative and qualitative assessments
while reducing the need for extended and reliable historical information on the
pandemic.
Related papers
- Advancing Real-time Pandemic Forecasting Using Large Language Models: A COVID-19 Case Study [39.70947911556511]
Existing forecasting models struggle with the multifaceted nature of relevant data and robust results translation.
Our work introduces PandemicLLM, a novel framework that reformulates real-time forecasting of disease spread as a text reasoning problem.
The model is applied to the COVID-19 pandemic, and trained to utilize textual public health policies, genomic surveillance, spatial, and epidemiological time series data.
arXiv Detail & Related papers (2024-04-10T12:22:03Z)
- Cumulative Distribution Function based General Temporal Point Processes [49.758080415846884]
The CuFun model represents a novel approach to TPPs that revolves around the Cumulative Distribution Function (CDF).
Our approach addresses several critical issues inherent in traditional TPP modeling.
Our contributions encompass the introduction of a pioneering CDF-based TPP model and the development of a methodology for incorporating past event information into future event prediction.
arXiv Detail & Related papers (2024-02-01T07:21:30Z)
- Inference of Dependency Knowledge Graph for Electronic Health Records [13.35941801610195]
We propose a framework for deriving a sparse knowledge graph based on the dynamic log-linear topic model.
Within this model, the KG embeddings are estimated by performing singular value decomposition on the empirical pointwise mutual information matrix.
We then establish entrywise normality for the KG low-rank estimator, enabling the recovery of sparse graph edges with controlled type I error.
arXiv Detail & Related papers (2023-12-25T04:45:36Z)
- MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z)
- Sensitivity, Performance, Robustness: Deconstructing the Effect of Sociodemographic Prompting [64.80538055623842]
Sociodemographic prompting is a technique that steers the output of prompt-based models towards answers that humans with specific sociodemographic profiles would give.
We show that sociodemographic information affects model predictions and can be beneficial for improving zero-shot learning in subjective NLP tasks.
arXiv Detail & Related papers (2023-09-13T15:42:06Z)
- Data-Centric Epidemic Forecasting: A Survey [56.99209141838794]
This survey delves into various data-driven methodological and practical advancements.
We enumerate the large number of epidemiological datasets and novel data streams that are relevant to epidemic forecasting.
We also discuss experiences and challenges that arise in real-world deployment of these forecasting systems.
arXiv Detail & Related papers (2022-07-19T16:15:11Z)
- Selecting the suitable resampling strategy for imbalanced data classification regarding dataset properties [62.997667081978825]
In many application domains such as medicine, information retrieval, cybersecurity, social media, etc., datasets used for inducing classification models often have an unequal distribution of the instances of each class.
This situation, known as imbalanced data classification, causes low predictive performance for the minority class examples.
Oversampling and undersampling techniques are well-known strategies to deal with this problem by balancing the number of examples of each class.
arXiv Detail & Related papers (2021-12-15T18:56:39Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- Combining Graph Neural Networks and Spatio-temporal Disease Models to Predict COVID-19 Cases in Germany [0.0]
Several experts have stressed the need to account for human mobility to explain the spread of COVID-19.
Most statistical or epidemiological models cannot directly incorporate unstructured data sources, including data that may encode human mobility.
We propose a trade-off between both research directions and present a novel learning approach that combines the advantages of statistical regression and machine learning models.
arXiv Detail & Related papers (2021-01-03T16:39:00Z)
- Examining Deep Learning Models with Multiple Data Sources for COVID-19 Forecasting [10.052302234274256]
We design and analyze deep learning-based models for COVID-19 forecasting.
We consider multiple sources such as COVID-19 confirmed and death case count data and testing data for better predictions.
We propose clustering-based training for high-temporal forecasting.
arXiv Detail & Related papers (2020-10-27T17:52:02Z)
- DeepCOVIDNet: An Interpretable Deep Learning Model for Predictive Surveillance of COVID-19 Using Heterogeneous Features and their Interactions [2.30238915794052]
We propose a deep learning model to forecast the range of increase in COVID-19 infected cases in future days.
Using data collected from various sources, we estimate the range of increase in infected cases seven days into the future for all U.S. counties.
arXiv Detail & Related papers (2020-07-31T23:37:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.