A windowed correlation based feature selection method to improve time
series prediction of dengue fever cases
- URL: http://arxiv.org/abs/2104.10289v1
- Date: Wed, 21 Apr 2021 00:28:28 GMT
- Authors: Tanvir Ferdousi, Lee W. Cohnstaedt, and Caterina M. Scoglio
- Abstract summary: Prediction models can perform poorly in places with inadequate data.
A new framework is presented for windowing incidence data and computing time-shifted correlation-based metrics.
Recurrent neural network-based prediction models achieve up to 33.6% accuracy improvement on average.
- Score: 0.20072624123275526
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The performance of data-driven prediction models depends on the availability
of data samples for model training. A model that learns about dengue fever
incidence in a population uses historical data from that corresponding
location. Prediction can therefore perform poorly in places with inadequate
data. This work aims to enhance temporally limited dengue case data by
methodically adding epidemically relevant data from nearby locations as
predictors (features). A novel framework is presented for windowing incidence
data and computing time-shifted correlation-based metrics to quantify feature
relevance. The framework ranks incidence data of adjacent locations around a
target location by combining the correlation metric with two other metrics:
spatial distance and local prevalence. Recurrent neural network-based
prediction models achieve up to 33.6% accuracy improvement on average using the
proposed method compared to using training data from the target location only.
These models achieved mean absolute error (MAE) values as low as 0.128 on [0,1]
normalized incidence data for a municipality with the highest dengue prevalence
in Brazil's Espirito Santo. When predicting cases aggregated over geographical
ecoregions, the models achieved accuracy improvements up to 16.5%, using only
6.5% of incidence data from ranked feature sets. The paper also includes two
techniques for windowing time series data: fixed-sized windows and outbreak
detection windows. The two techniques perform comparably, while the outbreak
detection method uses less data for computations. The framework
presented in this paper is application-independent and could improve the
performance of prediction models where data from spatially adjacent locations
are available.
Related papers
- Joint Prediction Regions for time-series models [0.0]
It is easy to compute joint prediction regions (JPRs) when the data are IID.
This project aims to implement Wolf and Wunderli's method for constructing JPRs and compare it with other methods.
arXiv Detail & Related papers (2024-05-14T02:38:49Z)
- Spatial-temporal Forecasting for Regions without Observations [13.805203053973772]
We study spatial-temporal forecasting for a region of interest without any historical observations.
We propose a model named STSM for the task.
Our key insight is to learn from the locations that resemble those in the region of interest.
arXiv Detail & Related papers (2024-01-19T06:26:05Z)
- Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data [68.62134204367668]
This paper studies score approximation, estimation, and distribution recovery of diffusion models, when data are supported on an unknown low-dimensional linear subspace.
We show that with a properly chosen neural network architecture, the score function can be both accurately approximated and efficiently estimated.
The generated distribution based on the estimated score function captures the data geometric structures and converges to a close vicinity of the data distribution.
arXiv Detail & Related papers (2023-02-14T17:02:35Z)
- Multi-time Predictions of Wildfire Grid Map using Remote Sensing Local Data [0.0]
This paper proposes a distributed learning framework that shares local data collected in ten locations in the western USA throughout local agents.
The proposed model has distinct features that address the characteristic need in prediction evaluations, including dynamic online estimation and time-series modeling.
arXiv Detail & Related papers (2022-09-15T22:34:06Z)
- Convolutional generative adversarial imputation networks for spatio-temporal missing data in storm surge simulations [86.5302150777089]
Generative Adversarial Imputation Nets (GAIN) and GAN-based techniques have attracted attention as unsupervised machine learning methods.
We name our proposed method Convolutional Generative Adversarial Imputation Nets (Conv-GAIN).
arXiv Detail & Related papers (2021-11-03T03:50:48Z)
- Heterogeneous Data Fusion Considering Spatial Correlations using Graph Convolutional Networks and its Application in Air Quality Prediction [4.9960351797442515]
This paper proposes a deep learning method for fusing heterogeneous data collected from multiple monitoring points using graph convolutional networks (GCNs).
In the application scenario of air quality prediction, it is observed that the fused data derived from the RBF-based fusion approach achieve satisfactory consistency.
The proposed method is applicable for similar scenarios where continuous heterogeneous data are collected from multiple monitoring points scattered across a study area.
arXiv Detail & Related papers (2021-05-24T15:57:31Z)
- Predicting traffic signals on transportation networks using spatio-temporal correlations on graphs [56.48498624951417]
This paper proposes a traffic propagation model that merges multiple heat diffusion kernels into a data-driven prediction model to forecast traffic signals.
We optimize the model parameters using Bayesian inference to minimize the prediction errors and, consequently, determine the mixing ratio of the two approaches.
The proposed model demonstrates prediction accuracy comparable to that of the state-of-the-art deep neural networks with lower computational effort.
arXiv Detail & Related papers (2021-04-27T18:17:42Z)
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
They are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
- Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics [118.75207687144817]
We introduce Data Maps, a model-based tool to characterize and diagnose datasets.
We leverage a largely ignored source of information: the behavior of the model on individual instances during training.
Our results indicate that a shift in focus from quantity to quality of data could lead to robust models and improved out-of-distribution generalization.
arXiv Detail & Related papers (2020-09-22T20:19:41Z)
- Mission-Aware Spatio-Temporal Deep Learning Model for UAS Instantaneous Density Prediction [3.59465210252619]
The number of daily sUAS operations in uncontrolled low-altitude airspace is expected to reach into the millions in a few years.
A deep learning-based UAS instantaneous density prediction model is presented.
arXiv Detail & Related papers (2020-03-22T02:40:28Z)
- Parameter Space Factorization for Zero-Shot Learning across Tasks and Languages [112.65994041398481]
We propose a Bayesian generative model for the space of neural parameters.
We infer the posteriors over such latent variables based on data from seen task-language combinations.
Our model yields comparable or better results than state-of-the-art, zero-shot cross-lingual transfer methods.
arXiv Detail & Related papers (2020-01-30T16:58:56Z)