Demystifying Deep Learning in Predictive Spatio-Temporal Analytics: An
Information-Theoretic Framework
- URL: http://arxiv.org/abs/2009.06304v2
- Date: Thu, 17 Sep 2020 09:21:27 GMT
- Title: Demystifying Deep Learning in Predictive Spatio-Temporal Analytics: An
Information-Theoretic Framework
- Authors: Qi Tan, Yang Liu, Jiming Liu
- Abstract summary: We provide a comprehensive framework for deep learning model design and information-theoretic analysis.
First, we develop and demonstrate a novel interactively- and integratively-connected deep recurrent neural network (I$^2$DRNN) model.
Second, to theoretically prove that our designed model can learn multi-scale spatio-temporal dependency in PSTA tasks, we provide an information-theoretic analysis.
- Score: 20.28063653485698
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning has achieved incredible success over the past years, especially
in various challenging predictive spatio-temporal analytics (PSTA) tasks, such
as disease prediction, climate forecast, and traffic prediction, where
intrinsic dependency relationships among data exist and generally manifest at
multiple spatio-temporal scales. However, given a specific PSTA task and the
corresponding dataset, how to appropriately determine the desired configuration
of a deep learning model, theoretically analyze the model's learning behavior,
and quantitatively characterize the model's learning capacity remains a
mystery. In order to demystify the power of deep learning for PSTA, in this
paper, we provide a comprehensive framework for deep learning model design and
information-theoretic analysis. First, we develop and demonstrate a novel
interactively- and integratively-connected deep recurrent neural network
(I$^2$DRNN) model. I$^2$DRNN consists of three modules: an Input module that
integrates data from heterogeneous sources; a Hidden module that captures the
information at different scales while allowing the information to flow
interactively between layers; and an Output module that models the integrative
effects of information from various hidden layers to generate the output
predictions. Second, to theoretically prove that our designed model can learn
multi-scale spatio-temporal dependency in PSTA tasks, we provide an
information-theoretic analysis to examine the information-based learning
capacity (i-CAP) of the proposed model. Third, to validate the I$^2$DRNN model
and confirm its i-CAP, we systematically conduct a series of experiments
involving both synthetic datasets and real-world PSTA tasks. The experimental
results show that the I$^2$DRNN model outperforms both classical and
state-of-the-art models, and is able to capture meaningful multi-scale
spatio-temporal dependency.
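The three-module layout described in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: the class name `TinyI2DRNN`, the helper functions, the tanh recurrence, the random untrained weights, and in particular the choice to let every hidden layer read the previous states of all layers are assumptions made only to illustrate the idea of an Input module that merges sources, a Hidden module with cross-layer information flow, and an Output module that reads from all hidden layers.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a real model would learn these by training."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class TinyI2DRNN:
    """Hypothetical, untrained toy with Input, Hidden, and Output modules."""

    def __init__(self, source_dims, hidden_dim, num_layers, out_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        self.num_layers = num_layers
        all_states = num_layers * hidden_dim
        # Input module: project the concatenated sources into one vector.
        self.W_in = rand_matrix(hidden_dim, sum(source_dims), rng)
        # Hidden module: each layer sees its bottom-up input plus the previous
        # states of ALL layers (an assumed form of "interactive" connections).
        self.W_h = [rand_matrix(hidden_dim, hidden_dim + all_states, rng)
                    for _ in range(num_layers)]
        # Output module: read out from every hidden layer jointly.
        self.W_out = rand_matrix(out_dim, all_states, rng)

    def forward(self, sequence):
        """sequence: list of time steps; each step is a list of source vectors."""
        h = [[0.0] * self.hidden_dim for _ in range(self.num_layers)]
        predictions = []
        for sources in sequence:
            # Integrate heterogeneous sources by concatenation.
            x = [v for src in sources for v in src]
            bottom_up = matvec(self.W_in, x)
            prev_all = [v for layer_state in h for v in layer_state]
            new_h = []
            for layer in range(self.num_layers):
                pre = matvec(self.W_h[layer], bottom_up + prev_all)
                state = [math.tanh(p) for p in pre]
                new_h.append(state)
                bottom_up = state  # feeds the next layer up at this time step
            h = new_h
            joint = [v for layer_state in h for v in layer_state]
            predictions.append(matvec(self.W_out, joint))
        return predictions


# Two made-up sources (2 and 3 features) observed over 5 time steps.
model = TinyI2DRNN(source_dims=[2, 3], hidden_dim=4, num_layers=3, out_dim=1)
steps = [[[0.1 * t, 0.2], [0.3, 0.4, 0.5 * t]] for t in range(5)]
preds = model.forward(steps)
print(len(preds), len(preds[0]))  # prints "5 1"
```

The sketch covers only the forward pass; training, the paper's actual connection pattern between layers, and the i-CAP analysis are outside its scope.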
Related papers
- iNNspector: Visual, Interactive Deep Model Debugging [8.997568393450768]
We propose a conceptual framework structuring the data space of deep learning experiments.
Our framework captures design dimensions and proposes mechanisms to make this data explorable and tractable.
We present the iNNspector system, which enables tracking of deep learning experiments and provides interactive visualizations of the data.
arXiv Detail & Related papers (2024-07-25T12:48:41Z)
- An Empirical Study of Deep Learning Models for Vulnerability Detection [4.243592852049963]
We surveyed and reproduced 9 state-of-the-art deep learning models on 2 widely used vulnerability detection datasets.
We investigated model capabilities, training data, and model interpretation.
Our findings can help better understand model results, provide guidance on preparing training data, and improve the robustness of the models.
arXiv Detail & Related papers (2022-12-15T19:49:34Z)
- Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z)
- Mixed Effects Neural ODE: A Variational Approximation for Analyzing the
Dynamics of Panel Data [50.23363975709122]
We propose a probabilistic model called ME-NODE to incorporate (fixed + random) mixed effects for analyzing panel data.
We show that our model can be derived using smooth approximations of SDEs provided by the Wong-Zakai theorem.
We then derive Evidence Based Lower Bounds for ME-NODE, and develop (efficient) training algorithms.
arXiv Detail & Related papers (2022-02-18T22:41:51Z)
- Leveraging the structure of dynamical systems for data-driven modeling [111.45324708884813]
We consider the impact of the training set and its structure on the quality of the long-term prediction.
We show how an informed design of the training set, based on invariants of the system and the structure of the underlying attractor, significantly improves the resulting models.
arXiv Detail & Related papers (2021-12-15T20:09:20Z)
- Learning Dynamics Models for Model Predictive Agents [28.063080817465934]
Model-Based Reinforcement Learning involves learning a dynamics model from data, and then using this model to optimise behaviour.
This paper sets out to disambiguate the role of different design choices for learning dynamics models, by comparing their performance to planning with a ground-truth model.
arXiv Detail & Related papers (2021-09-29T09:50:25Z)
- Closed-form Continuous-Depth Models [99.40335716948101]
Continuous-depth neural models rely on advanced numerical differential equation solvers.
We present a new family of models, termed Closed-form Continuous-depth (CfC) networks, that are simple to describe and at least one order of magnitude faster.
arXiv Detail & Related papers (2021-06-25T22:08:51Z)
- A multi-stage machine learning model on diagnosis of esophageal
manometry [50.591267188664666]
The framework includes deep-learning models at the swallow-level stage and feature-based machine learning models at the study-level stage.
This is the first artificial-intelligence-style model to automatically predict CC diagnosis of HRM study from raw multi-swallow data.
arXiv Detail & Related papers (2021-06-25T20:09:23Z)
- Model Complexity of Deep Learning: A Survey [79.20117679251766]
We conduct a systematic overview of the latest studies on model complexity in deep learning.
We review the existing studies on those two categories along four important factors, including model framework, model size, optimization process and data complexity.
arXiv Detail & Related papers (2021-03-08T22:39:32Z)
- Semi-Structured Deep Piecewise Exponential Models [2.7728956081909346]
We propose a versatile framework for survival analysis that combines advanced concepts from statistics with deep learning.
A proof of concept is provided by using the framework to predict Alzheimer's disease progression.
arXiv Detail & Related papers (2020-11-11T14:41:19Z)
- PAC Bounds for Imitation and Model-based Batch Learning of Contextual
Markov Decision Processes [31.83144400718369]
We consider the problem of batch multi-task reinforcement learning with observed context descriptors, motivated by its application to personalized medical treatment.
We study two general classes of learning algorithms: direct policy learning (DPL), an imitation-learning based approach which learns from expert trajectories, and model-based learning.
arXiv Detail & Related papers (2020-06-11T11:57:08Z)