Sparsity in Continuous-Depth Neural Networks
- URL: http://arxiv.org/abs/2210.14672v1
- Date: Wed, 26 Oct 2022 12:48:12 GMT
- Title: Sparsity in Continuous-Depth Neural Networks
- Authors: Hananeh Aliee, Till Richter, Mikhail Solonin, Ignacio Ibarra, Fabian
Theis, Niki Kilbertus
- Abstract summary: We study the influence of weight and feature sparsity on forecasting and on identifying the underlying dynamical laws.
We curate real-world datasets consisting of human motion capture and human hematopoiesis single-cell RNA-seq data.
- Score: 2.969794498016257
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Neural Ordinary Differential Equations (NODEs) have proven successful in
learning dynamical systems in terms of accurately recovering the observed
trajectories. While different types of sparsity have been proposed to improve
robustness, the generalization properties of NODEs for dynamical systems beyond
the observed data are underexplored. We systematically study the influence of
weight and feature sparsity on forecasting as well as on identifying the
underlying dynamical laws. Besides assessing existing methods, we propose a
regularization technique to sparsify "input-output connections" and extract
relevant features during training. Moreover, we curate real-world datasets
consisting of human motion capture and human hematopoiesis single-cell RNA-seq
data to realistically analyze different levels of out-of-distribution (OOD)
generalization in forecasting and dynamics identification respectively. Our
extensive empirical evaluation on these challenging benchmarks suggests that
weight sparsity improves generalization in the presence of noise or irregular
sampling. However, it does not prevent learning spurious feature dependencies
in the inferred dynamics, rendering them impractical for predictions under
interventions, or for inferring the true underlying dynamics. Instead, feature
sparsity can indeed help with recovering sparse ground-truth dynamics compared
to unregularized NODEs.
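The abstract names the technique but not its exact form; below is a minimal PyTorch sketch of one standard way to sparsify input-output connections in a NODE: a group-lasso penalty on the columns of the vector field's first layer, which drives irrelevant input features out of the learned dynamics. The module names, the fixed-step RK4 solver, and the penalty weight are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch: feature sparsity for a Neural ODE via a group-lasso penalty
# on the input columns of the first layer. It illustrates the general idea of
# sparsifying input-output connections; it is NOT the paper's exact method.
import torch
import torch.nn as nn

class ODEFunc(nn.Module):
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.inp = nn.Linear(dim, hidden)  # layer whose input columns we sparsify
        self.net = nn.Sequential(nn.Tanh(), nn.Linear(hidden, dim))

    def forward(self, t, x):
        return self.net(self.inp(x))

    def feature_group_lasso(self):
        # L2 norm of each input column: driving a whole column to zero
        # removes that feature's influence on the learned dynamics.
        return self.inp.weight.norm(dim=0).sum()

def rk4_step(f, t, x, dt):
    # Classic fixed-step RK4 integrator (stand-in for an ODE solver library).
    k1 = f(t, x)
    k2 = f(t + dt / 2, x + dt * k1 / 2)
    k3 = f(t + dt / 2, x + dt * k2 / 2)
    k4 = f(t + dt, x + dt * k3)
    return x + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def loss_fn(func, x0, traj, ts, lam=1e-2):
    # Trajectory reconstruction loss plus the feature-sparsity penalty.
    x, mse = x0, 0.0
    for i in range(1, len(ts)):
        x = rk4_step(func, ts[i - 1], x, ts[i] - ts[i - 1])
        mse = mse + ((x - traj[i]) ** 2).mean()
    return mse / (len(ts) - 1) + lam * func.feature_group_lasso()
```

Because the penalty acts on whole columns, a zeroed column removes that feature from every output of dx/dt; this is the feature-level sparsity the abstract contrasts with weight-level (magnitude) sparsity.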
Related papers
- Reinforcement Learning under Latent Dynamics: Toward Statistical and Algorithmic Modularity [51.40558987254471]
Real-world applications of reinforcement learning often involve environments where agents operate on complex, high-dimensional observations.
This paper addresses the question of reinforcement learning under $\textit{general}$ latent dynamics from a statistical and algorithmic perspective.
arXiv Detail & Related papers (2024-10-23T14:22:49Z)
- Learning Continuous Network Emerging Dynamics from Scarce Observations via Data-Adaptive Stochastic Processes [11.494631894700253]
We introduce ODE Processes for Network Dynamics (NDP4ND), a new class of processes governed by data-adaptive network dynamics.
We show that the proposed method has excellent data and computational efficiency, and can adapt to unseen emerging network dynamics.
arXiv Detail & Related papers (2023-10-25T08:44:05Z)
- Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose a model-agnostic framework to boost causal discovery performance by dynamically learning the adaptive weights for the Reweighted Score function, ReScore.
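As a hedged toy illustration of the general recipe (not ReScore as published), adaptive sample reweighting can be pictured as learnable per-sample weights rescaling each sample's contribution to a differentiable score:

```python
# Hedged toy illustration of adaptive sample reweighting (not ReScore as
# published): learnable per-sample weights, normalized by a softmax, rescale
# each sample's contribution to a differentiable score.
import torch

def reweighted_score(per_sample_loss, logits):
    weights = torch.softmax(logits, dim=0)  # nonnegative, sums to 1
    return (weights * per_sample_loss).sum()

# In an alternating scheme, the model is updated to minimize this score
# while the logits are adapted to up-weight poorly fit samples.
losses = torch.tensor([0.1, 0.9, 0.4])
logits = torch.zeros(3, requires_grad=True)  # uniform weights initially
print(reweighted_score(losses, logits))      # equals the mean loss at init
```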
arXiv Detail & Related papers (2023-03-06T14:49:59Z)
- Learning Fine Scale Dynamics from Coarse Observations via Inner Recurrence [0.0]
Recent work has focused on data-driven learning of the evolution of unknown systems via deep neural networks (DNNs).
This paper presents a computational technique to learn the fine-scale dynamics from such coarsely observed data.
arXiv Detail & Related papers (2022-06-03T20:28:52Z)
- Capturing Actionable Dynamics with Structured Latent Ordinary Differential Equations [68.62843292346813]
We propose a structured latent ODE model that captures system input variations within its latent representation.
Building on a static variable specification, our model learns factors of variation for each input to the system, thus separating the effects of the system inputs in the latent space.
arXiv Detail & Related papers (2022-02-25T20:00:56Z)
- Continuous Forecasting via Neural Eigen Decomposition of Stochastic Dynamics [47.82509795873254]
We introduce the Neural Eigen-SDE (NESDE) algorithm for sequential prediction with sparse observations and adaptive dynamics.
NESDE applies eigen-decomposition to the dynamics model to allow efficient frequent predictions given sparse observations.
We are the first to provide patient-adapted predictions of blood coagulation following heparin dosing in the MIMIC-IV dataset.
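As a hedged toy illustration of the linear-algebra idea behind such eigen-decomposed dynamics (not the NESDE algorithm itself): once a linear dynamics matrix is eigendecomposed, predictions at arbitrary, irregular times cost only an elementwise exponential and a matrix product, with no step-by-step integration.

```python
# Toy sketch of the linear-algebra idea behind eigen-decomposed dynamics
# (not NESDE itself): for linear dynamics dx/dt = A x with A = V diag(lam) V^{-1},
# the solution x(t) = V diag(exp(lam * t)) V^{-1} x0 can be evaluated at
# arbitrary times directly.
import numpy as np

def eigen_predict(A, x0, ts):
    lam, V = np.linalg.eig(A)                   # one-off eigendecomposition
    c = np.linalg.solve(V, x0.astype(complex))  # x0 in the eigenbasis
    # Each query time costs only an elementwise exponential and a matmul.
    return np.real(np.stack([V @ (np.exp(lam * t) * c) for t in ts]))

# Example: damped rotation, queried at sparse, irregular times.
A = np.array([[-0.1, -1.0], [1.0, -0.1]])
print(eigen_predict(A, np.array([1.0, 0.0]), ts=[0.0, 0.5, 3.7, 10.0]))
```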
arXiv Detail & Related papers (2022-01-31T22:16:50Z)
- Noisy Recurrent Neural Networks [45.94390701863504]
We study recurrent neural networks (RNNs) trained by injecting noise into hidden states as discretizations of differential equations driven by input data.
We find that, under reasonable assumptions, this implicit regularization promotes flatter minima; it biases towards models with more stable dynamics; and, in classification tasks, it favors models with larger classification margin.
Our theory is supported by empirical results which demonstrate improved robustness with respect to various input perturbations, while maintaining state-of-the-art performance.
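A minimal sketch of the construction described here, assuming a tanh drift and Gaussian state noise (the cell parameterization is an illustrative choice): noise enters the hidden state as the diffusion term of an Euler-Maruyama step and is switched off at test time to recover the deterministic discretization.

```python
# Minimal sketch of a "noisy RNN" cell: the hidden state follows an
# Euler-Maruyama discretization of an SDE driven by the input sequence.
# The names and cell parameterization are illustrative assumptions.
import torch
import torch.nn as nn

class NoisyRNNCell(nn.Module):
    def __init__(self, input_dim, hidden_dim, dt=0.1, sigma=0.05):
        super().__init__()
        self.drift = nn.Linear(input_dim + hidden_dim, hidden_dim)
        self.dt, self.sigma = dt, sigma

    def forward(self, x, h):
        drift = torch.tanh(self.drift(torch.cat([x, h], dim=-1)))
        if self.training:
            # sqrt(dt)-scaled Gaussian noise: the Euler-Maruyama diffusion
            # term, acting as implicit regularization during training.
            h = h + self.dt * drift + self.sigma * (self.dt ** 0.5) * torch.randn_like(h)
        else:
            h = h + self.dt * drift  # noise-free ODE discretization at test time
        return h
```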
arXiv Detail & Related papers (2021-02-09T15:20:50Z)
- Gradient Starvation: A Learning Proclivity in Neural Networks [97.02382916372594]
Gradient Starvation arises when cross-entropy loss is minimized by capturing only a subset of features relevant for the task.
This work provides a theoretical explanation for the emergence of such feature imbalance in neural networks.
arXiv Detail & Related papers (2020-11-18T18:52:08Z)
- Multiplicative noise and heavy tails in stochastic optimization [62.993432503309485]
Stochastic optimization is central to modern machine learning, but the precise role of its stochasticity in that success is still unclear.
We show that heavy-tailed behavior commonly arises in the parameters due to multiplicative noise caused by variance in local convergence rates.
A detailed analysis describes how key factors, including step size and data, shape this behavior, with similar results observed on state-of-the-art neural network models.
arXiv Detail & Related papers (2020-06-11T09:58:01Z)