Sparse Flows: Pruning Continuous-depth Models
- URL: http://arxiv.org/abs/2106.12718v1
- Date: Thu, 24 Jun 2021 01:40:17 GMT
- Title: Sparse Flows: Pruning Continuous-depth Models
- Authors: Lucas Liebenwein, Ramin Hasani, Alexander Amini, Daniela Rus
- Abstract summary: We show that pruning improves generalization for neural ODEs in generative modeling.
We also show that pruning finds minimal and efficient neural ODE representations with up to 98% fewer parameters than the original network, without loss of accuracy.
- Score: 107.98191032466544
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Continuous deep learning architectures enable learning of flexible
probabilistic models for predictive modeling as neural ordinary differential
equations (ODEs), and for generative modeling as continuous normalizing flows.
In this work, we design a framework to decipher the internal dynamics of these
continuous depth models by pruning their network architectures. Our empirical
results suggest that pruning improves generalization for neural ODEs in
generative modeling. Moreover, pruning finds minimal and efficient neural ODE
representations with up to 98% fewer parameters than the original network,
without loss of accuracy. Finally, we show that by applying pruning we can
obtain insightful information about the design of better neural ODEs. We hope
our results will invigorate further research into the performance-size
trade-offs of modern continuous-depth models.
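For readers who want to experiment, the sketch below shows one way to combine a neural ODE with magnitude pruning. It is not the authors' framework: it assumes PyTorch with the `torchdiffeq` package, and the `ODEFunc`/`NeuralODE` class names and the 98% sparsity level are illustrative choices, not prescriptions from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune
from torchdiffeq import odeint  # pip install torchdiffeq


class ODEFunc(nn.Module):
    """Vector field f(t, z) that parametrizes the dynamics dz/dt."""

    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden),
            nn.Tanh(),
            nn.Linear(hidden, dim),
        )

    def forward(self, t, z):
        return self.net(z)


class NeuralODE(nn.Module):
    """Maps z(0) to z(1) by integrating the vector field."""

    def __init__(self, dim: int):
        super().__init__()
        self.func = ODEFunc(dim)

    def forward(self, z0):
        t = torch.tensor([0.0, 1.0])
        return odeint(self.func, z0, t)[-1]  # state at t = 1


model = NeuralODE(dim=2)

# Unstructured magnitude pruning: zero out the smallest-magnitude weights of
# every linear layer in the vector field. The 98% level mirrors the parameter
# reduction reported in the abstract; in practice it would be tuned per task,
# typically with retraining between successive pruning steps.
for module in model.func.net:
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.98)
        prune.remove(module, "weight")  # bake the zeros into the weight tensor

z = model(torch.randn(8, 2))
print(z.shape)  # torch.Size([8, 2])
```

In the continuous normalizing flow setting, the same pruned vector field also drives the log-density of samples through the instantaneous change-of-variables formula of Chen et al. (2018), d log p(z(t))/dt = -Tr(∂f/∂z(t)), so sparsifying f simplifies both the sample dynamics and the density computation.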
Related papers
- Generalized Factor Neural Network Model for High-dimensional Regression [50.554377879576066]
We tackle the challenges of modeling high-dimensional data sets with latent low-dimensional structures hidden within complex, non-linear, and noisy relationships.
Our approach enables a seamless integration of concepts from non-parametric regression, factor models, and neural networks for high-dimensional regression.
arXiv Detail & Related papers (2025-02-16T23:13:55Z)
- Architecture-Aware Learning Curve Extrapolation via Graph Ordinary Differential Equation
We propose a novel architecture-aware neural differential equation model to forecast learning curves continuously.
Our model outperforms current state-of-the-art learning curve methods and extrapolation approaches for both pure time-series modeling and CNN-based learning curves.
arXiv Detail & Related papers (2024-12-20T04:28:02Z)
- Jet: A Modern Transformer-Based Normalizing Flow [62.2573739835562]
We revisit the design of the coupling-based normalizing flow models by carefully ablating prior design choices.
We achieve state-of-the-art quantitative and qualitative performance with a much simpler architecture.
arXiv Detail & Related papers (2024-12-19T18:09:42Z)
- Neural Residual Diffusion Models for Deep Scalable Vision Generation [17.931568104324985]
We propose a unified and massively scalable Neural Residual Diffusion Models framework (Neural-RDM).
The proposed neural residual models obtain state-of-the-art scores on image and video generative benchmarks.
arXiv Detail & Related papers (2024-06-19T04:57:18Z)
- On the Trade-off Between Efficiency and Precision of Neural Abstraction [62.046646433536104]
Neural abstractions have been recently introduced as formal approximations of complex, nonlinear dynamical models.
We employ formal inductive synthesis procedures to generate neural abstractions that result in dynamical models with these semantics.
arXiv Detail & Related papers (2023-07-28T13:22:32Z)
- Do We Need an Encoder-Decoder to Model Dynamical Systems on Networks? [18.92828441607381]
We show that embeddings induce a model that fits observations well but simultaneously has incorrect dynamical behaviours.
We propose a simple embedding-free alternative based on parametrising two additive vector-field components.
arXiv Detail & Related papers (2023-05-20T12:41:47Z)
- Standalone Neural ODEs with Sensitivity Analysis [5.565364597145569]
This paper presents a continuous-depth neural ODE model capable of describing a full deep neural network.
We present a general formulation of the neural sensitivity problem and show how it is used in NCG training.
Our evaluations demonstrate that our novel formulations lead to increased robustness and performance as compared to ResNet models.
arXiv Detail & Related papers (2022-05-27T12:16:53Z)
- EINNs: Epidemiologically-Informed Neural Networks [75.34199997857341]
We introduce EINNs, a new class of physics-informed neural networks crafted for epidemic forecasting.
We investigate how to leverage both the theoretical flexibility provided by mechanistic models and the data-driven expressiveness afforded by AI models.
arXiv Detail & Related papers (2022-02-21T18:59:03Z)
- Closed-form Continuous-Depth Models [99.40335716948101]
Continuous-depth neural models rely on advanced numerical differential equation solvers.
We present a new family of models, termed Closed-form Continuous-depth (CfC) networks, that are simple to describe and at least one order of magnitude faster.
arXiv Detail & Related papers (2021-06-25T22:08:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.