Sparse Flows: Pruning Continuous-depth Models
- URL: http://arxiv.org/abs/2106.12718v1
- Date: Thu, 24 Jun 2021 01:40:17 GMT
- Title: Sparse Flows: Pruning Continuous-depth Models
- Authors: Lucas Liebenwein, Ramin Hasani, Alexander Amini, Daniela Rus
- Abstract summary: We show that pruning improves generalization for neural ODEs in generative modeling.
We also show that pruning finds minimal and efficient neural ODE representations with up to 98% fewer parameters than the original network, without loss of accuracy.
- Score: 107.98191032466544
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Continuous deep learning architectures enable learning of flexible
probabilistic models for predictive modeling as neural ordinary differential
equations (ODEs), and for generative modeling as continuous normalizing flows.
In this work, we design a framework to decipher the internal dynamics of these
continuous depth models by pruning their network architectures. Our empirical
results suggest that pruning improves generalization for neural ODEs in
generative modeling. Moreover, pruning finds minimal and efficient neural ODE
representations with up to 98% fewer parameters than the original network,
without loss of accuracy. Finally, we show that by applying pruning we can
obtain insightful information about the design of better neural ODEs. We hope
our results will invigorate further research into the performance-size
trade-offs of modern continuous-depth models.
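For readers who want to experiment, the sketch below shows one way to combine a neural ODE with magnitude pruning. It is not the authors' framework: it assumes PyTorch with the `torchdiffeq` package, and the `ODEFunc`/`NeuralODE` class names and the 98% sparsity level are illustrative choices, not prescriptions from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune
from torchdiffeq import odeint  # pip install torchdiffeq


class ODEFunc(nn.Module):
    """Vector field f(t, z) that parametrizes the dynamics dz/dt."""

    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden),
            nn.Tanh(),
            nn.Linear(hidden, dim),
        )

    def forward(self, t, z):
        return self.net(z)


class NeuralODE(nn.Module):
    """Maps z(0) to z(1) by integrating the vector field."""

    def __init__(self, dim: int):
        super().__init__()
        self.func = ODEFunc(dim)

    def forward(self, z0):
        t = torch.tensor([0.0, 1.0])
        return odeint(self.func, z0, t)[-1]  # state at t = 1


model = NeuralODE(dim=2)

# Unstructured magnitude pruning: zero out the smallest-magnitude weights of
# every linear layer in the vector field. The 98% level mirrors the parameter
# reduction reported in the abstract; in practice it would be tuned per task,
# typically with retraining between successive pruning steps.
for module in model.func.net:
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.98)
        prune.remove(module, "weight")  # bake the zeros into the weight tensor

z = model(torch.randn(8, 2))
print(z.shape)  # torch.Size([8, 2])
```

In the continuous normalizing flow setting, the same pruned vector field also drives the log-density of samples through the instantaneous change-of-variables formula of Chen et al. (2018), d log p(z(t))/dt = -Tr(∂f/∂z(t)), so sparsifying f simplifies both the sample dynamics and the density computation.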
Related papers
- Generalized Factor Neural Network Model for High-dimensional Regression [50.554377879576066]
We tackle the challenges of modeling high-dimensional data sets with latent low-dimensional structures hidden within complex, non-linear, and noisy relationships.
Our approach enables a seamless integration of concepts from non-parametric regression, factor models, and neural networks for high-dimensional regression.
arXiv Detail & Related papers (2025-02-16T23:13:55Z)
- Architecture-Aware Learning Curve Extrapolation via Graph Ordinary Differential Equation
We propose a novel architecture-aware neural differential equation model to forecast learning curves continuously.
Our model outperforms current state-of-the-art learning curve methods and extrapolation approaches for both pure time-series modeling and CNN-based learning curves.
arXiv Detail & Related papers (2024-12-20T04:28:02Z)
- Jet: A Modern Transformer-Based Normalizing Flow [62.2573739835562]
We revisit the design of the coupling-based normalizing flow models by carefully ablating prior design choices.
We achieve state-of-the-art quantitative and qualitative performance with a much simpler architecture.
arXiv Detail & Related papers (2024-12-19T18:09:42Z)
- Neural Residual Diffusion Models for Deep Scalable Vision Generation [17.931568104324985]
We propose a unified and massively scalable Neural Residual Diffusion Models framework (Neural-RDM).
The proposed neural residual models obtain state-of-the-art scores on image and video generative benchmarks.
arXiv Detail & Related papers (2024-06-19T04:57:18Z)
- On the Trade-off Between Efficiency and Precision of Neural Abstraction [62.046646433536104]
Neural abstractions have been recently introduced as formal approximations of complex, nonlinear dynamical models.
We employ formal inductive synthesis procedures to generate neural abstractions that result in dynamical models with these semantics.
arXiv Detail & Related papers (2023-07-28T13:22:32Z)
- Do We Need an Encoder-Decoder to Model Dynamical Systems on Networks? [18.92828441607381]
We show that embeddings induce a model that fits observations well but simultaneously has incorrect dynamical behaviours.
We propose a simple embedding-free alternative based on parametrising two additive vector-field components.
arXiv Detail & Related papers (2023-05-20T12:41:47Z)
- Standalone Neural ODEs with Sensitivity Analysis [5.565364597145569]
This paper presents a continuous-depth neural ODE model capable of describing a full deep neural network.
We present a general formulation of the neural sensitivity problem and show how it is used in NCG training.
Our evaluations demonstrate that our novel formulations lead to increased robustness and performance as compared to ResNet models.
arXiv Detail & Related papers (2022-05-27T12:16:53Z)
- EINNs: Epidemiologically-Informed Neural Networks [75.34199997857341]
We introduce EINNs, a new class of physics-informed neural networks crafted for epidemic forecasting.
We investigate how to leverage both the theoretical flexibility provided by mechanistic models and the data-driven expressiveness afforded by AI models.
arXiv Detail & Related papers (2022-02-21T18:59:03Z)
- Closed-form Continuous-Depth Models [99.40335716948101]
Continuous-depth neural models rely on advanced numerical differential equation solvers.
We present a new family of models, termed Closed-form Continuous-depth (CfC) networks, that are simple to describe and at least one order of magnitude faster.
arXiv Detail & Related papers (2021-06-25T22:08:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.