Improving Robustness and Uncertainty Modelling in Neural Ordinary
Differential Equations
- URL: http://arxiv.org/abs/2112.12707v1
- Date: Thu, 23 Dec 2021 16:56:10 GMT
- Title: Improving Robustness and Uncertainty Modelling in Neural Ordinary
Differential Equations
- Authors: Srinivas Anumasa, P.K. Srijith
- Abstract summary: We propose a novel approach to model uncertainty in NODE by considering a distribution over the end-time $T$ of the ODE solver.
We also propose adaptive latent time NODE (ALT-NODE), which allows each data point to have a distinct posterior distribution over end-times.
We demonstrate the effectiveness of the proposed approaches in modelling uncertainty and robustness through experiments on synthetic data and several real-world image classification datasets.
- Score: 0.2538209532048866
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural ordinary differential equations (NODE) have been proposed as a
continuous depth generalization to popular deep learning models such as
Residual networks (ResNets). They provide parameter efficiency and automate the
model selection process in deep learning models to some extent. However, they
lack the much-required uncertainty modelling and robustness capabilities which
are crucial for their use in several real-world applications such as autonomous
driving and healthcare. We propose a novel approach to model uncertainty in
NODE by considering a distribution over the end-time $T$ of the ODE solver.
The proposed approach, latent time NODE (LT-NODE), treats $T$ as a latent
variable and applies Bayesian learning to obtain a posterior distribution over
$T$ from the data. In particular, we use variational inference to learn an
approximate posterior and the model parameters. Prediction is done by
considering the NODE representations from different samples of the posterior
and can be done efficiently using a single forward pass. As $T$ implicitly
defines the depth of a NODE, the posterior distribution over $T$ also helps
with model selection in NODE. We also propose adaptive latent time NODE
(ALT-NODE), which allows each data point to have a distinct posterior
distribution over end-times. ALT-NODE uses amortized variational inference to
learn an approximate posterior using inference networks. We demonstrate the
effectiveness of the proposed approaches in modelling uncertainty and
robustness through experiments on synthetic data and several real-world image
classification datasets.
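Since the abstract describes the prediction step only at a high level, the following is a minimal, hypothetical sketch of LT-NODE-style prediction in Python, assuming the torchdiffeq solver; the `ODEFunc` dynamics, the linear classifier head, and the way samples of $T$ are drawn are illustrative stand-ins, not the authors' implementation. The point it illustrates: because one ODE solve up to the largest sampled end-time passes through every smaller sampled time, all posterior samples of $T$ are evaluated in a single forward pass.

```python
# Hypothetical sketch of LT-NODE-style prediction (assumes torchdiffeq is
# installed; ODEFunc, the classifier head, and the q(T) sampler are
# illustrative stand-ins, not the authors' code).
import torch
import torch.nn as nn
from torchdiffeq import odeint


class ODEFunc(nn.Module):
    """NODE dynamics dz/dt = f(z): a small MLP."""

    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(),
                                 nn.Linear(dim, dim))

    def forward(self, t, z):
        return self.net(z)


def lt_node_predict(func, head, z0, t_samples):
    """Average class probabilities over posterior samples of the end-time T.

    One solve up to the largest sampled T visits every smaller sampled
    time, so a single forward pass covers all Monte Carlo samples.
    """
    t_sorted, _ = torch.sort(t_samples)
    t_grid = torch.cat([torch.zeros(1), t_sorted])  # solver starts at t = 0
    states = odeint(func, z0, t_grid)[1:]           # state at each sampled T
    probs = torch.softmax(head(states), dim=-1)     # classify each state
    return probs.mean(dim=0)                        # average over T samples


dim, num_classes = 8, 10
func, head = ODEFunc(dim), nn.Linear(dim, num_classes)
z0 = torch.randn(4, dim)          # features for a batch of 4 inputs
t_samples = 0.5 + torch.rand(16)  # stand-in for draws from q(T)
print(lt_node_predict(func, head, z0, t_samples).shape)  # torch.Size([4, 10])
```

The spread of the per-sample probabilities before averaging is what carries the uncertainty estimate; under ALT-NODE, an inference network would produce per-input parameters for the posterior over $T$ in place of the shared sampler used here.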
Related papers
- Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z)
- Diffusion models for probabilistic programming [56.47577824219207]
Diffusion Model Variational Inference (DMVI) is a novel method for automated approximate inference in probabilistic programming languages (PPLs).
DMVI is easy to implement, allows hassle-free inference in PPLs without the drawbacks of, e.g., variational inference using normalizing flows, and does not impose any constraints on the underlying neural network model.
arXiv Detail & Related papers (2023-11-01T12:17:05Z)
- Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative Models [49.81937966106691]
We develop a suite of non-asymptotic theory towards understanding the data generation process of diffusion models.
In contrast to prior works, our theory is developed based on an elementary yet versatile non-asymptotic approach.
arXiv Detail & Related papers (2023-06-15T16:30:08Z)
- Latent Time Neural Ordinary Differential Equations [0.2538209532048866]
We propose a novel approach to model uncertainty in NODE by considering a distribution over the end-time $T$ of the ODE solver.
We also propose adaptive latent time NODE (ALT-NODE), which allows each data point to have a distinct posterior distribution over end-times.
We demonstrate the effectiveness of the proposed approaches in modelling uncertainty and robustness through experiments on synthetic data and several real-world image classification datasets.
arXiv Detail & Related papers (2021-12-23T17:31:47Z)
- Inverting brain grey matter models with likelihood-free inference: a tool for trustable cytoarchitecture measurements [62.997667081978825]
Characterisation of the brain grey matter cytoarchitecture with quantitative sensitivity to soma density and volume remains an unsolved challenge in dMRI.
We propose a new forward model, specifically a new system of equations, requiring a few relatively sparse b-shells.
We then apply modern tools from Bayesian analysis known as likelihood-free inference (LFI) to invert our proposed model.
arXiv Detail & Related papers (2021-11-15T09:08:27Z)
- Parsimony-Enhanced Sparse Bayesian Learning for Robust Discovery of Partial Differential Equations [5.584060970507507]
A Parsimony Enhanced Sparse Bayesian Learning (PeSBL) method is developed for discovering the governing Partial Differential Equations (PDEs) of nonlinear dynamical systems.
Results of numerical case studies indicate that the governing PDEs of many canonical dynamical systems can be correctly identified using the proposed PeSBL method.
arXiv Detail & Related papers (2021-07-08T00:56:11Z)
- Fully differentiable model discovery [0.0]
We propose an approach that combines neural network based surrogates with Sparse Bayesian Learning.
Our work expands PINNs to various types of neural network architectures, and connects neural network-based surrogates to the rich field of Bayesian parameter inference.
arXiv Detail & Related papers (2021-06-09T08:11:23Z)
- Score-Based Generative Modeling through Stochastic Differential Equations [114.39209003111723]
We present a stochastic differential equation (SDE) that transforms a complex data distribution into a known prior distribution by injecting noise.
A corresponding reverse-time SDE transforms the prior distribution back into the data distribution by slowly removing the noise (both SDEs are written out after this list).
By leveraging advances in score-based generative modeling, we can accurately estimate the scores (gradients of the log data density) with neural networks.
We demonstrate high-fidelity generation of 1024 x 1024 images for the first time from a score-based generative model.
arXiv Detail & Related papers (2020-11-26T19:39:10Z)
- Sparsely constrained neural networks for model discovery of PDEs [0.0]
We present a modular framework that determines the sparsity pattern of a deep-learning based surrogate using any sparse regression technique.
We show how a different network architecture and sparsity estimator improve model discovery accuracy and convergence on several benchmark examples.
arXiv Detail & Related papers (2020-11-09T11:02:40Z)
- Neural Jump Ordinary Differential Equations: Consistent Continuous-Time Prediction and Filtering [6.445605125467574]
We introduce the Neural Jump ODE (NJ-ODE), which provides a data-driven approach to learn, continuously in time, the conditional expectation of a stochastic process.
We show that our model converges to the $L^2$-optimal online prediction.
We experimentally show that our model outperforms the baselines in more complex learning tasks.
arXiv Detail & Related papers (2020-06-08T16:34:51Z)
- Decision-Making with Auto-Encoding Variational Bayes [71.44735417472043]
We show that a posterior approximation distinct from the variational distribution should be used for making decisions.
Motivated by these theoretical results, we propose learning several approximate proposals for the best model.
In addition to toy examples, we present a full-fledged case study of single-cell RNA sequencing.
arXiv Detail & Related papers (2020-02-17T19:23:36Z)
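As a reference for the score-based generative modeling entry above, the forward/reverse SDE pair it summarizes can be written out explicitly. This is the standard formulation associated with that line of work, restated here rather than quoted from the paper:

```latex
% Forward (noising) SDE: data -> prior; f is the drift, g the diffusion,
% and w is a standard Wiener process.
\mathrm{d}x = f(x, t)\,\mathrm{d}t + g(t)\,\mathrm{d}w
% Reverse-time SDE: prior -> data; \bar{w} is a reverse-time Wiener
% process, and the score \nabla_x \log p_t(x) is the quantity the
% neural network is trained to estimate.
\mathrm{d}x = \left[ f(x, t) - g(t)^2\,\nabla_x \log p_t(x) \right]\mathrm{d}t
              + g(t)\,\mathrm{d}\bar{w}
```

Sampling then amounts to drawing from the prior and integrating the reverse-time SDE with the learned score substituted for $\nabla_x \log p_t(x)$.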
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.