Machine Learning from a Continuous Viewpoint
- URL: http://arxiv.org/abs/1912.12777v2
- Date: Sat, 26 Sep 2020 04:15:09 GMT
- Title: Machine Learning from a Continuous Viewpoint
- Authors: Weinan E, Chao Ma, Lei Wu
- Abstract summary: We present a continuous formulation of machine learning, as a problem in the calculus of variations and differential-integral equations.
We discuss how the issues of generalization error and implicit regularization can be studied under this framework.
- Score: 12.865834066050427
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a continuous formulation of machine learning, as a problem in the calculus of variations and differential-integral equations, in the spirit of classical numerical analysis. We demonstrate that conventional machine learning models and algorithms, such as the random feature model, the two-layer neural network model and the residual neural network model, can all be recovered (in a scaled form) as particular discretizations of different continuous formulations. We also present examples of new models, such as the flow-based random feature model, and new algorithms, such as the smoothed particle method and spectral method, that arise naturally from this continuous formulation. We discuss how the issues of generalization error and implicit regularization can be studied under this framework.
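To make the discretization claim concrete: a two-layer network in its scaled form is exactly a Monte Carlo discretization of a continuous integral model f(x) = E_{(a,w)~rho}[a * sigma(w . x)]. The sketch below is a minimal illustration of this correspondence (our example; the ReLU activation and Gaussian base distribution are assumptions, not choices fixed by the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def two_layer_net(x, a, W):
    """Scaled two-layer network f(x) = (1/m) * sum_j a_j * relu(w_j . x).

    This is the Monte Carlo discretization of the continuous model
    f(x) = E_{(a, w) ~ rho}[a * relu(w . x)]; the 1/m factor is the
    "scaled form" the abstract refers to.
    """
    m = len(a)
    return (a @ np.maximum(W @ x, 0.0)) / m

# Sample m "particles" (a_j, w_j) from a base distribution rho; each
# particle plays the role of one quadrature point of the integral.
d, m = 5, 1000
a = rng.normal(size=m)
W = rng.normal(size=(m, d))

x = rng.normal(size=d)
print(two_layer_net(x, a, W))
```

Refining the discretization here means drawing more particles m, just as refining a quadrature rule means adding nodes.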
Related papers
- Comprehensive Review of Neural Differential Equations for Time Series Analysis [2.9687381456164004]
This paper presents a comprehensive review of NDE-based methods for time series analysis.
NDEs represent a paradigm shift by combining the flexibility of neural networks with the mathematical rigor of differential equations.
We provide a detailed discussion of their mathematical formulations, numerical methods, and applications, highlighting their ability to model continuous-time dynamics.
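As a concrete instance of continuous-time dynamics modeled by an NDE, here is a minimal, self-contained sketch of a neural ODE integrated with fixed-step forward Euler (our illustration; the tanh vector field, step count, and dimensions are assumptions, not details taken from the review):

```python
import numpy as np

rng = np.random.default_rng(1)

def vector_field(h, t, params):
    """Illustrative vector field f_theta(h, t): a small tanh MLP."""
    W1, b1, W2, b2 = params
    z = np.concatenate([h, [t]])          # condition on time
    return W2 @ np.tanh(W1 @ z + b1) + b2

def odeint_euler(h0, params, t0=0.0, t1=1.0, steps=100):
    """Integrate dh/dt = f_theta(h, t) with fixed-step forward Euler."""
    h, dt = h0.copy(), (t1 - t0) / steps
    for k in range(steps):
        h += dt * vector_field(h, t0 + k * dt, params)
    return h

dim, hidden = 4, 16
params = (rng.normal(size=(hidden, dim + 1)) * 0.1,
          np.zeros(hidden),
          rng.normal(size=(dim, hidden)) * 0.1,
          np.zeros(dim))
print(odeint_euler(rng.normal(size=dim), params))
```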
arXiv Detail & Related papers (2025-02-14T03:21:04Z)
- Generative Modeling of Neural Dynamics via Latent Stochastic Differential Equations [1.5467259918426441]
We propose a framework for developing computational models of biological neural systems.
We employ a system of coupled differential equations with differentiable drift and diffusion functions.
We show that these hybrid models achieve competitive performance in predicting stimulus-evoked neural and behavioral responses.
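A hedged sketch of the core ingredient, simulating dx = f(x) dt + g(x) dW by Euler-Maruyama; the linear drift and constant diffusion below are simple stand-ins for the paper's differentiable drift and diffusion functions:

```python
import numpy as np

rng = np.random.default_rng(2)

def drift(x, theta):
    """Stand-in for a differentiable drift function: linear dynamics."""
    return theta @ x

def diffusion(x, sigma):
    """Stand-in for a differentiable diffusion function: constant scale."""
    return sigma

def euler_maruyama(x0, theta, sigma, T=1.0, steps=1000):
    """Simulate dx = f(x) dt + g(x) dW on [0, T] with Euler-Maruyama."""
    x, dt = x0.copy(), T / steps
    for _ in range(steps):
        dW = rng.normal(scale=np.sqrt(dt), size=x.shape)
        x += drift(x, theta) * dt + diffusion(x, sigma) * dW
    return x

d = 3
theta = -0.5 * np.eye(d)          # mildly contracting drift
print(euler_maruyama(np.ones(d), theta, sigma=0.2))
```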
arXiv Detail & Related papers (2024-12-01T09:36:03Z)
- Scaling and renormalization in high-dimensional regression [72.59731158970894]
This paper presents a succinct derivation of the training and generalization performance of a variety of high-dimensional ridge regression models.
We provide an introduction and review of recent results on these topics, aimed at readers with backgrounds in physics and deep learning.
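For reference, the estimator such analyses study is the closed-form ridge solution w_hat = (X^T X + lambda*I)^{-1} X^T y. The sketch below computes it on synthetic high-dimensional data (standard material, not the paper's derivation; the problem sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)

def ridge(X, y, lam):
    """Closed-form ridge estimator: w = (X^T X + lam*I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Synthetic high-dimensional regression with more features than samples.
n, d, noise = 200, 300, 0.5
w_true = rng.normal(size=d) / np.sqrt(d)
X = rng.normal(size=(n, d))
y = X @ w_true + noise * rng.normal(size=n)

for lam in (1e-3, 1.0, 10.0):
    w = ridge(X, y, lam)
    X_test = rng.normal(size=(1000, d))
    # Excess error on fresh data plus the irreducible noise floor.
    test_err = np.mean((X_test @ (w - w_true)) ** 2) + noise**2
    print(f"lambda={lam:g}  expected test MSE ~ {test_err:.3f}")
```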
arXiv Detail & Related papers (2024-05-01T15:59:00Z)
- Generative Learning of Continuous Data by Tensor Networks [45.49160369119449]
We introduce a new family of tensor network generative models for continuous data.
We benchmark the performance of this model on several synthetic and real-world datasets.
Our results provide theoretical and empirical evidence for the efficacy of quantum-inspired methods in the rapidly growing field of generative learning.
arXiv Detail & Related papers (2023-10-31T14:37:37Z)
- Capturing dynamical correlations using implicit neural representations [85.66456606776552]
We develop an artificial intelligence framework which combines a neural network trained to mimic simulated data from a model Hamiltonian with automatic differentiation to recover unknown parameters from experimental data.
In doing so, we illustrate the ability to build and train a differentiable model only once, which then can be applied in real-time to multi-dimensional scattering data.
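A minimal sketch of the differentiate-through-a-surrogate idea (our toy: a damped cosine stands in for the trained network, and the parameter values are invented for illustration): unknown parameters are recovered from noisy data by gradient descent through a differentiable forward model.

```python
import torch

torch.manual_seed(0)

def forward_model(params, x):
    """Differentiable surrogate mapping parameters to a signal.

    A damped cosine stands in for the trained network that mimics
    simulated data in the paper's setting.
    """
    amp, freq, decay = params
    return amp * torch.cos(freq * x) * torch.exp(-decay * x)

# "Experimental" data generated from ground-truth parameters plus noise.
x = torch.linspace(0.0, 10.0, 200)
true = torch.tensor([1.5, 2.0, 0.3])
data = forward_model(true, x) + 0.02 * torch.randn_like(x)

# Recover the parameters by gradient descent through the model.
params = torch.tensor([1.0, 1.8, 0.1], requires_grad=True)
opt = torch.optim.Adam([params], lr=0.02)
for step in range(2000):
    opt.zero_grad()
    loss = torch.mean((forward_model(params, x) - data) ** 2)
    loss.backward()
    opt.step()
# With this initialization the estimate approaches [1.5, 2.0, 0.3].
print(params.detach())
```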
arXiv Detail & Related papers (2023-04-08T07:55:36Z)
- Generalized Neural Closure Models with Interpretability [28.269731698116257]
We develop a novel and versatile methodology of unified neural partial delay differential equations.
We augment existing/low-fidelity dynamical models directly in their partial differential equation (PDE) forms with both Markovian and non-Markovian neural network (NN) closure parameterizations.
We demonstrate the new generalized neural closure models (gnCMs) framework using four sets of experiments based on advecting nonlinear waves, shocks, and ocean acidification models.
arXiv Detail & Related papers (2023-01-15T21:57:43Z)
- Markov Chain Monte Carlo for Continuous-Time Switching Dynamical Systems [26.744964200606784]
We propose a novel inference algorithm utilizing a Markov Chain Monte Carlo approach.
The presented Gibbs sampler makes it possible to efficiently obtain samples from the exact continuous-time posterior processes.
arXiv Detail & Related papers (2022-05-18T09:03:00Z)
- EINNs: Epidemiologically-Informed Neural Networks [75.34199997857341]
We introduce EINNs, a new class of physics-informed neural networks crafted for epidemic forecasting.
We investigate how to leverage both the theoretical flexibility provided by mechanistic models and the data-driven expressivity afforded by AI models.
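As a hedged illustration of the general physics-informed recipe (ours; the actual EINN architecture and loss differ in detail), one can combine a data-fit term with the finite-difference residual of a mechanistic model such as SIR, dS/dt = -beta*S*I and dI/dt = beta*S*I - gamma*I:

```python
import numpy as np

def sir_residual(S, I, beta, gamma, dt):
    """Finite-difference residual of the SIR equations
    dS/dt = -beta*S*I,  dI/dt = beta*S*I - gamma*I."""
    dS = (S[1:] - S[:-1]) / dt
    dI = (I[1:] - I[:-1]) / dt
    rS = dS + beta * S[:-1] * I[:-1]
    rI = dI - beta * S[:-1] * I[:-1] + gamma * I[:-1]
    return np.mean(rS**2) + np.mean(rI**2)

def physics_informed_loss(S, I, I_obs, beta, gamma, dt, lam=1.0):
    """Data-fit on observed infections plus mechanistic SIR penalty."""
    data_loss = np.mean((I - I_obs) ** 2)
    return data_loss + lam * sir_residual(S, I, beta, gamma, dt)

# Toy check: an exact Euler rollout of SIR incurs zero residual
# (up to floating point), so the combined loss vanishes.
dt, T, beta, gamma = 0.1, 200, 0.3, 0.1
S, I = np.empty(T), np.empty(T)
S[0], I[0] = 0.99, 0.01
for t in range(T - 1):
    S[t + 1] = S[t] - dt * beta * S[t] * I[t]
    I[t + 1] = I[t] + dt * (beta * S[t] * I[t] - gamma * I[t])
print(physics_informed_loss(S, I, I_obs=I, beta=beta, gamma=gamma, dt=dt))
```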
arXiv Detail & Related papers (2022-02-21T18:59:03Z)
- Closed-form Continuous-Depth Models [99.40335716948101]
Continuous-depth neural models rely on advanced numerical differential equation solvers.
We present a new family of models, termed Closed-form Continuous-depth (CfC) networks, that are simple to describe and at least one order of magnitude faster than their solver-based counterparts.
arXiv Detail & Related papers (2021-06-25T22:08:51Z)
- Anomaly Detection of Time Series with Smoothness-Inducing Sequential Variational Auto-Encoder [59.69303945834122]
We present a Smoothness-Inducing Sequential Variational Auto-Encoder (SISVAE) model for robust estimation and anomaly detection of time series.
Our model parameterizes mean and variance for each time-stamp with flexible neural networks.
We show the effectiveness of our model on both synthetic datasets and public real-world benchmarks.
arXiv Detail & Related papers (2021-02-02T06:15:15Z)
- Multiplicative noise and heavy tails in stochastic optimization [62.993432503309485]
Stochastic optimization is central to modern machine learning, but the precise role of the stochasticity in its success is still unclear.
We show that heavy tails commonly arise in the parameters as a consequence of multiplicative noise due to variance in the updates.
A detailed analysis is conducted of key factors, including step size and data, with consistent results on state-of-the-art neural network models.
arXiv Detail & Related papers (2020-06-11T09:58:01Z)
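To see why multiplicative noise produces heavy tails, the toy simulation below (our illustration, not the paper's experiment) iterates the classic Kesten-type recursion x_{t+1} = a_t * x_t + b_t: with E[log a_t] < 0 the chain is stable, yet because a_t occasionally exceeds 1 the stationary distribution has a power-law tail even though a_t and b_t are light-tailed.

```python
import numpy as np

rng = np.random.default_rng(4)

def kesten_tail(n_chains=50_000, burn_in=1000):
    """Iterate x <- a*x + b with light-tailed multiplicative noise a.

    E[log a] < 0 guarantees a stationary distribution, but because
    a occasionally exceeds 1, that distribution has a power-law tail
    (Kesten's theorem) -- the mechanism behind heavy-tailed parameters.
    """
    x = np.zeros(n_chains)
    for _ in range(burn_in):
        a = np.exp(rng.normal(loc=-0.1, scale=0.5, size=n_chains))
        b = rng.normal(size=n_chains)
        x = a * x + b
    return x

x = kesten_tail()
for q in (0.9, 0.99, 0.999):
    print(f"{q:.3f} quantile of |x|: {np.quantile(np.abs(x), q):.1f}")
```

The upper quantiles grow far faster than any Gaussian would allow, which is the heavy-tailed behaviour the paper analyzes.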
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and accepts no responsibility for any consequences of its use.