The Mori-Zwanzig formulation of deep learning
- URL: http://arxiv.org/abs/2209.05544v4
- Date: Fri, 19 May 2023 18:49:12 GMT
- Title: The Mori-Zwanzig formulation of deep learning
- Authors: Daniele Venturi and Xiantao Li
- Abstract summary: We develop a new formulation of deep learning based on the Mori-Zwanzig formalism of irreversible statistical mechanics.
These equations can serve as a starting point for developing new, effective parameterizations of deep neural networks.
- Score: 3.2851683371946754
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We develop a new formulation of deep learning based on the Mori-Zwanzig (MZ)
formalism of irreversible statistical mechanics. The new formulation is built
upon the well-known duality between deep neural networks and discrete dynamical
systems, and it allows us to directly propagate quantities of interest
(conditional expectations and probability density functions) forward and
backward through the network by means of exact linear operator equations. Such
new equations can be used as a starting point to develop new effective
parameterizations of deep neural networks, and provide a new framework to study
deep learning via operator-theoretic methods. The proposed MZ formulation of
deep learning naturally introduces a new concept, i.e., the memory of the
neural network, which plays a fundamental role in low-dimensional modeling and
parameterization. By using the theory of contraction mappings, we develop
sufficient conditions for the memory of the neural network to decay with the
number of layers. This allows us to rigorously transform deep networks into
shallow ones, e.g., by reducing the number of neurons per layer (using
projection operators), or by reducing the total number of layers (using the
decay property of the memory operator).
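To make the duality concrete: reading a feed-forward network as a discrete dynamical system x_{n+1} = F_{n+1}(x_n), a quantity of interest u evaluated at the output can be propagated backward through the layers by composition, u_0 = u o F_L o ... o F_1, which is consistent with the abstract's mention of exact linear operator equations (composition operators acting on observables are linear); the MZ splitting of such operators into a Markovian part, a memory part, and orthogonal (noise) dynamics is what gives rise to the network memory discussed above. The sketch below is a minimal, self-contained illustration of the composition view only, not the paper's method; the layer widths, tanh activations, and all names are chosen here for exposition.

    import numpy as np
    from functools import reduce

    rng = np.random.default_rng(0)

    def make_layer(w, b):
        # One layer F(x) = tanh(W x + b), i.e. one step of a discrete dynamical system.
        return lambda x: np.tanh(w @ x + b)

    # A toy 4-layer network; weights, widths, and activation are illustrative choices.
    layers = [make_layer(rng.standard_normal((8, 8)) / np.sqrt(8),
                         0.1 * rng.standard_normal(8)) for _ in range(4)]

    def forward(x, layers):
        # Propagate the state forward through the network: x_L = F_L(...F_1(x_0)...).
        for F in layers:
            x = F(x)
        return x

    def pull_back(u, layers):
        # Propagate an observable backward by composition: u_0 = u o F_L o ... o F_1.
        compose = lambda f, g: (lambda x: f(g(x)))
        return reduce(compose, reversed(layers), u)

    u = lambda x: float(np.sum(x ** 2))   # a scalar quantity of interest at the output
    x0 = rng.standard_normal(8)           # network input

    print(u(forward(x0, layers)))         # observable evaluated on the propagated state
    print(pull_back(u, layers)(x0))       # pulled-back observable evaluated at the input

For deterministic layers the two printed values agree exactly; the MZ machinery (projection operators, memory terms, orthogonal dynamics) becomes relevant once the propagated quantities are conditional expectations or probability densities over a distribution of inputs, as in the paper.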
Related papers
- An Analysis Framework for Understanding Deep Neural Networks Based on Network Dynamics [11.44947569206928]
Deep neural networks (DNNs) maximize information extraction by rationally allocating the proportion of neurons in different modes across deep layers.
This framework provides a unified explanation for fundamental DNN behaviors such as the "flat minima effect," "grokking," and double descent phenomena.
arXiv Detail & Related papers (2025-01-05T04:23:21Z) - Deep-Unrolling Multidimensional Harmonic Retrieval Algorithms on Neuromorphic Hardware [78.17783007774295]
This paper explores the potential of conversion-based neuromorphic algorithms for highly accurate and energy-efficient single-snapshot multidimensional harmonic retrieval.
A novel method for converting the complex-valued convolutional layers and activations into spiking neural networks (SNNs) is developed.
The converted SNNs achieve almost five-fold power efficiency at moderate performance loss compared to the original CNNs.
arXiv Detail & Related papers (2024-12-05T09:41:33Z) - Contrastive Learning in Memristor-based Neuromorphic Systems [55.11642177631929]
Spiking neural networks have become an important family of neuron-based models that sidestep many of the key limitations facing modern-day backpropagation-trained deep networks.
In this work, we design and investigate a proof-of-concept instantiation of contrastive-signal-dependent plasticity (CSDP), a neuromorphic form of forward-forward-based, backpropagation-free learning.
arXiv Detail & Related papers (2024-09-17T04:48:45Z) - Growing Deep Neural Network Considering with Similarity between Neurons [4.32776344138537]
We explore a novel approach that progressively increases the number of neurons in compact models during training.
We propose a method that reduces feature extraction biases and neuronal redundancy by introducing constraints based on neuron similarity distributions.
Results on the CIFAR-10 and CIFAR-100 datasets demonstrate improved accuracy.
arXiv Detail & Related papers (2024-08-23T11:16:37Z) - Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption.
They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware.
A framework for regression using spiking neural networks is proposed.
arXiv Detail & Related papers (2022-10-06T13:04:45Z) - Dynamic Neural Diversification: Path to Computationally Sustainable Neural Networks [68.8204255655161]
Small neural networks with a constrained number of trainable parameters can be suitable resource-efficient candidates for many simple tasks.
We explore the diversity of the neurons within the hidden layer during the learning process.
We analyze how the diversity of the neurons affects predictions of the model.
arXiv Detail & Related papers (2021-09-20T15:12:16Z) - Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z) - Rectified Linear Postsynaptic Potential Function for Backpropagation in Deep Spiking Neural Networks [55.0627904986664]
Spiking Neural Networks (SNNs) use temporal spike patterns to represent and transmit information, which is not only biologically realistic but also suitable for ultra-low-power event-driven neuromorphic implementation.
This paper investigates the contribution of spike timing dynamics to information encoding, synaptic plasticity, and decision making, providing a new perspective on the design of future deep SNNs and neuromorphic hardware systems.
arXiv Detail & Related papers (2020-03-26T11:13:07Z) - Triple Memory Networks: a Brain-Inspired Method for Continual Learning [35.40452724755021]
A neural network adjusts its parameters when learning a new task, but then fails to perform the old tasks well.
The brain, in contrast, has a powerful ability to continually learn from new experiences without catastrophic interference.
Inspired by such brain strategy, we propose a novel approach named triple memory networks (TMNs) for continual learning.
arXiv Detail & Related papers (2020-03-06T11:35:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.