Entropic alternatives to initialization
- URL: http://arxiv.org/abs/2107.07757v1
- Date: Fri, 16 Jul 2021 08:17:32 GMT
- Title: Entropic alternatives to initialization
- Authors: Daniele Musso
- Abstract summary: We analyze anisotropic, local entropic smoothenings in the language of statistical physics and information theory.
We comment on some aspects related to the physics of renormalization and the spacetime structure of convolutional networks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Local entropic loss functions provide a versatile framework to define
architecture-aware regularization procedures. Besides the possibility of being
anisotropic in the synaptic space, the local entropic smoothening of the loss
function can vary during training, thus yielding a tunable model complexity. A
scoping protocol, where the regularization is strong in the early stages of
training and then fades progressively away, constitutes an alternative to
standard initialization procedures for deep convolutional neural networks;
nonetheless, it has wider applicability. We analyze anisotropic, local entropic
smoothenings in the language of statistical physics and information theory,
providing insight into both their interpretation and workings. We comment on
some aspects related to the physics of renormalization and the spacetime
structure of convolutional networks.
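
A minimal PyTorch sketch of the idea follows: the loss is smoothed by Monte Carlo averaging over Gaussian weight perturbations whose per-parameter scales make it anisotropic in synaptic space, and a scoping schedule fades the smoothing away during training. The function names and the linear decay are illustrative assumptions, not the paper's exact construction.

```python
import torch
from torch.func import functional_call

def local_entropic_loss(model, loss_fn, inputs, targets, sigma, n_samples=4):
    """Monte Carlo estimate of a locally smoothed loss: average the task loss
    over Gaussian perturbations of the weights. `sigma` maps each parameter
    name to its own perturbation scale, so the smoothing can be anisotropic
    in weight space. Illustrative sketch, not the paper's exact estimator."""
    params = dict(model.named_parameters())
    total = 0.0
    for _ in range(n_samples):
        perturbed = {name: p + torch.randn_like(p) * sigma[name]
                     for name, p in params.items()}
        out = functional_call(model, perturbed, (inputs,))
        total = total + loss_fn(out, targets)
    return total / n_samples

def scoping_sigma(step, total_steps, sigma0):
    """Scoping protocol (assumed linear decay): smoothing is strong early in
    training and fades progressively away, replacing a careful initialization."""
    decay = max(0.0, 1.0 - step / total_steps)
    return {name: s * decay for name, s in sigma0.items()}
```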
Related papers
- Relative Representations: Topological and Geometric Perspectives [53.88896255693922]
Relative representations are an established approach to zero-shot model stitching.
We introduce a normalization procedure in the relative transformation, resulting in invariance to non-isotropic rescalings and permutations.
We also propose to deploy topological densification, a topological regularization loss encouraging clustering within classes, when fine-tuning relative representations.
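
As a rough sketch of a relative transformation with a normalization step, embeddings can be re-expressed through cosine similarities to a set of anchors after per-dimension standardization; the standardization below is an assumed stand-in for the paper's normalization procedure.

```python
import torch
import torch.nn.functional as F

def relative_representation(z, anchors, eps=1e-8):
    """Map embeddings z (batch, dim) to cosine similarities against anchor
    embeddings (n_anchors, dim). The per-dimension standardization is an
    assumed stand-in for the paper's normalization; it makes the result
    insensitive to non-isotropic rescalings of the embedding axes."""
    mu = anchors.mean(dim=0, keepdim=True)
    std = anchors.std(dim=0, keepdim=True)
    z_n = (z - mu) / (std + eps)
    a_n = (anchors - mu) / (std + eps)
    return F.normalize(z_n, dim=-1) @ F.normalize(a_n, dim=-1).T
```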
arXiv Detail & Related papers (2024-09-17T08:09:22Z)
- Neural Incremental Data Assimilation [8.817223931520381]
We introduce a deep learning approach where the physical system is modeled as a sequence of coarse-to-fine Gaussian prior distributions parametrized by a neural network.
This allows us to define an assimilation operator, which is trained in an end-to-end fashion to minimize the reconstruction error.
We illustrate our approach on chaotic dynamical physical systems with sparse observations, and compare it to traditional variational data assimilation methods.
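
One plausible reading of such an assimilation operator, sketched below under assumed architectural choices: a coarse state estimate is refined iteratively by a network conditioned on the misfit with sparsely observed entries, and the loop is trained end-to-end on reconstruction error.

```python
import torch
import torch.nn as nn

class IncrementalAssimilator(nn.Module):
    """Hypothetical coarse-to-fine assimilation operator: repeatedly refine a
    state estimate using the residual against sparsely observed entries."""
    def __init__(self, dim, hidden=128, n_steps=4):
        super().__init__()
        self.n_steps = n_steps
        self.refine = nn.Sequential(
            nn.Linear(2 * dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))

    def forward(self, x0, obs, mask):
        # x0: coarse prior mean; obs: observations (zeros where unobserved);
        # mask: 1 on observed entries, 0 elsewhere.
        x = x0
        for _ in range(self.n_steps):
            residual = mask * (obs - x)
            x = x + self.refine(torch.cat([x, residual], dim=-1))
        return x
```

Training end-to-end then amounts to minimizing the mean squared error between the assimilated state and the ground-truth state.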
arXiv Detail & Related papers (2024-06-21T11:42:55Z)
- Modeling Spatio-temporal Dynamical Systems with Neural Discrete Learning and Levels-of-Experts [33.335735613579914]
We address the issue of modeling and estimating changes in the state of temporal dynamical systems based on a sequence of observations such as video frames.
This paper proposes a universal expert module, namely an optical flow estimation component, to capture the laws of general physical processes in a data-driven fashion.
We conduct extensive experiments and ablations to demonstrate that the proposed framework achieves large performance margins compared with existing SOTA baselines.
arXiv Detail & Related papers (2024-02-06T06:27:07Z)
- Function-Space Regularization in Neural Networks: A Probabilistic Perspective [51.133793272222874]
We show that we can derive a well-motivated regularization technique that allows explicitly encoding information about desired predictive functions into neural network training.
We evaluate the utility of this regularization technique empirically and demonstrate that the proposed method leads to near-perfect semantic shift detection and highly calibrated predictive uncertainty estimates.
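
As a generic illustration of regularizing in function space rather than weight space, one can penalize the network's deviation from a desired reference function on context inputs; the penalty below is a simplified stand-in for the paper's probabilistic formulation.

```python
import torch

def function_space_penalty(model, reference_fn, x_context, weight=1.0):
    """Generic function-space regularizer: keep the network's predictions
    close to a desired reference function on context points. A simplified
    stand-in for the paper's probabilistic treatment."""
    return weight * (model(x_context) - reference_fn(x_context)).pow(2).mean()
```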
arXiv Detail & Related papers (2023-12-28T17:50:56Z)
- On the Dynamics Under the Unhinged Loss and Beyond [104.49565602940699]
We introduce the unhinged loss, a concise loss function that offers more mathematical opportunities to analyze closed-form dynamics.
The unhinged loss allows for considering more practical techniques, such as time-varying learning rates and feature normalization.
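
In its classical binary form, the unhinged loss is simply linear in the margin, which is what makes closed-form dynamics tractable; a one-line sketch (the paper's exact, possibly multiclass, variant may differ):

```python
import torch

def unhinged_loss(scores, labels):
    """Classical binary unhinged loss l(y, v) = 1 - y*v with labels in
    {-1, +1}. Linearity in the score yields tractable closed-form dynamics."""
    return (1.0 - labels * scores).mean()
```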
arXiv Detail & Related papers (2023-12-13T02:11:07Z)
- Machine learning in and out of equilibrium [58.88325379746631]
Our study uses a Fokker-Planck approach, adapted from statistical physics, to explore these parallels.
We focus in particular on the stationary state of the system in the long-time limit, which in conventional SGD is out of equilibrium.
We propose a new variation of stochastic gradient Langevin dynamics (SGLD) that harnesses without-replacement minibatching.
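
For reference, a minimal sketch of a standard SGLD update, with noise scaled so the long-time stationary distribution approximates a Gibbs measure; the proposed variation changes only how minibatches are drawn (a shuffled pass over the data, without replacement), not the update itself.

```python
import torch

def sgld_step(params, lr, temperature=1.0):
    """One stochastic gradient Langevin dynamics step: gradient descent plus
    Gaussian noise with variance 2*lr*temperature, targeting a stationary
    distribution ~ exp(-loss / temperature). Call after loss.backward()."""
    with torch.no_grad():
        for p in params:
            if p.grad is None:
                continue
            noise = torch.randn_like(p) * (2.0 * lr * temperature) ** 0.5
            p.add_(-lr * p.grad + noise)
```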
arXiv Detail & Related papers (2023-06-06T09:12:49Z)
- Physics informed neural networks for continuum micromechanics [68.8204255655161]
Recently, physics-informed neural networks have been successfully applied to a broad variety of problems in applied mathematics and engineering.
Due to their global approximation, physics-informed neural networks have difficulty resolving localized effects and strongly non-linear solutions by optimization.
It is shown that the domain decomposition approach is able to accurately resolve nonlinear stress, displacement, and energy fields in heterogeneous microstructures obtained from real-world $\mu$CT scans.
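
For background, a generic PINN loss for a toy 1D Poisson problem u''(x) = f(x) is sketched below, combining an autograd PDE residual with a boundary term; this illustrates the basic PINN idea only, not the paper's domain-decomposed micromechanics formulation.

```python
import torch

def pinn_loss(u_net, x_interior, x_boundary, u_boundary, f):
    """Generic PINN loss for u''(x) = f(x): squared PDE residual on interior
    collocation points plus squared boundary mismatch. Toy sketch, not the
    paper's formulation."""
    x = x_interior.clone().requires_grad_(True)
    u = u_net(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
    pde_residual = (d2u - f(x)).pow(2).mean()
    bc_mismatch = (u_net(x_boundary) - u_boundary).pow(2).mean()
    return pde_residual + bc_mismatch
```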
arXiv Detail & Related papers (2021-10-14T14:05:19Z)
- Characterizing possible failure modes in physics-informed neural networks [55.83255669840384]
Recent work in scientific machine learning has developed so-called physics-informed neural network (PINN) models.
We demonstrate that, while existing PINN methodologies can learn good models for relatively trivial problems, they can easily fail to learn relevant physical phenomena even for simple PDEs.
We show that these possible failure modes are not due to the lack of expressivity in the NN architecture, but that the PINN's setup makes the loss landscape very hard to optimize.
arXiv Detail & Related papers (2021-09-02T16:06:45Z)
- Partial local entropy and anisotropy in deep weight spaces [0.0]
We refine a recently-proposed class of local entropic loss functions by restricting the smoothening regularization to only a subset of weights.
The new loss functions are referred to as partial local entropies. They can adapt to the weight-space anisotropy, thus outperforming their isotropic counterparts.
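
Plugged into the local_entropic_loss sketch given after the main abstract, a partial local entropy amounts to zeroing the perturbation scale outside the chosen subset of weights; the name-based partition below is an illustrative assumption.

```python
def partial_sigma(model, base_scale=0.05, key="conv"):
    """Anisotropic scales for a partial local entropy: perturb only the
    parameters whose names contain `key` (an illustrative choice, e.g. the
    convolutional layers), leaving all other weights unsmoothed."""
    return {name: base_scale if key in name else 0.0
            for name, _ in model.named_parameters()}
```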
arXiv Detail & Related papers (2020-07-17T16:16:18Z)
- Phase space learning with neural networks [0.0]
This work proposes an autoencoder neural network as a non-linear generalization of projection-based methods for solving Partial Differential Equations (PDEs).
The proposed deep learning architecture is capable of generating the dynamics of PDEs by integrating them completely in a very reduced latent space without intermediate reconstructions, and then decoding the latent solution back to the original space.
Properly regularized neural networks are shown to reliably learn the global characteristics of a dynamical system's phase space from the sample data of a single path, as well as to predict unseen bifurcations.
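
A skeletal version of the approach, with layer sizes and an explicit Euler integrator as illustrative assumptions: encode the state into a reduced latent space, integrate a learned vector field there, and decode only at the end.

```python
import torch
import torch.nn as nn

class LatentDynamicsAE(nn.Module):
    """Sketch of phase-space learning: autoencode states and integrate a
    learned latent vector field, decoding back only after time stepping."""
    def __init__(self, dim, latent=8, hidden=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(),
                                 nn.Linear(hidden, latent))
        self.dec = nn.Sequential(nn.Linear(latent, hidden), nn.Tanh(),
                                 nn.Linear(hidden, dim))
        self.field = nn.Sequential(nn.Linear(latent, hidden), nn.Tanh(),
                                   nn.Linear(hidden, latent))

    def forward(self, x0, n_steps, dt=0.01):
        z = self.enc(x0)
        for _ in range(n_steps):        # explicit Euler in latent space
            z = z + dt * self.field(z)  # (illustrative integrator choice)
        return self.dec(z)
```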
arXiv Detail & Related papers (2020-06-22T20:28:07Z)
- A nonlocal physics-informed deep learning framework using the peridynamic differential operator [0.0]
We develop a nonlocal PINN approach using the Peridynamic Differential Operator (PDDO), a numerical method which incorporates long-range interactions and removes spatial derivatives in the governing equations.
Because the PDDO functions can be readily incorporated in the neural network architecture, the nonlocality does not degrade the performance of modern deep-learning algorithms.
We document the superior behavior of nonlocal PINN with respect to local PINN in both solution accuracy and parameter inference.
arXiv Detail & Related papers (2020-05-31T06:26:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.