TASI Lectures on Physics for Machine Learning
- URL: http://arxiv.org/abs/2408.00082v1
- Date: Wed, 31 Jul 2024 18:00:22 GMT
- Title: TASI Lectures on Physics for Machine Learning
- Authors: Jim Halverson,
- Abstract summary: Notes are based on lectures I gave at TASI 2024 on Physics for Machine Learning.
The focus is on neural network theory, organized according to network expressivity, statistics, and dynamics.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: These notes are based on lectures I gave at TASI 2024 on Physics for Machine Learning. The focus is on neural network theory, organized according to network expressivity, statistics, and dynamics. I present classic results such as the universal approximation theorem and neural network / Gaussian process correspondence, and also more recent results such as the neural tangent kernel, feature learning with the maximal update parameterization, and Kolmogorov-Arnold networks. The exposition on neural network theory emphasizes a field theoretic perspective familiar to theoretical physicists. I elaborate on connections between the two, including a neural network approach to field theory.
Related papers
- Collective variables of neural networks: empirical time evolution and scaling laws [0.535514140374842]
We show that certain measures on the spectrum of the empirical neural tangent kernel, specifically entropy and trace, yield insight into the representations learned by a neural network.
Results are demonstrated first on test cases before being shown on more complex networks, including transformers, auto-encoders, graph neural networks, and reinforcement learning studies.
arXiv Detail & Related papers (2024-10-09T21:37:14Z) - Novel Kernel Models and Exact Representor Theory for Neural Networks Beyond the Over-Parameterized Regime [52.00917519626559]
This paper presents two models of neural-networks and their training applicable to neural networks of arbitrary width, depth and topology.
We also present an exact novel representor theory for layer-wise neural network training with unregularized gradient descent in terms of a local-extrinsic neural kernel (LeNK)
This representor theory gives insight into the role of higher-order statistics in neural network training and the effect of kernel evolution in neural-network kernel models.
arXiv Detail & Related papers (2024-05-24T06:30:36Z) - Conditional computation in neural networks: principles and research trends [48.14569369912931]
This article summarizes principles and ideas from the emerging area of applying textitconditional computation methods to the design of neural networks.
In particular, we focus on neural networks that can dynamically activate or de-activate parts of their computational graph conditionally on their input.
arXiv Detail & Related papers (2024-03-12T11:56:38Z) - Connecting NTK and NNGP: A Unified Theoretical Framework for Neural
Network Learning Dynamics in the Kernel Regime [7.136205674624813]
We provide a comprehensive framework for understanding the learning process of deep neural networks in the infinite width limit.
We identify two learning phases characterized by different time scales: gradient-driven and diffusive learning.
arXiv Detail & Related papers (2023-09-08T18:00:01Z) - The semantic landscape paradigm for neural networks [0.0]
We introduce the semantic landscape paradigm, a conceptual and mathematical framework that describes the training dynamics of neural networks.
Specifically, we show that grokking and emergence with scale are associated with percolation phenomena, and neural scaling laws are explainable in terms of the statistics of random walks on graphs.
arXiv Detail & Related papers (2023-07-18T18:48:54Z) - Gradient Descent in Neural Networks as Sequential Learning in RKBS [63.011641517977644]
We construct an exact power-series representation of the neural network in a finite neighborhood of the initial weights.
We prove that, regardless of width, the training sequence produced by gradient descent can be exactly replicated by regularized sequential learning.
arXiv Detail & Related papers (2023-02-01T03:18:07Z) - Extrapolation and Spectral Bias of Neural Nets with Hadamard Product: a
Polynomial Net Study [55.12108376616355]
The study on NTK has been devoted to typical neural network architectures, but is incomplete for neural networks with Hadamard products (NNs-Hp)
In this work, we derive the finite-width-K formulation for a special class of NNs-Hp, i.e., neural networks.
We prove their equivalence to the kernel regression predictor with the associated NTK, which expands the application scope of NTK.
arXiv Detail & Related papers (2022-09-16T06:36:06Z) - What can linearized neural networks actually say about generalization? [67.83999394554621]
In certain infinitely-wide neural networks, the neural tangent kernel (NTK) theory fully characterizes generalization.
We show that the linear approximations can indeed rank the learning complexity of certain tasks for neural networks.
Our work provides concrete examples of novel deep learning phenomena which can inspire future theoretical research.
arXiv Detail & Related papers (2021-06-12T13:05:11Z) - Learning Contact Dynamics using Physically Structured Neural Networks [81.73947303886753]
We use connections between deep neural networks and differential equations to design a family of deep network architectures for representing contact dynamics between objects.
We show that these networks can learn discontinuous contact events in a data-efficient manner from noisy observations.
Our results indicate that an idealised form of touch feedback is a key component of making this learning problem tractable.
arXiv Detail & Related papers (2021-02-22T17:33:51Z) - Mathematical Models of Overparameterized Neural Networks [25.329225766892126]
We will focus on the analysis of two-layer neural networks, and explain the key mathematical models.
We will then discuss challenges in understanding deep neural networks and some current research directions.
arXiv Detail & Related papers (2020-12-27T17:48:31Z) - Mastering high-dimensional dynamics with Hamiltonian neural networks [0.0]
A map building perspective elucidates the superiority of Hamiltonian neural networks over conventional neural networks.
The results clarify the critical relation between data, dimension, and neural network learning performance.
arXiv Detail & Related papers (2020-07-28T21:14:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.