WeightScale: Interpreting Weight Change in Neural Networks
- URL: http://arxiv.org/abs/2107.07005v1
- Date: Wed, 7 Jul 2021 21:18:38 GMT
- Title: WeightScale: Interpreting Weight Change in Neural Networks
- Authors: Ayush Manish Agrawal, Atharva Tendle, Harshvardhan Sikka, Sahib Singh
- Abstract summary: We present an approach to interpret learning in neural networks by measuring relative weight change on a per layer basis.
We use this approach to investigate learning in the context of vision tasks across a variety of state-of-the-art networks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Interpreting the learning dynamics of neural networks can provide useful
insights into how networks learn and the development of better training and
design approaches. We present an approach to interpret learning in neural
networks by measuring relative weight change on a per layer basis and
dynamically aggregating emerging trends through combination of dimensionality
reduction and clustering which allows us to scale to very deep networks. We use
this approach to investigate learning in the context of vision tasks across a
variety of state-of-the-art networks and provide insights into the learning
behavior of these networks, including how task complexity affects layer-wise
learning in deeper layers of networks.
Related papers
- From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks [47.13391046553908]
In artificial networks, the effectiveness of these models relies on their ability to build task specific representation.
Prior studies highlight that different initializations can place networks in either a lazy regime, where representations remain static, or a rich/feature learning regime, where representations evolve dynamically.
These solutions capture the evolution of representations and the Neural Kernel across the spectrum from the rich to the lazy regimes.
arXiv Detail & Related papers (2024-09-22T23:19:04Z) - Dynamical stability and chaos in artificial neural network trajectories along training [3.379574469735166]
We study the dynamical properties of this process by analyzing through this lens the network trajectories of a shallow neural network.
We find hints of regular and chaotic behavior depending on the learning rate regime.
This work also contributes to the cross-fertilization of ideas between dynamical systems theory, network theory and machine learning.
arXiv Detail & Related papers (2024-04-08T17:33:11Z) - Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural
Networks [49.808194368781095]
We show that three-layer neural networks have provably richer feature learning capabilities than two-layer networks.
This work makes progress towards understanding the provable benefit of three-layer neural networks over two-layer networks in the feature learning regime.
arXiv Detail & Related papers (2023-05-11T17:19:30Z) - Reparameterization through Spatial Gradient Scaling [69.27487006953852]
Reparameterization aims to improve the generalization of deep neural networks by transforming convolutional layers into equivalent multi-branched structures during training.
We present a novel spatial gradient scaling method to redistribute learning focus among weights in convolutional networks.
arXiv Detail & Related papers (2023-03-05T17:57:33Z) - Wide Neural Networks Forget Less Catastrophically [39.907197907411266]
We study the impact of "width" of the neural network architecture on catastrophic forgetting.
We study the learning dynamics of the network from various perspectives.
arXiv Detail & Related papers (2021-10-21T23:49:23Z) - Characterizing Learning Dynamics of Deep Neural Networks via Complex
Networks [1.0869257688521987]
Complex Network Theory (CNT) represents Deep Neural Networks (DNNs) as directed weighted graphs to study them as dynamical systems.
We introduce metrics for nodes/neurons and layers, namely Nodes Strength and Layers Fluctuation.
Our framework distills trends in the learning dynamics and separates low from high accurate networks.
arXiv Detail & Related papers (2021-10-06T10:03:32Z) - Learning Contact Dynamics using Physically Structured Neural Networks [81.73947303886753]
We use connections between deep neural networks and differential equations to design a family of deep network architectures for representing contact dynamics between objects.
We show that these networks can learn discontinuous contact events in a data-efficient manner from noisy observations.
Our results indicate that an idealised form of touch feedback is a key component of making this learning problem tractable.
arXiv Detail & Related papers (2021-02-22T17:33:51Z) - Learning low-rank latent mesoscale structures in networks [1.1470070927586016]
We present a new approach for describing low-rank mesoscale structures in networks.
We use several synthetic network models and empirical friendship, collaboration, and protein--protein interaction (PPI) networks.
We show how to denoise a corrupted network by using only the latent motifs that one learns directly from the corrupted network.
arXiv Detail & Related papers (2021-02-13T18:54:49Z) - Investigating Learning in Deep Neural Networks using Layer-Wise Weight
Change [0.7829352305480285]
We investigate learning in Deep Convolutional Neural Networks (CNNs) by measuring the relative weight change of layers while training.
Several interesting trends emerge in various CNN architectures across various computer vision classification tasks.
arXiv Detail & Related papers (2020-11-13T02:53:41Z) - Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective to represent a network into a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and owns adaptability to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z) - Neural networks adapting to datasets: learning network size and topology [77.34726150561087]
We introduce a flexible setup allowing for a neural network to learn both its size and topology during the course of a gradient-based training.
The resulting network has the structure of a graph tailored to the particular learning task and dataset.
arXiv Detail & Related papers (2020-06-22T12:46:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.