Related papers: WeightScale: Interpreting Weight Change in Neural Networks

URL: http://arxiv.org/abs/2107.07005v1
Date: Wed, 7 Jul 2021 21:18:38 GMT
Title: WeightScale: Interpreting Weight Change in Neural Networks
Authors: Ayush Manish Agrawal, Atharva Tendle, Harshvardhan Sikka, Sahib Singh
Abstract summary: We present an approach to interpret learning in neural networks by measuring relative weight change on a per-layer basis. We use this approach to investigate learning in the context of vision tasks across a variety of state-of-the-art networks.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Interpreting the learning dynamics of neural networks can provide useful insights into how networks learn and the development of better training and design approaches. We present an approach to interpret learning in neural networks by measuring relative weight change on a per-layer basis and dynamically aggregating emerging trends through a combination of dimensionality reduction and clustering, which allows us to scale to very deep networks. We use this approach to investigate learning in the context of vision tasks across a variety of state-of-the-art networks and provide insights into the learning behavior of these networks, including how task complexity affects layer-wise learning in deeper layers of networks.
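Illustrative sketch: the abstract does not state the exact formula for relative weight change, so the snippet below assumes one common definition, the L1 norm of the change in a layer's weights between two checkpoints divided by the L1 norm of the earlier weights. The function names, the `eps` guard, and the toy checkpoints are hypothetical, and the dimensionality reduction and clustering steps are not shown.

```python
def relative_weight_change(prev, curr, eps=1e-12):
    """Assumed definition: sum|w_t - w_{t-1}| / sum|w_{t-1}| for one layer."""
    diff = sum(abs(c - p) for p, c in zip(prev, curr))
    base = sum(abs(p) for p in prev)
    # eps (an assumption) avoids division by zero for all-zero layers.
    return diff / (base + eps)


def layerwise_change(prev_ckpt, curr_ckpt):
    """Map each layer name to its relative weight change between checkpoints."""
    return {
        name: relative_weight_change(prev_ckpt[name], curr_ckpt[name])
        for name in prev_ckpt
    }


# Toy checkpoints: flattened weights per layer (hypothetical values).
prev_ckpt = {"conv1": [0.5, -0.5, 1.0], "fc": [2.0, -2.0]}
curr_ckpt = {"conv1": [0.5, -0.25, 1.0], "fc": [1.0, -1.0]}

changes = layerwise_change(prev_ckpt, curr_ckpt)
for name, value in changes.items():
    print(f"{name}: {value:.3f}")
```

In this toy case the `fc` layer changes more, relative to its size, than `conv1`. Tracking such values per epoch gives one curve per layer, which is the kind of signal the abstract describes aggregating.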

Related papers

From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks [47.13391046553908]
In artificial networks, the effectiveness of these models relies on their ability to build task-specific representations. Prior studies highlight that different initializations can place networks in either a lazy regime, where representations remain static, or a rich/feature learning regime, where representations evolve dynamically. These solutions capture the evolution of representations and the Neural Kernel across the spectrum from the rich to the lazy regimes.
arXiv Detail & Related papers (2024-09-22T23:19:04Z)
Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks. We show that the networks acquire strong, data-dependent features. Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z)
Dynamical stability and chaos in artificial neural network trajectories along training [3.379574469735166]
We study the dynamical properties of this process by analyzing through this lens the network trajectories of a shallow neural network. We find hints of regular and chaotic behavior depending on the learning rate regime. This work also contributes to the cross-fertilization of ideas between dynamical systems theory, network theory and machine learning.
arXiv Detail & Related papers (2024-04-08T17:33:11Z)
Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks [49.808194368781095]
We show that three-layer neural networks have provably richer feature learning capabilities than two-layer networks. This work makes progress towards understanding the provable benefit of three-layer neural networks over two-layer networks in the feature learning regime.
arXiv Detail & Related papers (2023-05-11T17:19:30Z)
Reparameterization through Spatial Gradient Scaling [69.27487006953852]
Reparameterization aims to improve the generalization of deep neural networks by transforming convolutional layers into equivalent multi-branched structures during training. We present a novel spatial gradient scaling method to redistribute learning focus among weights in convolutional networks.
arXiv Detail & Related papers (2023-03-05T17:57:33Z)
Wide Neural Networks Forget Less Catastrophically [39.907197907411266]
We study the impact of "width" of the neural network architecture on catastrophic forgetting. We study the learning dynamics of the network from various perspectives.
arXiv Detail & Related papers (2021-10-21T23:49:23Z)
Characterizing Learning Dynamics of Deep Neural Networks via Complex Networks [1.0869257688521987]
Complex Network Theory (CNT) represents Deep Neural Networks (DNNs) as directed weighted graphs to study them as dynamical systems. We introduce metrics for nodes/neurons and layers, namely Nodes Strength and Layers Fluctuation. Our framework distills trends in the learning dynamics and separates low-accuracy from high-accuracy networks.
arXiv Detail & Related papers (2021-10-06T10:03:32Z)
Learning Contact Dynamics using Physically Structured Neural Networks [81.73947303886753]
We use connections between deep neural networks and differential equations to design a family of deep network architectures for representing contact dynamics between objects. We show that these networks can learn discontinuous contact events in a data-efficient manner from noisy observations. Our results indicate that an idealised form of touch feedback is a key component of making this learning problem tractable.
arXiv Detail & Related papers (2021-02-22T17:33:51Z)
Learning low-rank latent mesoscale structures in networks [1.1470070927586016]
We present a new approach for describing low-rank mesoscale structures in networks. We use several synthetic network models and empirical friendship, collaboration, and protein-protein interaction (PPI) networks. We show how to denoise a corrupted network by using only the latent motifs that one learns directly from the corrupted network.
arXiv Detail & Related papers (2021-02-13T18:54:49Z)
Investigating Learning in Deep Neural Networks using Layer-Wise Weight Change [0.7829352305480285]
We investigate learning in Deep Convolutional Neural Networks (CNNs) by measuring the relative weight change of layers while training. Several interesting trends emerge in various CNN architectures across various computer vision classification tasks.
arXiv Detail & Related papers (2020-11-13T02:53:41Z)
Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis. By assigning learnable parameters to the edges, which reflect the magnitude of connections, the learning process can be performed in a differentiable manner. This learning process is compatible with existing networks and adapts to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
Neural networks adapting to datasets: learning network size and topology [77.34726150561087]
We introduce a flexible setup that allows a neural network to learn both its size and topology during gradient-based training. The resulting network has the structure of a graph tailored to the particular learning task and dataset.
arXiv Detail & Related papers (2020-06-22T12:46:44Z)

This list is automatically generated from the titles and abstracts of the papers on this site.