On How Iterative Magnitude Pruning Discovers Local Receptive Fields in Fully Connected Neural Networks
- URL: http://arxiv.org/abs/2412.06545v1
- Date: Mon, 09 Dec 2024 14:56:23 GMT
- Title: On How Iterative Magnitude Pruning Discovers Local Receptive Fields in Fully Connected Neural Networks
- Authors: William T. Redman, Zhangyang Wang, Alessandro Ingrosso, Sebastian Goldt
- Abstract summary: Iterative magnitude pruning (IMP) has become a popular method for extracting sparse subnetworks that can be trained to high performance. Recent work has shown that applying IMP to fully connected neural networks (FCNs) leads to the emergence of local receptive fields (RFs). We propose that IMP iteratively maximizes the non-Gaussian statistics present in the representations of FCNs, creating a feedback loop that enhances localization.
- Score: 92.66231524298554
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Since its use in the Lottery Ticket Hypothesis, iterative magnitude pruning (IMP) has become a popular method for extracting sparse subnetworks that can be trained to high performance. Despite this, the underlying nature of IMP's general success remains unclear. One possibility is that IMP is especially capable of extracting and maintaining strong inductive biases. In support of this, recent work has shown that applying IMP to fully connected neural networks (FCNs) leads to the emergence of local receptive fields (RFs), an architectural feature present in mammalian visual cortex and convolutional neural networks. The question of how IMP is able to do this remains unanswered. Inspired by results showing that training FCNs on synthetic images with highly non-Gaussian statistics (e.g., sharp edges) is sufficient to drive the formation of local RFs, we hypothesize that IMP iteratively maximizes the non-Gaussian statistics present in the representations of FCNs, creating a feedback loop that enhances localization. We develop a new method for measuring the effect of individual weights on the statistics of the FCN representations ("cavity method"), which allows us to find evidence in support of this hypothesis. Our work, which is the first to study the effect IMP has on the representations of neural networks, sheds parsimonious light on one way in which IMP can drive the formation of strong inductive biases.
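The two mechanisms named in the abstract, the IMP loop itself and the non-Gaussianity of hidden representations, can be made concrete with a small sketch. The following is a minimal, illustrative IMP loop on a toy one-hidden-layer FCN with an excess-kurtosis probe on the hidden pre-activations; the architecture, data, pruning rate, and rewinding scheme are assumptions for illustration rather than the paper's experimental setup, and the kurtosis probe is a generic non-Gaussianity measure, not the authors' cavity method.

```python
# Illustrative IMP loop on a toy FCN with a kurtosis probe on the hidden
# pre-activations. Data, architecture, and hyperparameters are assumptions.
import torch
import torch.nn as nn

def excess_kurtosis(x, eps=1e-8):
    """Mean excess kurtosis across hidden units; ~0 for Gaussian activations."""
    x = x - x.mean(dim=0, keepdim=True)
    var = x.var(dim=0, unbiased=False) + eps
    return ((x ** 4).mean(dim=0) / var ** 2 - 3.0).mean().item()

def train(model, X, y, mask, steps=200, lr=1e-2):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    with torch.no_grad():
        model[0].weight.mul_(mask)           # start each round from the sparse net
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(X), y).backward()
        opt.step()
        with torch.no_grad():
            model[0].weight.mul_(mask)       # keep pruned weights at exactly zero

torch.manual_seed(0)
X = torch.randn(512, 100)                    # stand-in for (whitened) image inputs
y = (X[:, 0] > 0).long()                     # toy labels
model = nn.Sequential(nn.Linear(100, 64), nn.ReLU(), nn.Linear(64, 2))
init_state = {k: v.clone() for k, v in model.state_dict().items()}
mask = torch.ones_like(model[0].weight)

for rnd in range(5):                         # IMP: train -> prune -> rewind -> repeat
    train(model, X, y, mask)
    with torch.no_grad():
        h = X @ model[0].weight.t() + model[0].bias   # hidden pre-activations
    print(f"round {rnd}: density={mask.mean().item():.2f}, "
          f"excess kurtosis={excess_kurtosis(h):.3f}")
    surviving = model[0].weight.detach().abs()[mask.bool()]
    thresh = torch.quantile(surviving, 0.2)  # prune the smallest 20% of survivors
    mask = mask * (model[0].weight.detach().abs() > thresh).float()
    model.load_state_dict(init_state)        # rewind weights to their initial values
```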
Related papers
- Bayesian Reasoning Enabled by Spin-Orbit Torque Magnetic Tunnel Junctions [7.081096702778852]
We present proof-of-concept experiments demonstrating the use of spin-orbit torque magnetic tunnel junctions (SOT-MTJs) in Bayesian network reasoning.
The parameters of the network can also approach the optimum through a simple point-by-point training algorithm.
We developed a simple medical diagnostic system using the SOT-MTJ as a random number generator and sampler.
arXiv Detail & Related papers (2025-04-11T05:02:27Z)
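For readers unfamiliar with the sampling view of Bayesian network reasoning used in the entry above, here is a toy sketch in which an ordinary pseudorandom generator stands in for the SOT-MTJ random bit source; the two-node network and its probabilities are invented for illustration and are not from the paper.

```python
# Toy sampling-based inference in a two-node Bayesian network ("disease ->
# symptom"); a generic RNG stands in for the SOT-MTJ bit source.
import numpy as np

rng = np.random.default_rng(0)
p_disease = 0.01                                   # prior on the root node (assumed)
p_symptom_given = {True: 0.9, False: 0.05}         # conditional probability table (assumed)

n, with_disease, with_symptom = 100_000, 0, 0
for _ in range(n):
    disease = bool(rng.random() < p_disease)       # sample the root
    symptom = rng.random() < p_symptom_given[disease]
    if symptom:                                    # condition on the observed symptom
        with_symptom += 1
        with_disease += disease

print("P(disease | symptom) ~= %.3f" % (with_disease / with_symptom))
# Exact value: 0.9*0.01 / (0.9*0.01 + 0.05*0.99) ~= 0.154
```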
- Out-of-Distribution Detection using Neural Activation Prior [15.673290330356194]
Out-of-distribution (OOD) detection is a crucial technique for deploying machine learning models in the real world.
We propose a simple yet effective Neural Activation Prior (NAP) for OOD detection.
Our method achieves state-of-the-art performance on the CIFAR benchmarks and the ImageNet dataset.
arXiv Detail & Related papers (2024-02-28T08:45:07Z)
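As a rough illustration of the idea that activation statistics carry OOD signal, the sketch below scores samples by the peak-to-mean ratio of each channel's spatial activations; this is a simplified stand-in inspired by the entry above, not the paper's exact scoring rule or hyperparameters.

```python
# Hedged sketch of an activation-statistics OOD score: compare each channel's
# peak spatial activation to its average response before pooling.
import numpy as np

def activation_shape_score(feature_maps):
    """feature_maps: (batch, channels, height, width); higher score -> treated as ID."""
    flat = feature_maps.reshape(feature_maps.shape[0], feature_maps.shape[1], -1)
    peak = flat.max(axis=-1)                      # strongest spatial response per channel
    mean = flat.mean(axis=-1) + 1e-6              # average response per channel
    return (peak / mean).mean(axis=-1)            # average peak-to-mean ratio per sample

# Toy usage: "ID" maps have sparse, localized peaks; "OOD" maps are diffuse noise.
rng = np.random.default_rng(0)
ind = np.maximum(rng.normal(0.1, 0.1, (8, 32, 7, 7)), 0)
ind[:, :, 3, 3] += 2.0                            # localized strong response
ood = np.maximum(rng.normal(0.3, 0.1, (8, 32, 7, 7)), 0)
print("ID score :", activation_shape_score(ind).mean())
print("OOD score:", activation_shape_score(ood).mean())
```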
- Beyond IID weights: sparse and low-rank deep Neural Networks are also Gaussian Processes [3.686808512438363]
We extend the proof of Matthews et al. to a larger class of initial weight distributions.
We show that fully-connected and convolutional networks with PSEUDO-IID distributions are all effectively equivalent up to their variance.
Using our results, one can identify the Edge-of-Chaos for a broader class of neural networks and tune them at criticality in order to enhance their training.
arXiv Detail & Related papers (2023-10-25T12:38:36Z)
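To make the Gaussian-process flavor of this claim tangible, the sketch below checks that a single pre-activation of a layer with sparse (hence non-IID-Gaussian) weights is still close to Gaussian across random initializations; the width, sparsity level, and moment check are illustrative assumptions, not the paper's construction.

```python
# Empirical check that a pre-activation with sparse, rescaled random weights is
# close to Gaussian over random draws of the weights.
import numpy as np

rng = np.random.default_rng(0)
d_in, keep_prob, trials = 1000, 0.1, 5000
x = rng.normal(size=d_in)                      # a fixed input

samples = []
for _ in range(trials):
    w = rng.normal(size=d_in) * (rng.random(d_in) < keep_prob)   # 90% of weights zeroed
    w /= np.sqrt(keep_prob * d_in)             # rescale so the variance matches a dense layer
    samples.append(float(w @ x))

z = np.array(samples)
print("mean %.3f  var %.3f  excess kurtosis %.3f"
      % (z.mean(), z.var(), ((z - z.mean()) ** 4).mean() / z.var() ** 2 - 3))
```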
- Approximate Thompson Sampling via Epistemic Neural Networks [26.872304174606278]
Epistemic neural networks (ENNs) are designed to produce accurate joint predictive distributions.
We show that ENNs serve this purpose well and illustrate how the quality of joint predictive distributions drives performance.
arXiv Detail & Related papers (2023-02-18T01:58:15Z)
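A minimal illustration of the Thompson-sampling loop that such joint predictive distributions feed into; here exact Beta posteriors on a toy Bernoulli bandit stand in for the epistemic network, so only the decision loop is shown, not the paper's ENN construction.

```python
# Thompson sampling on a toy Bernoulli bandit with exact Beta posteriors
# standing in for an epistemic model.
import numpy as np

rng = np.random.default_rng(0)
true_rates = np.array([0.3, 0.5, 0.7])        # hidden success probabilities (assumed)
alpha = np.ones(3)                            # Beta posterior parameters per arm
beta = np.ones(3)

for t in range(2000):
    theta = rng.beta(alpha, beta)             # sample one plausible world per arm
    arm = int(np.argmax(theta))               # act greedily in the sampled world
    reward = rng.random() < true_rates[arm]
    alpha[arm] += reward                      # posterior update
    beta[arm] += 1 - reward

print("posterior means:", alpha / (alpha + beta))
print("pull counts    :", alpha + beta - 2)
```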
- Why Neural Networks Work [0.32228025627337864]
We argue that many properties of fully-connected feedforward neural networks (FCNNs) are explainable from the analysis of a single pair of operations.
We show how expand-and-sparsify can explain the observed phenomena that have been discussed in the literature.
arXiv Detail & Related papers (2022-11-26T18:15:17Z)
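A hedged sketch of what an expand-and-sparsify pair looks like in code: a random projection into a much wider layer followed by top-k winner-take-all. The dimensions and k are illustrative choices, not the setting analyzed in the paper.

```python
# Expand-and-sparsify: random expansion to a wide layer, then keep only the
# k largest responses (winner-take-all).
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hidden, k = 50, 2000, 40              # expand ~40x, keep the top 2% of units
W = rng.normal(0, 1.0 / np.sqrt(d_in), (d_hidden, d_in))   # random expansion weights

def expand_and_sparsify(x):
    h = W @ x                                  # expand: project to the wide layer
    out = np.zeros_like(h)
    top = np.argpartition(h, -k)[-k:]          # sparsify: keep the k largest responses
    out[top] = h[top]
    return out

x = rng.normal(size=d_in)
code = expand_and_sparsify(x)
print("active units:", np.count_nonzero(code), "of", d_hidden)
```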
- Increasing the Accuracy of a Neural Network Using Frequency Selective Mesh-to-Grid Resampling [4.211128681972148]
We propose the use of keypoint frequency selective mesh-to-grid resampling (FSMR) for the processing of input data for neural networks.
We show that, depending on the network architecture and classification task, applying FSMR during training aids the learning process.
The classification accuracy can be increased by up to 4.31 percentage points for ResNet50 and the Oxflower17 dataset.
arXiv Detail & Related papers (2022-09-28T21:34:47Z)
- On the Neural Tangent Kernel Analysis of Randomly Pruned Neural Networks [91.3755431537592]
We study how random pruning of the weights affects a neural network's neural tangent kernel (NTK).
In particular, this work establishes an equivalence of the NTKs between a fully-connected neural network and its randomly pruned version.
arXiv Detail & Related papers (2022-03-27T15:22:19Z)
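To make the compared object concrete, the sketch below computes the empirical NTK (the Gram matrix of per-example parameter gradients) for a small scalar-output FCN and a randomly pruned, rescaled copy; the sizes, pruning ratio, and rescaling are toy assumptions, whereas the paper's equivalence is an asymptotic statement about wide networks.

```python
# Empirical NTK for a small scalar-output FCN and a randomly pruned copy.
import torch
import torch.nn as nn

def empirical_ntk(model, X):
    grads = []
    for x in X:
        out = model(x.unsqueeze(0)).squeeze()
        g = torch.autograd.grad(out, list(model.parameters()))
        grads.append(torch.cat([gi.reshape(-1) for gi in g]))
    J = torch.stack(grads)                      # (n_samples, n_params)
    return J @ J.t()

torch.manual_seed(0)
X = torch.randn(8, 20)
dense = nn.Sequential(nn.Linear(20, 512), nn.ReLU(), nn.Linear(512, 1))
pruned = nn.Sequential(nn.Linear(20, 512), nn.ReLU(), nn.Linear(512, 1))
pruned.load_state_dict(dense.state_dict())

with torch.no_grad():
    keep = 0.5                                  # keep each first-layer weight with prob. 0.5
    mask = (torch.rand_like(pruned[0].weight) < keep).float()
    pruned[0].weight.mul_(mask / keep ** 0.5)   # prune and rescale the survivors

print("dense NTK diagonal :", empirical_ntk(dense, X).diag()[:4])
print("pruned NTK diagonal:", empirical_ntk(pruned, X).diag()[:4])
```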
- Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks [79.74580058178594]
We analyze the performance of training a pruned neural network by analyzing the geometric structure of the objective function.
We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
arXiv Detail & Related papers (2021-10-12T01:11:07Z)
- FF-NSL: Feed-Forward Neural-Symbolic Learner [70.978007919101]
This paper introduces a neural-symbolic learning framework called Feed-Forward Neural-Symbolic Learner (FF-NSL).
FF-NSL integrates state-of-the-art ILP systems based on Answer Set semantics with neural networks in order to learn interpretable hypotheses from labelled unstructured data.
arXiv Detail & Related papers (2021-06-24T15:38:34Z)
- A tensor network representation of path integrals: Implementation and analysis [0.0]
We introduce a novel tensor network-based decomposition of path integral simulations involving the Feynman-Vernon influence functional.
The finite temporally non-local interactions introduced by the influence functional can be captured very efficiently using a matrix product state representation.
The flexibility of the AP-TNPI framework makes it a promising new addition to the family of path integral methods for non-equilibrium quantum dynamics.
arXiv Detail & Related papers (2021-06-23T16:41:54Z)
- Towards Evaluating and Training Verifiably Robust Neural Networks [81.39994285743555]
We study the relationship between IBP and CROWN, and prove that CROWN is always tighter than IBP when choosing appropriate bounding lines.
We propose a relaxed version of CROWN, linear bound propagation (LBP), that can be used to verify large networks to obtain lower verified errors.
arXiv Detail & Related papers (2021-04-01T13:03:48Z)
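For context, the following is a minimal interval bound propagation (IBP) pass through affine and ReLU layers, the looser of the two bounds being compared in the entry above; the weights and perturbation radius are random placeholders rather than a trained, verified model.

```python
# Minimal interval bound propagation through affine + ReLU layers.
import numpy as np

def ibp_affine(l, u, W, b):
    Wp, Wn = np.maximum(W, 0), np.minimum(W, 0)   # split weights by sign
    return Wp @ l + Wn @ u + b, Wp @ u + Wn @ l + b

def ibp_relu(l, u):
    return np.maximum(l, 0), np.maximum(u, 0)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(32, 10)) / np.sqrt(10), np.zeros(32)
W2, b2 = rng.normal(size=(2, 32)) / np.sqrt(32), np.zeros(2)

x = rng.normal(size=10)
eps = 0.05                                        # input perturbation radius (assumed)
l, u = x - eps, x + eps                           # input box
l, u = ibp_relu(*ibp_affine(l, u, W1, b1))
l, u = ibp_affine(l, u, W2, b2)
print("output lower bounds:", l)
print("output upper bounds:", u)
```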
- Sampling-free Variational Inference for Neural Networks with Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z)
- How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks [80.55378250013496]
We study how neural networks trained by gradient descent extrapolate what they learn outside the support of the training distribution.
Graph Neural Networks (GNNs) have shown some success in more complex tasks.
arXiv Detail & Related papers (2020-09-24T17:48:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.