Understanding Weight Similarity of Neural Networks via Chain
Normalization Rule and Hypothesis-Training-Testing
- URL: http://arxiv.org/abs/2208.04369v1
- Date: Mon, 8 Aug 2022 19:11:03 GMT
- Title: Understanding Weight Similarity of Neural Networks via Chain
Normalization Rule and Hypothesis-Training-Testing
- Authors: Guangcong Wang and Guangrun Wang and Wenqi Liang and Jianhuang Lai
- Abstract summary: We present a weight similarity measure that can quantify the weight similarity of non-convex neural networks.
We first normalize the weights of neural networks by a chain normalization rule, which is used for weight representation learning and weight similarity measurement.
We extend the traditional hypothesis-testing method to a hypothesis-training-testing statistical inference method to validate the hypothesis on the weight similarity of neural networks.
- Score: 58.401504709365284
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a weight similarity measure that can quantify the weight
similarity of non-convex neural networks. To understand the weight similarity
of different trained models, we propose to extract the feature representation
from the weights of neural networks. We first normalize the weights of neural
networks by introducing a chain normalization rule, which is used for weight
representation learning and weight similarity measure. We extend the
traditional hypothesis-testing method to a hypothesis-training-testing
statistical inference method to validate the hypothesis on the weight
similarity of neural networks. With the chain normalization rule and the new
statistical inference, we study the weight similarity measure on Multi-Layer
Perceptron (MLP), Convolutional Neural Network (CNN), and Recurrent Neural
Network (RNN), and find that the weights of an identical neural network
optimized with the Stochastic Gradient Descent (SGD) algorithm converge to a
similar local solution in a metric space. The weight similarity measure
provides more insight into the local solutions of neural networks. Experiments
on several datasets consistently validate the hypothesis of weight similarity
measure.
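The abstract does not spell out the chain normalization rule itself, so the sketch below only illustrates the general recipe: normalize each layer's weights before comparing two independently trained networks of the same architecture. The per-layer unit-norm scaling, the cosine similarity, and all function names are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def normalize_layer(w, eps=1e-12):
    """Scale a layer's weight matrix to unit Frobenius norm.
    NOTE: a simplifying stand-in for the paper's chain normalization rule,
    which couples the scaling of consecutive layers."""
    return w / (np.linalg.norm(w) + eps)

def weight_similarity(weights_a, weights_b):
    """Cosine similarity between two networks of identical architecture.
    `weights_a`, `weights_b`: lists of per-layer weight arrays."""
    va = np.concatenate([normalize_layer(w).ravel() for w in weights_a])
    vb = np.concatenate([normalize_layer(w).ravel() for w in weights_b])
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))

# Toy usage: two 2-layer MLPs "trained" from different random seeds.
rng = np.random.default_rng(0)
net_a = [rng.normal(size=(784, 128)), rng.normal(size=(128, 10))]
net_b = [rng.normal(size=(784, 128)), rng.normal(size=(128, 10))]
print(weight_similarity(net_a, net_b))
```

In the paper's setting, the similarity of two SGD-trained copies of the same architecture would then be compared against a suitable reference under the hypothesis-training-testing procedure.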
Related papers
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
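As a rough illustration of the entry above (representing a network as a computational graph of its parameters), the snippet below converts a small MLP into node features, an edge index, and edge features that a graph neural network could consume. The indexing scheme and feature choices are assumptions made for illustration, not the cited paper's construction.

```python
import numpy as np

def mlp_to_graph(weights, biases):
    """Turn an MLP into (node_features, edge_index, edge_features).
    Nodes are neurons (bias as the node feature); edges carry the weights.
    This layout is illustrative, not the cited paper's exact encoding."""
    sizes = [weights[0].shape[0]] + [w.shape[1] for w in weights]
    offsets = np.cumsum([0] + sizes)            # first node id of each layer
    node_feat = np.concatenate([np.zeros(sizes[0])] + list(biases))
    src, dst, edge_feat = [], [], []
    for l, w in enumerate(weights):             # w: (in_dim, out_dim)
        for i in range(w.shape[0]):
            for j in range(w.shape[1]):
                src.append(offsets[l] + i)
                dst.append(offsets[l + 1] + j)
                edge_feat.append(w[i, j])
    return node_feat, np.array([src, dst]), np.array(edge_feat)

rng = np.random.default_rng(0)
W = [rng.normal(size=(4, 3)), rng.normal(size=(3, 2))]
b = [rng.normal(size=3), rng.normal(size=2)]
nodes, edge_index, edges = mlp_to_graph(W, b)
print(nodes.shape, edge_index.shape, edges.shape)  # (9,) (2, 18) (18,)
```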
- Improved Generalization of Weight Space Networks via Augmentations [53.87011906358727]
Learning in deep weight spaces (DWS) is an emerging research direction, with applications to 2D and 3D neural fields (INRs, NeRFs). Unfortunately, weight-space models tend to suffer from substantial overfitting.
We empirically analyze the reasons for this overfitting and find that a key reason is the lack of diversity in DWS datasets.
To address this, we explore strategies for data augmentation in weight spaces and propose a MixUp method adapted for weight spaces.
arXiv Detail & Related papers (2024-02-06T15:34:44Z)
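To make the weight-space MixUp idea above concrete, here is a minimal, hypothetical version that linearly interpolates two flattened weight vectors and their labels. It deliberately ignores neuron alignment between the two networks, which the cited paper also studies; the sampling parameters are arbitrary.

```python
import numpy as np

def weight_space_mixup(w_a, w_b, y_a, y_b, alpha=0.2, rng=None):
    """Plain MixUp on flattened weight vectors and one-hot labels.
    NOTE: a simplified sketch; alignment-aware variants are not shown."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    w_mix = lam * w_a + (1.0 - lam) * w_b
    y_mix = lam * y_a + (1.0 - lam) * y_b
    return w_mix, y_mix

rng = np.random.default_rng(0)
w_a, w_b = rng.normal(size=1000), rng.normal(size=1000)   # two flattened weight vectors
y_a, y_b = np.eye(10)[3], np.eye(10)[7]                   # their one-hot labels
w_mix, y_mix = weight_space_mixup(w_a, w_b, y_a, y_b, rng=rng)
```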
- Probabilistic Weight Fixing: Large-scale training of neural network weight uncertainties for quantization [7.2282857478457805]
Weight-sharing quantization has emerged as a technique to reduce energy expenditure during inference in large neural networks.
This paper proposes a probabilistic framework based on Bayesian neural networks (BNNs) and a variational relaxation to identify which weights can be moved to which cluster centre.
Our method outperforms the state-of-the-art quantization method in top-1 accuracy by 1.6% on ImageNet using DeiT-Tiny.
arXiv Detail & Related papers (2023-09-24T08:04:28Z)
- Variational Neural Networks [88.24021148516319]
We propose a method for uncertainty estimation in neural networks called Variational Neural Network (VNN).
VNN generates parameters for the output distribution of a layer by transforming its inputs with learnable sub-layers.
In uncertainty quality estimation experiments, we show that VNNs achieve better uncertainty quality than Monte Carlo Dropout or Bayes By Backpropagation methods.
arXiv Detail & Related papers (2022-07-04T15:41:02Z)
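The VNN entry above describes a layer whose output-distribution parameters are produced by learnable sub-layers applied to the same input. Below is a rough PyTorch sketch of that idea with Gaussian outputs sampled via the reparameterization trick; the layer sizes and the softplus parameterization are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VariationalLayer(nn.Module):
    """One layer with two sub-layers: one predicts the mean, the other the
    (positive) standard deviation of a Gaussian over the layer's output."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.mu = nn.Linear(d_in, d_out)
        self.sigma = nn.Linear(d_in, d_out)

    def forward(self, x):
        mean = self.mu(x)
        std = F.softplus(self.sigma(x)) + 1e-6      # keep std strictly positive
        return mean + std * torch.randn_like(std)   # reparameterized sample

layer = VariationalLayer(16, 8)
out = layer(torch.randn(4, 16))   # a different sample on every forward pass
print(out.shape)                   # torch.Size([4, 8])
```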
- Compact representations of convolutional neural networks via weight pruning and quantization [63.417651529192014]
We propose a novel storage format for convolutional neural networks (CNNs) based on source coding and leveraging both weight pruning and quantization.
We achieve a reduction of space occupancy up to 0.6% on fully connected layers and 5.44% on the whole network, while performing at least as competitively as the baseline.
arXiv Detail & Related papers (2021-08-28T20:39:54Z)
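As a rough illustration of the pruning-plus-quantization pipeline in the entry above (the subsequent source-coding stage is omitted, and the sparsity level and bit-width are arbitrary assumptions), this sketch zeroes out small-magnitude weights and maps the survivors to a small uniform codebook:

```python
import numpy as np

def prune_and_quantize(w, sparsity=0.8, n_bits=3):
    """Zero out the smallest-magnitude weights, then map the survivors to a
    symmetric uniform codebook. Entropy coding is not shown."""
    threshold = np.quantile(np.abs(w).ravel(), sparsity)
    mask = np.abs(w) > threshold                    # keep the largest (1 - sparsity) fraction
    pruned = np.where(mask, w, 0.0)
    scale = np.abs(pruned).max() or 1.0
    levels = 2 ** (n_bits - 1) - 1
    quantized = np.round(pruned / scale * levels) / levels * scale
    return quantized * mask

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256))
w_pq = prune_and_quantize(w)
print((w_pq == 0).mean(), np.unique(w_pq).size)     # ~0.8 sparsity, few distinct values
```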
- Tensor-Train Networks for Learning Predictive Modeling of Multidimensional Data [0.0]
A promising strategy is based on tensor networks, which have been very successful in physical and chemical applications.
We show that the weights of a multidimensional regression model can be learned by means of tensor networks, with the aim of obtaining a powerful yet compact representation.
An algorithm based on alternating least squares is proposed for approximating the weights in TT-format with reduced computational cost.
arXiv Detail & Related papers (2021-01-22T16:14:38Z)
- A Greedy Algorithm for Quantizing Neural Networks [4.683806391173103]
We propose a new computationally efficient method for quantizing the weights of pre-trained neural networks.
Our method deterministically quantizes layers in an iterative fashion with no complicated re-training required.
arXiv Detail & Related papers (2020-10-29T22:53:10Z)
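The greedy quantization entry above rounds weights layer by layer with no retraining. The cited algorithm quantizes weights sequentially while using layer inputs to correct the accumulated error; the snippet below is a much simpler, data-free stand-in that only carries the rounding residual along each weight row, so it should be read as a simplification rather than the cited method.

```python
import numpy as np

def greedy_quantize_row(row, levels):
    """Quantize one weight row greedily: pick the nearest codebook level for
    each entry while carrying the residual error into the next entry."""
    q = np.empty_like(row)
    carry = 0.0
    for i, w in enumerate(row):
        target = w + carry
        q[i] = levels[np.argmin(np.abs(levels - target))]
        carry = target - q[i]
    return q

def quantize_layer(w, n_levels=8):
    """Uniform symmetric alphabet sized to the layer's weight range."""
    scale = np.abs(w).max()
    levels = np.linspace(-scale, scale, n_levels)
    return np.vstack([greedy_quantize_row(row, levels) for row in w])

rng = np.random.default_rng(0)
w_q = quantize_layer(rng.normal(size=(64, 128)))
print(np.unique(w_q).size)   # at most n_levels distinct values
```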
- Measurement error models: from nonparametric methods to deep neural networks [3.1798318618973362]
We propose an efficient neural network design for estimating measurement error models.
We use a fully connected feed-forward neural network to approximate the regression function $f(x)$.
We conduct an extensive numerical study to compare the neural network approach with classical nonparametric methods.
arXiv Detail & Related papers (2020-07-15T06:05:37Z)
- Distance-Based Regularisation of Deep Networks for Fine-Tuning [116.71288796019809]
We develop an algorithm that constrains a hypothesis class to a small sphere centred on the initial pre-trained weights.
Empirical evaluation shows that our algorithm works well, corroborating our theoretical results.
arXiv Detail & Related papers (2020-02-19T16:00:47Z)
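The last entry constrains the fine-tuned hypothesis to a small sphere around the pre-trained weights. One common way to realize such a constraint is to project the parameters back onto an L2 ball around the initialization after every optimizer step; the PyTorch sketch below does exactly that. The per-tensor projection and the chosen radius are illustrative assumptions, not necessarily the cited paper's exact procedure.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def project_to_ball(model, init_params, radius=1.0):
    """Pull each parameter tensor back inside an L2 ball of the given radius
    centred on its pre-trained value."""
    for name, p in model.named_parameters():
        delta = p - init_params[name]
        norm = delta.norm()
        if norm > radius:
            p.copy_(init_params[name] + delta * (radius / norm))

# Toy fine-tuning step with the projection applied after the update.
model = nn.Linear(8, 2)                                   # stand-in "pre-trained" model
init_params = {n: p.detach().clone() for n, p in model.named_parameters()}
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(16, 8), torch.randint(0, 2, (16,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
opt.step()
project_to_ball(model, init_params, radius=0.05)
```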
This list is automatically generated from the titles and abstracts of the papers on this site.