Self-similarity Analysis in Deep Neural Networks
- URL: http://arxiv.org/abs/2507.17785v1
- Date: Wed, 23 Jul 2025 09:01:53 GMT
- Title: Self-similarity Analysis in Deep Neural Networks
- Authors: Jingyi Ding, Chengwen Qi, Hongfei Wang, Jianshe Wu, Licheng Jiao, Yuwei Guo, Jian Gao
- Abstract summary: Current research has found that some deep neural networks exhibit strong hierarchical self-similarity in feature representation or parameter distribution. This paper proposes a complex network method based on the output features of hidden-layer neurons to investigate the self-similarity of feature networks constructed at different hidden layers.
- Score: 33.89368443432198
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current research has found that some deep neural networks exhibit strong hierarchical self-similarity in feature representation or parameter distribution. However, aside from preliminary studies on how the power-law distribution of weights across different training stages affects model performance, there has been no quantitative analysis of how the self-similarity of hidden-space geometry influences model weight optimization, nor is there a clear understanding of the dynamic behavior of internal neurons. Therefore, this paper proposes a complex network modeling method based on the output features of hidden-layer neurons to investigate the self-similarity of feature networks constructed at different hidden layers, and analyzes how adjusting the degree of self-similarity in feature networks can enhance the classification performance of deep neural networks. Validated on three types of networks (MLP architectures, convolutional networks, and attention architectures), this study reveals that the degree of self-similarity exhibited by feature networks varies across model architectures. Furthermore, embedding constraints on the self-similarity of feature networks during training can improve the performance of self-similar deep neural networks (MLP architectures and attention architectures) by up to 6 percentage points.
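The abstract does not spell out the construction itself; as a hedged illustration of the general idea, one could build a feature network from a hidden layer's activations using a correlation threshold and estimate its self-similarity with a greedy box-covering fit. The threshold, box sizes, and covering heuristic below are illustrative assumptions, not the authors' exact choices.

```python
import numpy as np
import networkx as nx

def feature_network(activations, threshold=0.7):
    """Build a graph whose nodes are one hidden layer's neurons and whose edges
    connect neurons with strongly correlated outputs over a batch of inputs.

    activations: array of shape (n_samples, n_neurons).
    threshold:   absolute-correlation cutoff for adding an edge (assumed value).
    """
    corr = np.nan_to_num(np.corrcoef(activations.T))  # NaN-safe for constant neurons
    graph = nx.Graph()
    graph.add_nodes_from(range(corr.shape[0]))
    for i in range(corr.shape[0]):
        for j in range(i + 1, corr.shape[0]):
            if abs(corr[i, j]) >= threshold:
                graph.add_edge(i, j)
    return graph

def self_similarity_exponent(graph, box_sizes=(2, 3, 4, 5, 6)):
    """Estimate a fractal (self-similarity) exponent via greedy box covering:
    N_B(l_B) ~ l_B^(-d_B), so d_B is minus the slope of log N_B vs. log l_B."""
    lengths = dict(nx.all_pairs_shortest_path_length(graph))
    counts = []
    for l_b in box_sizes:
        uncovered = set(graph.nodes)
        n_boxes = 0
        while uncovered:
            seed = next(iter(uncovered))
            box = {v for v in uncovered if lengths[seed].get(v, np.inf) < l_b}
            uncovered -= box
            n_boxes += 1
        counts.append(n_boxes)
    slope, _ = np.polyfit(np.log(box_sizes), np.log(counts), 1)
    return -slope

# Hypothetical usage: activations of one hidden layer over a batch of inputs.
acts = np.random.randn(256, 128)
print("estimated self-similarity exponent:", self_similarity_exponent(feature_network(acts)))
```

A training-time constraint of the kind reported in the abstract could then penalize the gap between such an exponent and a target value, although a differentiable surrogate would be needed in practice.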
Related papers
- On Learnable Parameters of Optimal and Suboptimal Deep Learning Models [2.889799048595314]
We study the structural and operational aspects of deep learning models.
Our research focuses on the nuances of learnable parameter (weight) statistics, distributions, node interactions, and visualization.
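As a generic illustration of gathering per-layer weight statistics for this kind of analysis (a sketch, not the paper's protocol):

```python
import torch.nn as nn

def layer_weight_statistics(model: nn.Module):
    """Collect simple per-layer statistics of learnable weight matrices."""
    stats = {}
    for name, param in model.named_parameters():
        if param.dim() < 2:            # skip biases and normalization scales
            continue
        w = param.detach().flatten()
        stats[name] = {"mean": w.mean().item(),
                       "std": w.std().item(),
                       "max_abs": w.abs().max().item()}
    return stats

# Hypothetical small MLP purely for demonstration.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
for layer_name, s in layer_weight_statistics(model).items():
    print(layer_name, s)
```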
arXiv Detail & Related papers (2024-08-21T15:50:37Z)
- Convergence Analysis for Deep Sparse Coding via Convolutional Neural Networks [7.956678963695681]
We explore intersections between sparse coding and deep learning to enhance our understanding of feature extraction capabilities.
We derive convergence rates for convolutional neural networks (CNNs) in their ability to extract sparse features.
Inspired by the strong connection between sparse coding and CNNs, we explore training strategies to encourage neural networks to learn more sparse features.
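One standard way to encourage sparser features, offered here only as a hedged illustration of such a training strategy, is to add an L1 penalty on intermediate activations to the task loss:

```python
import torch
import torch.nn as nn

class SparseFeatureCNN(nn.Module):
    """Toy CNN that returns both logits and its intermediate feature maps."""
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        self.classifier = nn.Linear(32 * 4 * 4, n_classes)

    def forward(self, x):
        feats = self.features(x)
        return self.classifier(feats.flatten(1)), feats

model = SparseFeatureCNN()
criterion = nn.CrossEntropyLoss()
x, y = torch.randn(8, 1, 28, 28), torch.randint(0, 10, (8,))
logits, feats = model(x)
loss = criterion(logits, y) + 1e-4 * feats.abs().mean()  # L1 sparsity term
loss.backward()
```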
arXiv Detail & Related papers (2024-08-10T12:43:55Z)
- Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks.
We show that the networks acquire strong, data-dependent features.
Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z)
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
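As a rough sketch of the underlying data structure only (the paper's equivariant GNN encoder is not reproduced here), an MLP's parameters can be flattened into a graph whose nodes are neurons and whose edge weights are the connections:

```python
import torch
import torch.nn as nn

def mlp_to_parameter_graph(mlp: nn.Sequential):
    """Flatten an MLP into (edge_index, edge_weight): nodes are neurons,
    edges are the weighted connections between consecutive linear layers."""
    edges, weights, offset = [], [], 0
    for lin in (m for m in mlp if isinstance(m, nn.Linear)):
        out_dim, in_dim = lin.weight.shape
        for i in range(in_dim):
            for j in range(out_dim):
                edges.append((offset + i, offset + in_dim + j))
                weights.append(lin.weight[j, i].item())
        offset += in_dim
    edge_index = torch.tensor(edges, dtype=torch.long).t()
    return edge_index, torch.tensor(weights)

mlp = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3))
edge_index, edge_weight = mlp_to_parameter_graph(mlp)
print(edge_index.shape, edge_weight.shape)  # (2, 4*8 + 8*3) and (56,)
```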
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
- Reparameterization through Spatial Gradient Scaling [69.27487006953852]
Reparameterization aims to improve the generalization of deep neural networks by transforming convolutional layers into equivalent multi-branched structures during training.
We present a novel spatial gradient scaling method to redistribute learning focus among weights in convolutional networks.
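A minimal sketch of position-wise gradient scaling on a convolutional kernel; the scale values below are arbitrary placeholders, whereas the paper derives its own spatial distribution:

```python
import torch
import torch.nn as nn

def apply_spatial_gradient_scaling(conv: nn.Conv2d, scale_map: torch.Tensor):
    """Rescale the conv kernel's gradient position-wise during backprop.
    scale_map has shape (kH, kW) and is broadcast over channels."""
    def hook(grad):
        # grad shape: (out_channels, in_channels, kH, kW)
        return grad * scale_map.to(grad.device)
    conv.weight.register_hook(hook)

# Hypothetical example: emphasize the kernel centre over its border.
conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
scale = torch.tensor([[0.5, 1.0, 0.5],
                      [1.0, 2.0, 1.0],
                      [0.5, 1.0, 0.5]])
apply_spatial_gradient_scaling(conv, scale)

out = conv(torch.randn(1, 3, 16, 16)).sum()
out.backward()   # the stored weight gradient is now spatially rescaled
```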
arXiv Detail & Related papers (2023-03-05T17:57:33Z)
- Multiobjective Evolutionary Pruning of Deep Neural Networks with Transfer Learning for improving their Performance and Robustness [15.29595828816055]
This work proposes MO-EvoPruneDeepTL, a multi-objective evolutionary pruning algorithm.
We use Transfer Learning to adapt the last layers of Deep Neural Networks, by replacing them with sparse layers evolved by a genetic algorithm.
Experiments show that our proposal achieves promising results in all the objectives, and direct relations are presented.
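Sketching just one ingredient under stated assumptions (the multi-objective evolutionary loop and the transfer-learning setup are omitted): a binary mask over a final dense layer that an evolutionary algorithm could mutate and select on accuracy and sparsity:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedLinear(nn.Module):
    """Final classification layer whose connections can be switched off by a
    binary mask; an evolutionary loop would evolve and select such masks."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.register_buffer("mask", torch.ones(out_features, in_features))

    def forward(self, x):
        return F.linear(x, self.linear.weight * self.mask, self.linear.bias)

def mutate(mask, rate=0.05):
    """Flip a small fraction of mask entries to produce an offspring mask."""
    flips = (torch.rand_like(mask) < rate).float()
    return (mask + flips) % 2

head = MaskedLinear(512, 10)
head.mask.copy_(mutate(head.mask))           # one hypothetical mutation step
sparsity = 1.0 - head.mask.mean().item()     # objective: fewer connections
logits = head(torch.randn(4, 512))           # objective: task accuracy
print(f"pruned fraction: {sparsity:.3f}, logits shape: {tuple(logits.shape)}")
```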
arXiv Detail & Related papers (2023-02-20T19:33:38Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Creating Powerful and Interpretable Models with Regression Networks [2.2049183478692584]
We propose a novel architecture, Regression Networks, which combines the power of neural networks with the understandability of regression analysis.
We demonstrate that the models exceed the state-of-the-art performance of interpretable models on several benchmark datasets.
arXiv Detail & Related papers (2021-07-30T03:37:00Z)
- Joint Learning of Neural Transfer and Architecture Adaptation for Image Recognition [77.95361323613147]
Current state-of-the-art visual recognition systems rely on pretraining a neural network on a large-scale dataset and finetuning the network weights on a smaller dataset.
In this work, we prove that dynamically adapting network architectures tailored for each domain task, along with weight finetuning, benefits both efficiency and effectiveness.
Our method can be easily generalized to an unsupervised paradigm by replacing supernet training with self-supervised learning in the source domain tasks and performing linear evaluation in the downstream tasks.
arXiv Detail & Related papers (2021-03-31T08:15:17Z)
- Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
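As a hedged sketch of the sampling step only (the GNN-based contrastive model itself is not shown), a positive pair can couple a target node with its own local subgraph, while a negative pair uses a subgraph around a different node:

```python
import random
import networkx as nx

def sample_instance_pair(graph: nx.Graph, target, subgraph_size=4):
    """Sample a contrastive instance pair for a target node: the positive
    instance is a small subgraph around the node itself, the negative one
    is a subgraph around a randomly chosen different node."""
    def local_subgraph(center):
        neighbors = list(graph.neighbors(center))
        random.shuffle(neighbors)
        return graph.subgraph([center] + neighbors[:subgraph_size - 1])

    other = random.choice([n for n in graph.nodes if n != target])
    return local_subgraph(target), local_subgraph(other)

# Toy attributed-network stand-in; node attributes are omitted for brevity.
g = nx.karate_club_graph()
positive, negative = sample_instance_pair(g, target=0)
print(len(positive), len(negative))
```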
arXiv Detail & Related papers (2021-02-27T03:17:20Z)
- Deep Randomized Neural Networks [12.333836441649343]
Randomized Neural Networks explore the behavior of neural systems where the majority of connections are fixed.
This chapter surveys all the major aspects regarding the design and analysis of Randomized Neural Networks.
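The simplest member of this family, shown here as an illustrative sketch rather than the chapter's full taxonomy, keeps the hidden weights fixed at their random initialization and trains only a linear readout:

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed random hidden layer: weights are drawn once and never trained.
W_hidden = rng.normal(size=(64, 10))      # 10 inputs -> 64 random features
b_hidden = rng.normal(size=64)

def random_features(X):
    return np.tanh(X @ W_hidden.T + b_hidden)

# Only the linear readout is learned, here by ridge regression in closed form.
X_train = rng.normal(size=(200, 10))
y_train = rng.normal(size=200)
H = random_features(X_train)
ridge = 1e-2
W_out = np.linalg.solve(H.T @ H + ridge * np.eye(64), H.T @ y_train)

y_pred = random_features(X_train) @ W_out
print("train MSE:", np.mean((y_pred - y_train) ** 2))
```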
arXiv Detail & Related papers (2020-02-27T17:57:58Z)