Unveiling the Unseen: Identifiable Clusters in Trained Depthwise
Convolutional Kernels
- URL: http://arxiv.org/abs/2401.14469v1
- Date: Thu, 25 Jan 2024 19:05:53 GMT
- Title: Unveiling the Unseen: Identifiable Clusters in Trained Depthwise
Convolutional Kernels
- Authors: Zahra Babaiee, Peyman M. Kiasari, Daniela Rus, Radu Grosu
- Abstract summary: Recent advances in depthwise-separable convolutional neural networks (DS-CNNs) have led to novel architectures.
This paper reveals another striking property of DS-CNN architectures: discernible and explainable patterns emerge in their trained depthwise convolutional kernels in all layers.
- Score: 56.69755544814834
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Recent advances in depthwise-separable convolutional neural networks
(DS-CNNs) have led to novel architectures that surpass the performance of
classical CNNs by a considerable margin in scalability and accuracy. This paper
reveals another striking property of DS-CNN architectures: discernible and
explainable patterns emerge in their trained depthwise convolutional kernels in
all layers. Through an extensive analysis of millions of trained filters of
different sizes and from various models, we employed unsupervised clustering
with autoencoders to categorize these filters. Astonishingly, the patterns
converged into a few main clusters, each resembling difference-of-Gaussians
(DoG) functions and their first- and second-order derivatives. Notably, we were
able to classify over 95% and 90% of the filters from the state-of-the-art
ConvNeXtV2 and ConvNeXt models, respectively. This finding is not merely a
technological curiosity; it echoes the foundational models neuroscientists have
long proposed for the vision systems of mammals. Our results thus deepen our
understanding of the emergent properties of trained DS-CNNs and provide a
bridge between artificial and biological visual processing systems. More
broadly, they pave the way for more interpretable and biologically-inspired
neural network designs in the future.
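Illustrative sketch (not from the paper): the clusters described in the abstract correspond to a small family of analytic patterns, a difference of Gaussians (DoG) and its first- and second-order derivatives. The Python/NumPy sketch below builds such prototype kernels and matches a filter to them by cosine similarity. The 7x7 kernel size, the sigma values, and the prototype-matching step are illustrative assumptions; the paper itself categorizes millions of trained depthwise filters with autoencoder-based unsupervised clustering.
```python
# Illustrative sketch only (assumed 7x7 kernels and hand-picked sigmas); the paper
# clusters trained depthwise filters with autoencoders rather than fixed prototypes.
import numpy as np

def gaussian_2d(size=7, sigma=1.0):
    """Normalized 2D Gaussian sampled on a size x size grid."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return g / g.sum()

def dog_kernel(size=7, sigma1=1.0, sigma2=2.0):
    """Difference of Gaussians: a center-surround pattern."""
    return gaussian_2d(size, sigma1) - gaussian_2d(size, sigma2)

def gaussian_derivative_kernels(size=7, sigma=1.5):
    """First- and second-order x-derivatives of a Gaussian (via finite differences)."""
    g = gaussian_2d(size, sigma)
    d1 = np.gradient(g, axis=1)   # first derivative: odd, edge-like pattern
    d2 = np.gradient(d1, axis=1)  # second derivative: even, bar-like pattern
    return d1, d2

def match_to_prototypes(kernel, prototypes):
    """Assign a kernel to its closest prototype by (sign-invariant) cosine similarity."""
    k = kernel.ravel()
    k = k / (np.linalg.norm(k) + 1e-12)
    sims = []
    for p in prototypes:
        v = p.ravel()
        v = v / (np.linalg.norm(v) + 1e-12)
        sims.append(abs(float(k @ v)))
    return int(np.argmax(sims)), max(sims)

if __name__ == "__main__":
    d1, d2 = gaussian_derivative_kernels()
    prototypes = [dog_kernel(), d1, d2]
    # Stand-in for a real trained depthwise filter: a noisy DoG.
    noisy = dog_kernel() + 0.05 * np.random.randn(7, 7)
    idx, sim = match_to_prototypes(noisy, prototypes)
    print(f"closest prototype: {idx} (0=DoG, 1=1st deriv, 2=2nd deriv), similarity={sim:.3f}")
```
In the paper's setting, flattened trained depthwise kernels (e.g., the 7x7 depthwise convolutions of ConvNeXt) would take the place of the synthetic noisy filter above.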
Related papers
- Super Consistency of Neural Network Landscapes and Learning Rate Transfer [72.54450821671624]
We study the landscape through the lens of the loss Hessian.
We find that certain spectral properties under $\mu$P are largely independent of the size of the network.
We show that in the Neural Tangent Kernel (NTK) and other scaling regimes, the sharpness exhibits very different dynamics at different scales.
arXiv Detail & Related papers (2024-02-27T12:28:01Z)
- Deep Continuous Networks [21.849285945717632]
We propose deep continuous networks (DCNs), which combine spatially continuous filters with the continuous depth framework of neural ODEs.
This allows us to learn the spatial support of the filters during training, as well as model the continuous evolution of feature maps, linking DCNs closely to biological models.
We show that DCNs are versatile and highly applicable to standard image classification and reconstruction problems, where they improve parameter and data efficiency, and allow for meta-parametrization.
arXiv Detail & Related papers (2024-02-02T16:50:18Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- A Gradient Boosting Approach for Training Convolutional and Deep Neural Networks [0.0]
We introduce two procedures for training Convolutional Neural Networks (CNNs) and Deep Neural Networks (DNNs) based on Gradient Boosting (GB).
The presented models show superior performance in terms of classification accuracy with respect to standard CNNs and DNNs with the same architectures.
arXiv Detail & Related papers (2023-02-22T12:17:32Z)
- Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training.
We show that by a simple filtration on "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z)
- Improving Neural Predictivity in the Visual Cortex with Gated Recurrent Connections [0.0]
We aim to shift the focus to architectures that take into account lateral recurrent connections, a ubiquitous feature of the ventral visual stream, to devise adaptive receptive fields.
In order to increase the robustness of our approach and the biological fidelity of the activations, we employ specific data augmentation techniques.
arXiv Detail & Related papers (2022-03-22T17:27:22Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Spatial Dependency Networks: Neural Layers for Improved Generative Image Modeling [79.15521784128102]
We introduce a novel neural network for building image generators (decoders) and apply it to variational autoencoders (VAEs).
In our spatial dependency networks (SDNs), feature maps at each level of a deep neural net are computed in a spatially coherent way.
We show that augmenting the decoder of a hierarchical VAE with spatial dependency layers considerably improves density estimation.
arXiv Detail & Related papers (2021-03-16T07:01:08Z)
- Inferring Convolutional Neural Networks' accuracies from their architectural characterizations [0.0]
We study the relationships between a CNN's architecture and its performance.
We show that these architectural attributes can be predictive of the networks' performance in two specific computer vision-based physics problems.
We use machine learning models to predict whether a network can perform better than a certain threshold accuracy before training.
arXiv Detail & Related papers (2020-01-07T16:41:58Z)