Analytic Convolutional Layer: A Step to Analytic Neural Network
        - URL: http://arxiv.org/abs/2407.06087v1
 - Date: Wed, 3 Jul 2024 07:10:54 GMT
 - Title: Analytic Convolutional Layer: A Step to Analytic Neural Network
 - Authors: Jingmao Cui, Donglai Tao, Linmi Tao, Ruiyang Liu, Yu Cheng
 - Abstract summary: Analytic Convolutional Layer (ACL) is a mosaic of analytical convolution kernels (ACKs) and traditional convolution kernels.
ACLs offer a means for neural network interpretation, thereby paving the way for the intrinsic interpretability of neural networks.
 - Score: 15.596391258983463
 - License: http://creativecommons.org/licenses/by/4.0/
 - Abstract: The prevailing approach to embedding prior knowledge within convolutional layers typically involves designing steerable kernels or modulating kernels with designated kernel banks. In this study, we introduce the Analytic Convolutional Layer (ACL), an innovative model-driven convolutional layer that is a mosaic of analytical convolution kernels (ACKs) and traditional convolution kernels. ACKs are characterized by mathematical functions governed by analytic kernel parameters (AKPs) learned during training. Learnable AKPs permit the adaptive update of the incorporated knowledge to align with the feature representation of the data. Our extensive experiments demonstrate that ACLs not only have a remarkable capacity for feature representation with a reduced number of parameters but also attain increased reliability through the analytical formulation of ACKs. Furthermore, ACLs offer a means for neural network interpretation, thereby paving the way for the intrinsic interpretability of neural networks. The source code will be published together with the paper.
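Since the abstract only describes the layer at a high level, the following PyTorch sketch is a hedged illustration of the idea rather than the authors' implementation: a convolutional layer whose kernel bank mixes analytically generated kernels, here Gabor functions with learnable parameters standing in for ACKs/AKPs, with ordinary freely learned kernels. The class name, the Gabor choice, the half-and-half split, and all hyperparameters are assumptions made for illustration.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


class AnalyticConvLayer(nn.Module):
    """Illustrative ACL-style layer (an assumption, not the paper's code):
    part of the kernel bank is produced by an analytic function (Gabor)
    with learnable parameters, the rest are ordinary free kernels."""

    def __init__(self, in_ch, out_ch, k=5, n_analytic=None):
        super().__init__()
        self.k = k
        self.n_analytic = out_ch // 2 if n_analytic is None else n_analytic
        n_free = out_ch - self.n_analytic

        # Analytic kernel parameters (AKP-like): orientation, width, and
        # wavelength per (analytic kernel, input channel) pair.
        self.theta = nn.Parameter(torch.rand(self.n_analytic, in_ch) * math.pi)
        self.sigma = nn.Parameter(torch.full((self.n_analytic, in_ch), 2.0))
        self.lam = nn.Parameter(torch.full((self.n_analytic, in_ch), 4.0))

        # Traditional, freely learned convolution kernels.
        self.free = nn.Parameter(torch.randn(n_free, in_ch, k, k) * 0.1)

    def analytic_kernels(self):
        r = (self.k - 1) / 2
        ys, xs = torch.meshgrid(
            torch.linspace(-r, r, self.k, device=self.theta.device),
            torch.linspace(-r, r, self.k, device=self.theta.device),
            indexing="ij",
        )
        theta = self.theta[..., None, None]
        sigma = self.sigma[..., None, None]
        lam = self.lam[..., None, None]
        xr = xs * torch.cos(theta) + ys * torch.sin(theta)
        yr = -xs * torch.sin(theta) + ys * torch.cos(theta)
        # Real Gabor filter: Gaussian envelope times an oriented cosine wave.
        return torch.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2)) * torch.cos(
            2 * math.pi * xr / lam
        )

    def forward(self, x):
        # Mosaic of analytic and traditional kernels used as one weight tensor.
        weight = torch.cat([self.analytic_kernels(), self.free], dim=0)
        return F.conv2d(x, weight, padding=self.k // 2)
```

On a 32x32 RGB input, `AnalyticConvLayer(3, 16)(torch.randn(1, 3, 32, 32))` returns a `(1, 16, 32, 32)` tensor, so the layer can stand in for a 5x5 `nn.Conv2d(3, 16, 5, padding=2)`; each analytic kernel-channel pair carries three scalars instead of 25 free weights, loosely mirroring the parameter reduction claimed in the abstract.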
 
       
      
        Related papers
        - Global Convergence and Rich Feature Learning in $L$-Layer Infinite-Width   Neural Networks under $μ$P Parametrization [66.03821840425539]
In this paper, we investigate the training dynamics of $L$-layer neural networks trained with stochastic gradient descent (SGD), using the tensor program framework.
We show that SGD enables these networks to learn linearly independent features that substantially deviate from their initial values.
This rich feature space captures relevant data information and ensures that any convergent point of the training process is a global minimum.
arXiv  Detail & Related papers  (2025-03-12T17:33:13Z)
- Convergence Analysis for Deep Sparse Coding via Convolutional Neural Networks [7.956678963695681]
We introduce a novel class of Deep Sparse Coding (DSC) models.
We derive convergence rates for CNNs in their ability to extract sparse features.
Inspired by the strong connection between sparse coding and CNNs, we explore training strategies to encourage neural networks to learn more sparse features.
arXiv  Detail & Related papers  (2024-08-10T12:43:55Z)
- Characterizing out-of-distribution generalization of neural networks: application to the disordered Su-Schrieffer-Heeger model [38.79241114146971]
We show how interpretability methods can increase trust in predictions of a neural network trained to classify quantum phases.
In particular, we show that we can ensure better out-of-distribution generalization in the complex classification problem.
This work is an example of how the systematic use of interpretability methods can improve the performance of NNs in scientific problems.
arXiv  Detail & Related papers  (2024-06-14T13:24:32Z)
- Gradient Descent in Neural Networks as Sequential Learning in RKBS [63.011641517977644]
We construct an exact power-series representation of the neural network in a finite neighborhood of the initial weights.
We prove that, regardless of width, the training sequence produced by gradient descent can be exactly replicated by regularized sequential learning.
arXiv  Detail & Related papers  (2023-02-01T03:18:07Z)
- Incorporating Prior Knowledge into Neural Networks through an Implicit Composite Kernel [1.6383321867266318]
Implicit Composite Kernel (ICK) is a kernel that combines a kernel implicitly defined by a neural network with a second kernel function chosen to model known properties.
We demonstrate ICK's superior performance and flexibility on both synthetic and real-world data sets. A minimal kernel-combination sketch in the spirit of ICK appears after this list.
arXiv  Detail & Related papers  (2022-05-15T21:32:44Z)
- Rapid training of deep neural networks without skip connections or normalization layers using Deep Kernel Shaping [46.083745557823164]
We identify the main pathologies present in deep networks that prevent them from training fast and generalizing to unseen data.
We show how these can be avoided by carefully controlling the "shape" of the network's kernel function.
arXiv  Detail & Related papers  (2021-10-05T00:49:36Z)
- PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv  Detail & Related papers  (2021-03-17T08:28:30Z)
- How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks [80.55378250013496]
We study how neural networks trained by gradient descent extrapolate what they learn outside the support of the training distribution.
 Graph Neural Networks (GNNs) have shown some success in more complex tasks.
arXiv  Detail & Related papers  (2020-09-24T17:48:59Z)
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv  Detail & Related papers  (2020-07-03T01:37:16Z)
- Learning to Learn Kernels with Variational Random Features [118.09565227041844]
We introduce kernels with random Fourier features in the meta-learning framework to leverage their strong few-shot learning ability.
We formulate the optimization of MetaVRF as a variational inference problem.
We show that MetaVRF delivers much better, or at least competitive, performance compared to existing meta-learning alternatives. A short random Fourier feature sketch appears after this list.
arXiv  Detail & Related papers  (2020-06-11T18:05:29Z)
- Kernel and Rich Regimes in Overparametrized Models [69.40899443842443]
We show that gradient descent on overparametrized multilayer networks can induce rich implicit biases that are not RKHS norms.
We also demonstrate this transition empirically for more complex matrix factorization models and multilayer non-linear networks.
arXiv  Detail & Related papers  (2020-02-20T15:43:02Z)
- Fast Estimation of Information Theoretic Learning Descriptors using Explicit Inner Product Spaces [4.5497405861975935]
Kernel methods form a theoretically-grounded, powerful and versatile framework to solve nonlinear problems in signal processing and machine learning.
Recently, we proposed no-trick (NT) kernel adaptive filtering (KAF).
We focus on a family of fast, scalable, and accurate estimators for ITL using explicit inner product space kernels.
arXiv  Detail & Related papers  (2020-01-01T20:21:12Z) 
        This list is automatically generated from the titles and abstracts of the papers on this site.
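For readers skimming the related work: the Implicit Composite Kernel entry above combines a kernel defined implicitly by a neural network with a second, hand-chosen kernel. The snippet below is a minimal, hedged sketch of that general idea, not the authors' model; the additive combination, the RBF choice, and every name in it are assumptions made for illustration.

```python
import torch
import torch.nn as nn


class ImplicitPlusExplicitKernel(nn.Module):
    """Toy ICK-style combination: k(x, y) = <f(x), f(y)> + k_rbf(x, y),
    where f is a small network (implicit kernel) and the RBF term encodes
    a known property such as smoothness in the raw inputs."""

    def __init__(self, in_dim, feat_dim=16, lengthscale=1.0):
        super().__init__()
        self.f = nn.Sequential(
            nn.Linear(in_dim, 32), nn.ReLU(), nn.Linear(32, feat_dim)
        )
        self.lengthscale = lengthscale

    def forward(self, x, y):
        k_nn = self.f(x) @ self.f(y).T           # implicit, learned part
        d2 = torch.cdist(x, y).pow(2)            # squared pairwise distances
        k_rbf = torch.exp(-d2 / (2 * self.lengthscale ** 2))
        return k_nn + k_rbf                      # composite Gram matrix
```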
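Similarly, the MetaVRF entry builds on random Fourier features. The few lines below only illustrate the standard random Fourier feature approximation of a Gaussian kernel; MetaVRF itself learns the feature distribution variationally, which is not shown, and the dimensions and lengthscale here are arbitrary.

```python
import numpy as np


def random_fourier_features(X, n_features=2048, lengthscale=1.0, seed=0):
    """Map X of shape (n, d) to features whose inner products approximate
    the Gaussian kernel exp(-||x - y||^2 / (2 * lengthscale^2))."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=1.0 / lengthscale, size=(d, n_features))
    b = rng.uniform(0.0, 2 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)


X = np.random.randn(5, 3)
Z = random_fourier_features(X)
approx = Z @ Z.T                                            # ~ kernel matrix
exact = np.exp(-((X[:, None] - X[None]) ** 2).sum(-1) / 2)  # true Gaussian kernel
```

The approximation `approx` converges to `exact` as `n_features` grows, which is what makes such features attractive as compact kernel surrogates in few-shot settings.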
       
     