Towards Explaining Hypercomplex Neural Networks
- URL: http://arxiv.org/abs/2403.17929v1
- Date: Tue, 26 Mar 2024 17:58:07 GMT
- Title: Towards Explaining Hypercomplex Neural Networks
- Authors: Eleonora Lopez, Eleonora Grassucci, Debora Capriotti, Danilo Comminiello,
- Abstract summary: Hypercomplex neural networks are gaining increasing interest in the deep learning community.
In this paper, we propose inherently interpretable PHNNs and quaternion-like networks.
We draw insights into how this unique branch of neural models operates.
- Score: 6.543091030789653
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hypercomplex neural networks are gaining increasing interest in the deep learning community. The attention directed towards hypercomplex models originates from several aspects, spanning from purely theoretical and mathematical characteristics to the practical advantage of lightweight models over conventional networks, and their unique properties to capture both global and local relations. In particular, a branch of these architectures, parameterized hypercomplex neural networks (PHNNs), has also gained popularity due to their versatility across a multitude of application domains. Nonetheless, only few attempts have been made to explain or interpret their intricacies. In this paper, we propose inherently interpretable PHNNs and quaternion-like networks, thus without the need for any post-hoc method. To achieve this, we define a type of cosine-similarity transform within the parameterized hypercomplex domain. This PHB-cos transform induces weight alignment with relevant input features and allows to reduce the model into a single linear transform, rendering it directly interpretable. In this work, we start to draw insights into how this unique branch of neural models operates. We observe that hypercomplex networks exhibit a tendency to concentrate on the shape around the main object of interest, in addition to the shape of the object itself. We provide a thorough analysis, studying single neurons of different layers and comparing them against how real-valued networks learn. The code of the paper is available at https://github.com/ispamm/HxAI.
Related papers
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z) - PHYDI: Initializing Parameterized Hypercomplex Neural Networks as
Identity Functions [9.836302410524842]
parameterized hypercomplex neural networks (PHNNs) are growing in size and no techniques have been adopted so far to control their convergence at a large scale.
In this paper, we study PHNNs convergence and propose a method to improve their convergence at different scales.
We show the effectiveness of this approach in different benchmarks and with common PHNNs with ResNets- and Transformer-based architecture.
arXiv Detail & Related papers (2023-10-11T15:56:55Z) - How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z) - Introduction to dynamical mean-field theory of generic random neural
networks [2.0711789781518752]
It is not easy for beginners to access the essence of this tool and the underlying physics.
We give a pedagogical introduction of this method in a particular example of generic random neural networks.
The numerical implementation of solving the integro-differential mean-field equations is also detailed.
arXiv Detail & Related papers (2023-05-15T09:01:40Z) - When Deep Learning Meets Polyhedral Theory: A Survey [6.899761345257773]
In the past decade, deep became the prevalent methodology for predictive modeling thanks to the remarkable accuracy of deep neural learning.
Meanwhile, the structure of neural networks converged back to simplerwise and linear functions.
arXiv Detail & Related papers (2023-04-29T11:46:53Z) - Deep Architecture Connectivity Matters for Its Convergence: A
Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training.
We show that by a simple filtration on "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z) - SPINN: Sparse, Physics-based, and Interpretable Neural Networks for PDEs [0.0]
We introduce a class of Sparse, Physics-based, and Interpretable Neural Networks (SPINN) for solving ordinary and partial differential equations.
By reinterpreting a traditional meshless representation of solutions of PDEs as a special sparse deep neural network, we develop a class of sparse neural network architectures that are interpretable.
arXiv Detail & Related papers (2021-02-25T17:45:50Z) - Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective to represent a network into a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and owns adaptability to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z) - Hyperbolic Neural Networks++ [66.16106727715061]
We generalize the fundamental components of neural networks in a single hyperbolic geometry model, namely, the Poincar'e ball model.
Experiments show the superior parameter efficiency of our methods compared to conventional hyperbolic components, and stability and outperformance over their Euclidean counterparts.
arXiv Detail & Related papers (2020-06-15T08:23:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.