Dimensionality Reduction in Deep Learning via Kronecker Multi-layer
Architectures
- URL: http://arxiv.org/abs/2204.04273v1
- Date: Fri, 8 Apr 2022 19:54:52 GMT
- Authors: Jarom D. Hogue and Robert M. Kirby and Akil Narayan
- Abstract summary: We propose a new deep learning architecture based on fast matrix multiplication of a Kronecker product decomposition.
We show that this architecture allows a neural network to be trained and implemented with a significant reduction in computational time and resources.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning using neural networks is an effective technique for generating
models of complex data. However, training such models can be expensive when
networks have large model capacity resulting from a large number of layers and
nodes. For training in such a computationally prohibitive regime,
dimensionality reduction techniques ease the computational burden, and allow
implementations of more robust networks. We propose a novel type of such
dimensionality reduction via a new deep learning architecture based on fast
matrix multiplication of a Kronecker product decomposition; in particular our
network construction can be viewed as a Kronecker product-induced
sparsification of an "extended" fully connected network. Analysis and practical
examples show that this architecture allows a neural network to be trained and
implemented with a significant reduction in computational time and resources,
while achieving a similar error level compared to a traditional feedforward
neural network.
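The "fast matrix multiplication of a Kronecker product decomposition" mentioned in the abstract can be illustrated with a standard linear-algebra identity. The sketch below is not the authors' architecture or code. It only assumes that a large weight matrix factors as A ⊗ B and shows that the product (A ⊗ B) x can then be computed from the small factors without ever forming the large matrix. All function names (`kron`, `kron_matvec`, `dense_matvec`, `param_counts`) are hypothetical, and the code is dependency-free Python kept simple for readability rather than speed.

```python
# Illustrative sketch only: standard Kronecker matrix-vector identity,
# not the implementation from the paper.

def matmul(A, B):
    """Dense product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def kron(A, B):
    """Explicit Kronecker product A (x) B; built here only for checking."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def dense_matvec(M, x):
    """Ordinary matrix-vector product with the full matrix."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def kron_matvec(A, B, x):
    """Compute (A (x) B) x without forming A (x) B.

    With A of shape m x n, B of shape p x q, and x reshaped row-major
    into an n x q matrix X, the result is A X B^T flattened row-major.
    """
    n, q = len(A[0]), len(B[0])
    X = [x[j * q:(j + 1) * q] for j in range(n)]
    Y = matmul(matmul(A, X), transpose(B))
    return [v for row in Y for v in row]

def param_counts(A, B):
    """Return (entries in the full matrix, entries in the two factors)."""
    m, n, p, q = len(A), len(A[0]), len(B), len(B[0])
    return (m * p) * (n * q), m * n + p * q

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
x = [1, 2, 3, 4]

# The factored product agrees with the explicit Kronecker product.
assert kron_matvec(A, B, x) == dense_matvec(kron(A, B), x)
```

For two 2 x 2 factors the full matrix has 16 entries while the factors store 8. The gap widens quickly as the factors grow, which is the kind of saving in storage and arithmetic that the abstract refers to.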
Related papers
- Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
- Riemannian Residual Neural Networks [58.925132597945634]
We show how to extend the residual neural network (ResNet)
ResNets have become ubiquitous in machine learning due to their beneficial learning properties, excellent empirical results, and easy-to-incorporate nature when building varied neural networks.
arXiv Detail & Related papers (2023-10-16T02:12:32Z)
- Homological Neural Networks: A Sparse Architecture for Multivariate Complexity [0.0]
We develop a novel deep neural network unit characterized by a sparse higher-order graphical architecture built over the homological structure of underlying data.
Results demonstrate the advantages of this novel design, which can match or exceed the results of state-of-the-art machine learning and deep learning models using only a fraction of the parameters.
arXiv Detail & Related papers (2023-06-27T09:46:16Z)
- Generalization and Estimation Error Bounds for Model-based Neural Networks [78.88759757988761]
We show that the generalization abilities of model-based networks for sparse recovery outperform those of regular ReLU networks.
We derive practical design rules that allow the construction of model-based networks with guaranteed high generalization.
arXiv Detail & Related papers (2023-04-19T16:39:44Z)
- Adaptive Neural Networks Using Residual Fitting [2.546014024559691]
We present a network-growth method that searches for explainable error in the network's residuals and grows the network if sufficient error is detected.
Within these tasks, the growing network can often achieve better performance than small networks that do not grow.
arXiv Detail & Related papers (2023-01-13T19:52:30Z)
- MAgNET: A Graph U-Net Architecture for Mesh-Based Simulations [0.5185522256407782]
We present MAgNET, which extends the well-known convolutional neural networks to accommodate arbitrary graph-structured data.
We demonstrate the predictive capabilities of MAgNET in surrogate modeling for non-linear finite element simulations in the mechanics of solids.
arXiv Detail & Related papers (2022-11-01T19:23:45Z)
- Efficient Neural Architecture Search with Performance Prediction [0.0]
We use a neural architecture search to find the best network architecture for the task at hand.
Existing NAS algorithms generally evaluate the fitness of a new architecture by fully training from scratch.
An end-to-end offline performance predictor is proposed to accelerate the evaluation of sampled architectures.
arXiv Detail & Related papers (2021-08-04T05:44:16Z)
- Creating Powerful and Interpretable Models with Regression Networks [2.2049183478692584]
We propose a novel architecture, Regression Networks, which combines the power of neural networks with the understandability of regression analysis.
We demonstrate that the models exceed the state-of-the-art performance of interpretable models on several benchmark datasets.
arXiv Detail & Related papers (2021-07-30T03:37:00Z)
- Neural networks adapting to datasets: learning network size and topology [77.34726150561087]
We introduce a flexible setup that allows a neural network to learn both its size and topology during gradient-based training.
The resulting network has the structure of a graph tailored to the particular learning task and dataset.
arXiv Detail & Related papers (2020-06-22T12:46:44Z)
- Large-Scale Gradient-Free Deep Learning with Recursive Local Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources.
Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize.
We propose a neuro-biologically-plausible alternative to backprop that can be used to train deep networks.
arXiv Detail & Related papers (2020-02-10T16:20:02Z)