NeoNeXt: Novel neural network operator and architecture based on the patch-wise matrix multiplications
- URL: http://arxiv.org/abs/2403.11251v1
- Date: Sun, 17 Mar 2024 15:51:21 GMT
- Title: NeoNeXt: Novel neural network operator and architecture based on the patch-wise matrix multiplications
- Authors: Vladimir Korviakov, Denis Koposov
- Abstract summary: We propose a novel foundation operation, NeoCell, which learns matrix patterns and performs patch-wise matrix multiplications with the input data.
The main advantages of the proposed operator are (1) a simple implementation that does not require operations such as im2col, (2) low computational complexity (especially for large matrices) and (3) simple and flexible up-/down-sampling.
We validate the NeoNeXt family of models built on this operation on the ImageNet-1K classification task and show that they achieve competitive quality.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most computer vision architectures today are built upon a few well-known foundation operations: fully-connected layers, convolutions and multi-head self-attention blocks. In this paper we propose a novel foundation operation, NeoCell, which learns matrix patterns and performs patch-wise matrix multiplications with the input data. The main advantages of the proposed operator are (1) a simple implementation that does not require operations such as im2col, (2) low computational complexity (especially for large matrices) and (3) simple and flexible up-/down-sampling. We validate the NeoNeXt family of models built on this operation on the ImageNet-1K classification task and show that they achieve competitive quality.
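The abstract only sketches the operator, so the snippet below is a minimal NumPy illustration of what a patch-wise matrix multiplication with learned left and right matrices could look like. The name `neocell_like`, the single-channel setting, the non-overlapping patches and the matrix shapes are assumptions for illustration, not the authors' exact NeoCell formulation; it does show why choosing non-square matrices gives up-/down-sampling essentially for free.

```python
import numpy as np

def neocell_like(x, A, B, p):
    """Patch-wise matrix multiplication (illustrative sketch only).

    x : (H, W) feature map, H and W divisible by the patch size p.
    A : (q, p) learned left matrix, B : (p, r) learned right matrix.
    Each non-overlapping p x p patch P is mapped to A @ P @ B, so picking
    q, r != p changes the output resolution (up-/down-sampling).
    """
    H, W = x.shape
    q, r = A.shape[0], B.shape[1]
    out = np.empty((H // p * q, W // p * r), dtype=x.dtype)
    for i in range(H // p):
        for j in range(W // p):
            patch = x[i * p:(i + 1) * p, j * p:(j + 1) * p]
            out[i * q:(i + 1) * q, j * r:(j + 1) * r] = A @ patch @ B
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))
A = rng.standard_normal((2, 4))   # halves the patch height: 4 -> 2
B = rng.standard_normal((4, 4))
y = neocell_like(x, A, B, p=4)
print(y.shape)  # (4, 8)
```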
Related papers
- Unified Sparse-Matrix Representations for Diverse Neural Architectures [0.0]
We introduce a unified matrix-order framework that casts convolutional, recurrent and self-attention operations as sparse matrix multiplications.
This work establishes a mathematically rigorous substrate for diverse neural architectures and opens avenues for principled, hardware-aware network design.
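As background for "casting convolutions as sparse matrix multiplications" (a generic illustration, not the framework from that paper), the sketch below builds the banded, mostly-zero matrix whose product with a signal reproduces a valid 1-D correlation.

```python
import numpy as np

def conv1d_as_matrix(kernel, n):
    """Return the (n - k + 1) x n banded matrix T such that T @ x equals the
    'valid' 1-D correlation of x with `kernel`; most entries are zero."""
    k = len(kernel)
    T = np.zeros((n - k + 1, n))
    for i in range(n - k + 1):
        T[i, i:i + k] = kernel
    return T

x = np.arange(6.0)
w = np.array([2.0, -1.0, 0.5])
T = conv1d_as_matrix(w, len(x))
# np.convolve with the flipped kernel performs correlation with w.
assert np.allclose(T @ x, np.convolve(x, w[::-1], mode="valid"))
```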
arXiv Detail & Related papers (2025-05-11T06:26:34Z)
- Layer-Specific Optimization: Sensitivity Based Convolution Layers Basis Search [0.0]
We propose a new way of applying matrix decomposition to the weights of convolutional layers.
The essence of the method is to train only a subset of the convolutions (basis convolutions) and represent the rest as linear combinations of the basis ones.
Experiments on models from the ResNet family and the CIFAR-10 dataset demonstrate that basis convolutions can not only reduce the size of the model but also accelerate the forward and backward passes of the network.
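A toy NumPy sketch of the parameter-sharing idea as described in the summary; the shapes, the number of basis filters and the reconstruction via `np.tensordot` are illustrative assumptions, not that paper's exact training scheme.

```python
import numpy as np

# Hypothetical shapes: 64 filters of size 16x3x3 rebuilt from 8 basis filters.
rng = np.random.default_rng(0)
n_filters, n_basis, filt_shape = 64, 8, (16, 3, 3)

basis = rng.standard_normal((n_basis, *filt_shape))   # trainable basis convolutions
coeffs = rng.standard_normal((n_filters, n_basis))    # trainable mixing coefficients

# Every filter is a linear combination of the basis filters, so only the
# basis and the coefficients need to be stored and trained.
weights = np.tensordot(coeffs, basis, axes=([1], [0]))  # (64, 16, 3, 3)

dense_params = n_filters * np.prod(filt_shape)
basis_params = n_basis * np.prod(filt_shape) + n_filters * n_basis
print(weights.shape, dense_params, basis_params)  # (64, 16, 3, 3) 9216 1664
```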
arXiv Detail & Related papers (2024-08-12T09:24:48Z)
- Compute Better Spent: Replacing Dense Layers with Structured Matrices [77.61728033234233]
We identify more efficient alternatives to dense matrices, as exemplified by the success of convolutional networks in the image domain.
We show that different structures often require drastically different initialization scales and learning rates, which are crucial to performance.
We propose a novel matrix family, the Block-Train, which contains Monarch matrices and which we show performs better than dense matrices for the same compute on multiple tasks.
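The Block-Train construction itself is not spelled out in the summary; the hedged sketch below only shows the simplest structured building block such families combine, a block-diagonal matrix, and why it beats a dense matrix per unit of compute. All names and shapes are illustrative.

```python
import numpy as np

def block_diag_matvec(blocks, x):
    """Multiply by a block-diagonal matrix without materialising it.

    blocks : (b, m, m) stack of b dense m x m blocks.
    x      : (b * m,) input vector.
    A dense (b*m) x (b*m) matrix costs (b*m)^2 multiplications;
    the block-diagonal one costs b * m^2, i.e. b times fewer.
    """
    b, m, _ = blocks.shape
    return np.einsum("bij,bj->bi", blocks, x.reshape(b, m)).reshape(-1)

rng = np.random.default_rng(0)
blocks = rng.standard_normal((4, 8, 8))   # 4 blocks of size 8x8
x = rng.standard_normal(32)
y = block_diag_matvec(blocks, x)

# Same result via the explicit dense matrix, just more compute and memory.
dense = np.zeros((32, 32))
for i, B in enumerate(blocks):
    dense[i * 8:(i + 1) * 8, i * 8:(i + 1) * 8] = B
assert np.allclose(y, dense @ x)
```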
arXiv Detail & Related papers (2024-06-10T13:25:43Z)
- Multilinear Operator Networks [60.7432588386185]
Polynomial Networks are a class of models that do not require activation functions.
We propose MONet, which relies solely on multilinear operators.
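The summary does not specify the MONet block, so the following is a hedged sketch of the general principle of a multilinear layer: nonlinearity comes from an element-wise product of linear projections rather than from an activation function. The helper `multilinear_block` and its shapes are assumptions.

```python
import numpy as np

def multilinear_block(x, U1, U2, V):
    """Degree-2 multilinear interaction with no activation function:
    two linear maps of x are combined with a Hadamard product and
    projected back, so the nonlinearity comes from the product itself."""
    return V @ ((U1 @ x) * (U2 @ x))

rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 16, 32, 16
U1 = rng.standard_normal((d_hidden, d_in))
U2 = rng.standard_normal((d_hidden, d_in))
V = rng.standard_normal((d_out, d_hidden))
x = rng.standard_normal(d_in)
print(multilinear_block(x, U1, U2, V).shape)  # (16,)
```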
arXiv Detail & Related papers (2024-01-31T16:52:19Z)
- The Lattice Overparametrization Paradigm for the Machine Learning of Lattice Operators [0.0]
We discuss a learning paradigm in which a class of operators is overparametrized via elements of a lattice, so that an algorithm for minimizing functions over a lattice can be applied to learn them.
This learning paradigm has three properties that modern methods based on neural networks lack: control, transparency and interpretability.
arXiv Detail & Related papers (2023-10-10T14:00:03Z)
- Low-complexity Approximate Convolutional Neural Networks [1.7368964547487395]
We present an approach for minimizing the computational complexity of trained Convolutional Neural Networks (ConvNets).
The idea is to approximate all elements of a given ConvNet with efficient approximations capable of extreme reductions in computational complexity.
Such low-complexity structures pave the way for low-power, efficient hardware designs.
arXiv Detail & Related papers (2022-07-29T21:59:29Z)
- Graph Kernel Neural Networks [53.91024360329517]
We propose to use graph kernels, i.e. kernel functions that compute an inner product on graphs, to extend the standard convolution operator to the graph domain.
This allows us to define an entirely structural model that does not require computing the embedding of the input graph.
Our architecture allows plugging in any type of graph kernel and has the added benefit of providing some interpretability.
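To make "kernel functions that compute an inner product on graphs" concrete, here is a minimal example using a vertex-label histogram kernel; it illustrates the notion of a graph kernel only, not the architecture proposed in that paper.

```python
import numpy as np

def label_histogram(labels, n_labels):
    """Feature map phi(G): counts of each discrete vertex label."""
    hist = np.zeros(n_labels)
    for l in labels:
        hist[l] += 1
    return hist

def vertex_histogram_kernel(labels_a, labels_b, n_labels):
    """A very simple graph kernel: the inner product of the two graphs'
    vertex-label histograms. Richer kernels (random walk,
    Weisfeiler-Lehman, ...) follow the same inner-product pattern."""
    return label_histogram(labels_a, n_labels) @ label_histogram(labels_b, n_labels)

# Two toy graphs described only by their vertex labels (0, 1 or 2).
g1 = [0, 0, 1, 2]
g2 = [0, 1, 1, 1, 2]
print(vertex_histogram_kernel(g1, g2, n_labels=3))  # 2*1 + 1*3 + 1*1 = 6.0
```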
arXiv Detail & Related papers (2021-12-14T14:48:08Z)
- What if Neural Networks had SVDs? [66.91160214071088]
Various Neural Networks employ time-consuming matrix operations like matrix inversion.
We present an algorithm that is fast enough to speed up several matrix operations.
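The paper's fast algorithm is not described in the summary; the plain NumPy sketch below only illustrates why keeping a weight matrix in SVD form is attractive: inversion reduces to transposes and a reciprocal of the singular values.

```python
import numpy as np

# A weight matrix kept in SVD form W = U @ diag(s) @ Vt makes otherwise
# expensive operations trivial: the inverse is Vt.T @ diag(1/s) @ U.T and
# the determinant's magnitude is the product of the singular values.
rng = np.random.default_rng(0)
W = rng.standard_normal((6, 6))
U, s, Vt = np.linalg.svd(W)

W_inv = Vt.T @ np.diag(1.0 / s) @ U.T
assert np.allclose(W_inv, np.linalg.inv(W))
```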
arXiv Detail & Related papers (2020-09-29T12:58:52Z)
- Matrix Shuffle-Exchange Networks for Hard 2D Tasks [2.4493299476776778]
The Matrix Shuffle-Exchange network can efficiently exploit long-range dependencies in 2D data.
It has comparable speed to a convolutional neural network.
arXiv Detail & Related papers (2020-06-29T09:38:54Z)
- Deep Polynomial Neural Networks [77.70761658507507]
$\Pi$-Nets are a new class of function approximators based on polynomial expansions.
$\Pi$-Nets produce state-of-the-art results in three challenging tasks, i.e. image generation, face verification and 3D mesh representation learning.
arXiv Detail & Related papers (2020-06-20T16:23:32Z)
- Binarizing MobileNet via Evolution-based Searching [66.94247681870125]
We propose using evolutionary search to facilitate the construction and training scheme when binarizing MobileNet.
Inspired by one-shot architecture search frameworks, we manipulate the idea of group convolution to design efficient 1-Bit Convolutional Neural Networks (CNNs).
Our objective is to come up with a tiny yet efficient binary neural architecture by exploring the best candidates of the group convolution.
arXiv Detail & Related papers (2020-05-13T13:25:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.