Canvas: End-to-End Kernel Architecture Search in Neural Networks
- URL: http://arxiv.org/abs/2304.07741v2
- Date: Tue, 18 Apr 2023 06:02:01 GMT
- Title: Canvas: End-to-End Kernel Architecture Search in Neural Networks
- Authors: Chenggang Zhao, Genghan Zhang, Mingyu Gao
- Abstract summary: We build an end-to-end framework, Canvas, to find high-quality kernels as convolution replacements.
We show that Canvas achieves an average 1.5x speedup over the previous state of the art with acceptable accuracy loss and search efficiency.
- Score: 1.1612831901050744
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The demands for higher performance and accuracy in neural networks (NNs)
never end. Existing tensor compilation and Neural Architecture Search (NAS)
techniques orthogonally optimize the two goals but actually share many
similarities in their concrete strategies. We exploit such opportunities by
combining the two into one and make a case for Kernel Architecture Search
(KAS). KAS reviews NAS from a system perspective and zooms into a more
fine-grained level to generate neural kernels with both high performance and
good accuracy. To demonstrate the potential of KAS, we build an end-to-end
framework, Canvas, to find high-quality kernels as convolution replacements.
Canvas samples from a rich set of fine-grained primitives to stochastically and
iteratively construct new kernels and evaluate them according to user-specified
constraints. Canvas supports freely adjustable tensor dimension sizes inside
the kernel and uses two levels of solvers to satisfy structural legality and
fully utilize model budgets. The evaluation shows that by replacing standard
convolutions with generated new kernels in common NNs, Canvas achieves an average
1.5x speedup over the previous state of the art with acceptable
accuracy loss and search efficiency. Canvas verifies the practicability of KAS
by rediscovering many manually designed kernels in the past and producing new
structures that may inspire future machine learning innovations. For source
code and implementation, we open-sourced Canvas at
https://github.com/tsinghua-ideal/Canvas.
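As a rough illustration of the KAS idea in the abstract, the following is a minimal, hypothetical Python sketch of a stochastic primitive-sampling search loop with a parameter-budget check. The primitive names, cost numbers, and scoring function are invented for illustration only and do not reflect Canvas's actual primitives, solvers, or API.

```python
# Hypothetical sketch of a KAS-style random search loop (not Canvas's actual code).
# A "kernel" is a random sequence of primitive names; a crude parameter-count
# estimate stands in for Canvas's solver-based budget checks, and a random
# score stands in for accuracy/latency evaluation of the rewritten network.
import random

PRIMITIVES = ["fc", "group_conv_3x3", "shift", "avg_pool", "channel_mix"]  # illustrative only

def estimate_params(kernel, c_in=64, c_out=64):
    """Very rough per-primitive parameter estimates (made-up numbers)."""
    cost = {"fc": c_in * c_out, "group_conv_3x3": 9 * c_out, "shift": 0,
            "avg_pool": 0, "channel_mix": c_in * c_out // 4}
    return sum(cost[p] for p in kernel)

def sample_kernel(max_ops=4):
    """Stochastically compose primitives into a candidate kernel."""
    return [random.choice(PRIMITIVES) for _ in range(random.randint(1, max_ops))]

def evaluate(kernel):
    """Placeholder for retraining and measuring the rewritten network."""
    return random.random()

def search(budget_params=10_000, rounds=200):
    best, best_score = None, float("-inf")
    for _ in range(rounds):
        kernel = sample_kernel()
        if estimate_params(kernel) > budget_params:  # user-specified budget constraint
            continue
        score = evaluate(kernel)
        if score > best_score:
            best, best_score = kernel, score
    return best, best_score

if __name__ == "__main__":
    print(search())
```

In Canvas itself, the budget check is handled by two levels of solvers over freely adjustable tensor dimensions, and candidates are evaluated by retraining and measuring the rewritten network rather than by a random score.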
Related papers
- Accelerating Machine Learning Primitives on Commodity Hardware [0.0]
We present an extensive study of the Sliding Window convolution technique as a more efficient alternative to the commonly used General Matrix Multiplication (GEMM) based convolution in Deep Neural Networks (DNNs).
Our results suggest that the Sliding Window computation kernels can outperform GEMM-based convolution on a CPU and even on dedicated hardware accelerators.
This could promote a wider adoption of AI on low-power and low-memory devices without the need for specialized hardware. (A minimal NumPy sketch contrasting sliding-window and im2col+GEMM convolution appears after this list.)
arXiv Detail & Related papers (2023-10-08T16:26:18Z)
- Compacting Binary Neural Networks by Sparse Kernel Selection [58.84313343190488]
This paper is motivated by a previously revealed phenomenon that the binary kernels in successful BNNs are nearly power-law distributed.
We develop the Permutation Straight-Through Estimator (PSTE) that is able to not only optimize the selection process end-to-end but also maintain the non-repetitive occupancy of selected codewords.
Experiments verify that our method reduces both the model size and bit-wise computational costs, and achieves accuracy improvements compared with state-of-the-art BNNs under comparable budgets.
arXiv Detail & Related papers (2023-03-25T13:53:02Z)
- Incorporating Prior Knowledge into Neural Networks through an Implicit Composite Kernel [1.6383321867266318]
Implicit Composite Kernel (ICK) is a kernel that combines a kernel implicitly defined by a neural network with a second kernel function chosen to model known properties.
We demonstrate ICK's superior performance and flexibility on both synthetic and real-world data sets. (A hedged sketch of combining an NN-induced kernel with an RBF kernel appears after this list.)
arXiv Detail & Related papers (2022-05-15T21:32:44Z)
- Sub-bit Neural Networks: Learning to Compress and Accelerate Binary Neural Networks [72.81092567651395]
Sub-bit Neural Networks (SNNs) are a new type of binary quantization design tailored to compress and accelerate BNNs.
SNNs are trained with a kernel-aware optimization framework, which exploits binary quantization in the fine-grained convolutional kernel space.
Experiments on visual recognition benchmarks and hardware deployment on FPGA validate the great potential of SNNs.
arXiv Detail & Related papers (2021-10-18T11:30:29Z)
- FlexConv: Continuous Kernel Convolutions with Differentiable Kernel Sizes [34.90912459206022]
Recent works show CNNs benefit from different kernel sizes at different layers, but exploring all possible combinations is infeasible in practice.
We propose FlexConv, a novel convolutional operation with which high-bandwidth convolutional kernels of learnable size can be learned at a fixed parameter cost.
arXiv Detail & Related papers (2021-10-15T12:35:49Z)
- Random Features for the Neural Tangent Kernel [57.132634274795066]
We propose an efficient feature map construction for the Neural Tangent Kernel (NTK) of a fully-connected ReLU network.
We show that the dimension of the resulting features is much smaller than that of other baseline feature map constructions achieving comparable error bounds, both in theory and in practice.
arXiv Detail & Related papers (2021-04-03T09:08:12Z)
- Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch [75.69506249886622]
Sparsity in Deep Neural Networks (DNNs) has been widely studied to compress and accelerate models in resource-constrained environments.
In this paper, we are the first to study training an N:M fine-grained structured sparse network from scratch. (A small NumPy sketch of the 2:4 sparsity pattern appears after this list.)
arXiv Detail & Related papers (2021-02-08T05:55:47Z)
- Finite Versus Infinite Neural Networks: an Empirical Study [69.07049353209463]
Kernel methods outperform fully-connected finite-width networks.
Centered and ensembled finite networks have reduced posterior variance.
Weight decay and the use of a large learning rate break the correspondence between finite and infinite networks.
arXiv Detail & Related papers (2020-07-31T01:57:47Z)
- Neural Kernels Without Tangents [34.527798084824575]
We present an algebra for creating "compositional" kernels from bags of features.
We show that these operations correspond to many of the building blocks of "neural tangent kernels" (NTKs).
arXiv Detail & Related papers (2020-03-04T18:25:41Z)
- PolyScientist: Automatic Loop Transformations Combined with Microkernels for Optimization of Deep Learning Primitives [55.79741270235602]
We develop a hybrid approach to developing deep learning kernels.
We use advanced polyhedral technology to automatically tune the outer loops for performance.
arXiv Detail & Related papers (2020-02-06T08:02:34Z)
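For the "Accelerating Machine Learning Primitives on Commodity Hardware" entry above, the following is a generic toy comparison of the two convolution lowerings it discusses: a direct sliding-window loop and im2col followed by a single GEMM. Shapes, loop structure, and names are assumptions for illustration, not the paper's optimized kernels.

```python
# Toy NumPy comparison of direct sliding-window convolution and im2col + GEMM.
import numpy as np

def conv_sliding_window(x, w):
    """Direct convolution: x is (H, W, C_in), w is (K, K, C_in, C_out)."""
    H, W, C_in = x.shape
    K, _, _, C_out = w.shape
    out = np.zeros((H - K + 1, W - K + 1, C_out))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = x[i:i + K, j:j + K, :]            # K x K x C_in window
            out[i, j] = np.tensordot(patch, w, axes=3)
    return out

def conv_im2col_gemm(x, w):
    """im2col lowering followed by one matrix multiplication (the GEMM)."""
    H, W, C_in = x.shape
    K, _, _, C_out = w.shape
    rows = [x[i:i + K, j:j + K, :].reshape(-1)
            for i in range(H - K + 1) for j in range(W - K + 1)]
    cols = np.stack(rows)                              # (H'*W', K*K*C_in)
    out = cols @ w.reshape(-1, C_out)                  # the GEMM
    return out.reshape(H - K + 1, W - K + 1, C_out)

x = np.random.rand(8, 8, 3)
w = np.random.rand(3, 3, 3, 4)
assert np.allclose(conv_sliding_window(x, w), conv_im2col_gemm(x, w))
```

The direct version avoids materializing the large im2col buffer, which is the memory argument made for sliding-window kernels on low-memory devices.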
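For the Implicit Composite Kernel entry, here is a hedged sketch of the general idea of combining a kernel induced by a neural feature map with a chosen analytic kernel. The tiny fixed random network, the RBF choice, and the sum combination rule are illustrative assumptions, not the paper's construction.

```python
# Sketch: combine an NN-induced kernel with an RBF kernel (assumed sum rule).
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(16, 1)), rng.normal(size=(8, 16))

def nn_features(x):
    """Tiny fixed random network standing in for a learned feature extractor."""
    h = np.maximum(0.0, x @ W1.T)                 # ReLU layer
    return h @ W2.T / np.sqrt(W2.shape[0])

def k_implicit(x, y):
    return nn_features(x) @ nn_features(y).T      # inner product of NN features

def k_rbf(x, y, lengthscale=1.0):
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * lengthscale ** 2))

def k_composite(x, y):
    return k_implicit(x, y) + k_rbf(x, y)         # one possible combination rule

x = np.linspace(-1, 1, 5).reshape(-1, 1)
print(k_composite(x, x).shape)                    # (5, 5) Gram matrix
```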
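For the N:M structured sparsity entry, the sketch below shows only the 2:4 sparsity pattern (keep the two largest-magnitude weights in every group of four along the input dimension); the paper's actual contribution, training such networks from scratch, is not modeled here.

```python
# Illustrative N:M (here 2:4) fine-grained structured pruning of a weight matrix.
import numpy as np

def prune_n_m(weight, n=2, m=4):
    out_dim, in_dim = weight.shape
    assert in_dim % m == 0
    w = weight.reshape(out_dim, in_dim // m, m)
    # indices of the (m - n) smallest-magnitude weights in each group of m
    drop = np.argsort(np.abs(w), axis=-1)[..., : m - n]
    mask = np.ones_like(w, dtype=bool)
    np.put_along_axis(mask, drop, False, axis=-1)
    return (w * mask).reshape(out_dim, in_dim), mask.reshape(out_dim, in_dim)

w = np.random.randn(4, 8)
pruned, mask = prune_n_m(w)
assert mask.reshape(4, 2, 4).sum(axis=-1).max() == 2   # exactly 2 kept per group of 4
```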