Hardware Implementation of Hyperbolic Tangent Function using Catmull-Rom
Spline Interpolation
- URL: http://arxiv.org/abs/2007.13516v1
- Date: Mon, 13 Jul 2020 07:11:59 GMT
- Title: Hardware Implementation of Hyperbolic Tangent Function using Catmull-Rom
Spline Interpolation
- Authors: Mahesh Chandra
- Abstract summary: Deep neural networks yield state-of-the-art results in many computer vision and human-machine interface tasks such as object recognition and speech recognition.
Since these networks are computationally expensive, customized accelerators are designed to achieve the required performance at lower cost and power.
- Score: 5.429955391775968
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks yield state-of-the-art results in many computer
vision and human-machine interface tasks such as object recognition and speech
recognition. Since these networks are computationally expensive, customized
accelerators are designed to achieve the required performance at lower cost and
power. One of the key building blocks of these neural networks is the
non-linear activation function, such as sigmoid, hyperbolic tangent (tanh), and
ReLU. A low-complexity, accurate hardware implementation of the activation
function is required to meet the performance and area targets of neural network
accelerators. This paper presents an implementation of the tanh function using
Catmull-Rom spline interpolation. State-of-the-art results are achieved with
this method using comparatively smaller logic area.
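As a rough illustration of the approach described in the abstract, the sketch below approximates tanh with uniform Catmull-Rom spline segments evaluated from a small knot table, exploiting the odd symmetry of tanh and saturating outside the interpolation range. The interval, knot spacing, and table size are illustrative assumptions, not values taken from the paper; in a hardware realization the knot table would typically sit in a small ROM/LUT and the arithmetic would be fixed-point.

```python
import numpy as np

# Illustrative parameters (not from the paper): interpolate on [0, X_MAX],
# use odd symmetry for x < 0, and saturate to +/-1 beyond X_MAX.
X_MAX = 4.0
N_SEG = 16                      # number of spline segments
STEP = X_MAX / N_SEG
# One extra knot on each side so every segment has four control points.
xs = np.arange(-1, N_SEG + 3) * STEP
knots = np.tanh(xs)             # control-point table (a small ROM/LUT in hardware)

def tanh_catmull_rom(x: float) -> float:
    """Approximate tanh(x) with uniform Catmull-Rom spline interpolation."""
    sign = -1.0 if x < 0 else 1.0
    x = abs(x)
    if x >= X_MAX:              # saturation region
        return sign * 1.0
    seg = int(x / STEP)         # segment index
    t = x / STEP - seg          # local parameter in [0, 1)
    p0, p1, p2, p3 = knots[seg:seg + 4]
    # Standard Catmull-Rom basis (passes through p1 at t=0 and p2 at t=1)
    y = 0.5 * (2.0 * p1
               + (-p0 + p2) * t
               + (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3) * t * t
               + (-p0 + 3.0 * p1 - 3.0 * p2 + p3) * t * t * t)
    return sign * y

# Quick accuracy check against the reference tanh
grid = np.linspace(-6.0, 6.0, 10001)
err = np.max(np.abs(np.tanh(grid) - np.vectorize(tanh_catmull_rom)(grid)))
print(f"max abs error: {err:.2e}")
```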
Related papers
- Globally Optimal Training of Neural Networks with Threshold Activation
Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z) - Exploring the Approximation Capabilities of Multiplicative Neural
Networks for Smooth Functions [9.936974568429173]
We consider two classes of target functions: generalized bandlimited functions and Sobolev-type balls.
Our results demonstrate that multiplicative neural networks can approximate these functions with significantly fewer layers and neurons.
These findings suggest that multiplicative gates can outperform standard feed-forward layers and have potential for improving neural network design.
arXiv Detail & Related papers (2023-01-11T17:57:33Z) - NAF: Neural Attenuation Fields for Sparse-View CBCT Reconstruction [79.13750275141139]
This paper proposes a novel and fast self-supervised solution for sparse-view CBCT reconstruction.
The desired attenuation coefficients are represented as a continuous function of 3D spatial coordinates, parameterized by a fully-connected deep neural network.
A learning-based encoder entailing hash coding is adopted to help the network capture high-frequency details.
arXiv Detail & Related papers (2022-09-29T04:06:00Z) - Hardware Accelerator and Neural Network Co-Optimization for
Ultra-Low-Power Audio Processing Devices [0.0]
HANNAH is a framework for automated and combined hardware/software co-design of deep neural networks and hardware accelerators.
We show that HANNAH can find suitable neural networks with minimized power consumption and high accuracy for different audio classification tasks.
arXiv Detail & Related papers (2022-09-08T13:29:09Z) - Inducing Gaussian Process Networks [80.40892394020797]
We propose inducing Gaussian process networks (IGN), a simple framework for simultaneously learning the feature space as well as the inducing points.
The inducing points, in particular, are learned directly in the feature space, enabling a seamless representation of complex structured domains.
We report on experimental results for real-world data sets showing that IGNs provide significant advances over state-of-the-art methods.
arXiv Detail & Related papers (2022-04-21T05:27:09Z) - ZippyPoint: Fast Interest Point Detection, Description, and Matching
through Mixed Precision Discretization [71.91942002659795]
We investigate and adapt network quantization techniques to accelerate inference and enable its use on compute-limited platforms.
ZippyPoint, our efficient quantized network with binary descriptors, improves the network runtime speed, the descriptor matching speed, and the 3D model size.
These improvements come at a minor performance degradation as evaluated on the tasks of homography estimation, visual localization, and map-free visual relocalization.
arXiv Detail & Related papers (2022-03-07T18:59:03Z) - FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using around 40% of the available hardware resources in total.
It reduces the classification time by three orders of magnitude, with a small 4.5% impact on accuracy, compared to its full-precision software counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z) - Learning Frequency-aware Dynamic Network for Efficient Super-Resolution [56.98668484450857]
This paper explores a novel frequency-aware dynamic network that divides the input into multiple parts according to its coefficients in the discrete cosine transform (DCT) domain.
In practice, the high-frequency part is processed with expensive operations and the lower-frequency part is assigned cheap operations to relieve the computation burden; a minimal sketch of such a DCT-domain split appears after this list.
Experiments conducted on benchmark SISR models and datasets show that the frequency-aware dynamic network can be employed for various SISR neural architectures.
arXiv Detail & Related papers (2021-03-15T12:54:26Z) - ItNet: iterative neural networks with small graphs for accurate and
efficient anytime prediction [1.52292571922932]
In this study, we introduce a class of network models that have a small memory footprint in terms of their computational graphs.
We show state-of-the-art results for semantic segmentation on the CamVid and Cityscapes datasets.
arXiv Detail & Related papers (2021-01-21T15:56:29Z) - Robust error bounds for quantised and pruned neural networks [1.8083503268672914]
Machine learning algorithms are moving towards decentralisation with the data and algorithms stored, and even trained, locally on devices.
The device hardware becomes the main bottleneck for model capability in this set-up, creating a need for slimmed down, more efficient neural networks.
A semi-definite program is introduced to bound the worst-case error caused by pruning or quantising a neural network.
It is hoped that the computed bounds will provide certainty about the performance of these algorithms when deployed on safety-critical systems.
arXiv Detail & Related papers (2020-11-30T22:19:44Z) - Comparative Analysis of Polynomial and Rational Approximations of
Hyperbolic Tangent Function for VLSI Implementation [5.429955391775968]
Deep neural networks yield state-of-the-art results in many computer vision and human-machine interface applications such as object detection and speech recognition.
Since these networks are computationally expensive, customized accelerators are designed to achieve the required performance at lower cost and power.
arXiv Detail & Related papers (2020-07-13T07:31:02Z)
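Relating to the last entry above (polynomial vs. rational approximations of tanh for VLSI), here is a minimal sketch contrasting a least-squares polynomial fit with the [3/2] Padé rational approximant of tanh on a fixed interval. The degree, the interval, and the fitting method are illustrative assumptions, not the paper's exact configurations.

```python
import numpy as np

x = np.linspace(-2.0, 2.0, 4001)
ref = np.tanh(x)

# Degree-5 least-squares polynomial fit (tanh is odd, so even terms fit near zero)
poly = np.polynomial.Polynomial.fit(x, ref, deg=5)
poly_err = np.max(np.abs(poly(x) - ref))

# [3/2] Pade approximant of tanh about 0: x*(15 + x^2) / (15 + 6*x^2)
rat = x * (15.0 + x * x) / (15.0 + 6.0 * x * x)
rat_err = np.max(np.abs(rat - ref))

print(f"polynomial (deg 5) max abs error on [-2, 2]: {poly_err:.2e}")
print(f"rational [3/2]     max abs error on [-2, 2]: {rat_err:.2e}")
```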
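Relating to the frequency-aware dynamic network entry above, the sketch below splits an image patch into low- and high-frequency parts in the DCT domain, where the low-frequency part would be routed to a cheap path and the high-frequency part to an expensive path. The patch size, the coefficient threshold, and the routing are illustrative assumptions, not the paper's design.

```python
import numpy as np
from scipy.fft import dctn, idctn

def split_by_frequency(patch: np.ndarray, keep: int = 4):
    """Split a patch into low- and high-frequency components via the 2D DCT."""
    coeffs = dctn(patch, norm="ortho")
    low = np.zeros_like(coeffs)
    low[:keep, :keep] = coeffs[:keep, :keep]   # top-left block = low frequencies
    high = coeffs - low
    return idctn(low, norm="ortho"), idctn(high, norm="ortho")

patch = np.random.default_rng(0).random((8, 8)).astype(np.float32)
low_part, high_part = split_by_frequency(patch)
# The cheap path would process low_part and the expensive path high_part;
# their sum reconstructs the original patch up to floating-point error.
print(np.allclose(low_part + high_part, patch, atol=1e-5))
```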
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.