Related papers: PECAN: A Product-Quantized Content Addressable Memory Network

URL: http://arxiv.org/abs/2208.13571v1
Date: Sat, 13 Aug 2022 08:33:56 GMT
Title: PECAN: A Product-Quantized Content Addressable Memory Network
Authors: Jie Ran, Rui Lin, Jason Chun Lok Li, Jiajun Zhou, Ngai Wong
Abstract summary: The filtering and linear transform are realized solely with product quantization (PQ). This results in a natural implementation via content addressable memory (CAM). Experiments confirm the feasibility of such a Product-Quantized Content Addressable Memory Network (PECAN).
Score: 6.530758154165138
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: A novel deep neural network (DNN) architecture is proposed wherein the filtering and linear transform are realized solely with product quantization (PQ). This results in a natural implementation via content addressable memory (CAM), which transcends regular DNN layer operations and requires only simple table lookup. Two schemes are developed for the end-to-end PQ prototype training, namely, through angle- and distance-based similarities, which differ in their multiplicative and additive natures with different complexity-accuracy tradeoffs. Moreover, the distance-based scheme constitutes a truly multiplier-free DNN solution. Experiments confirm the feasibility of such a Product-Quantized Content Addressable Memory Network (PECAN), which has strong implications for hardware-efficient deployments, especially for in-memory computing.
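
Illustration: the table-lookup idea in the abstract can be sketched in a few lines of NumPy. This is a minimal sketch of generic product quantization applied to a linear layer, not the authors' implementation. The function names (`build_tables`, `pq_linear`), the shapes, and the choice of L1 distance for the distance-based matching are assumptions made for illustration. The input is split into groups of sub-vectors, each sub-vector is matched to its nearest prototype, and the layer output is the sum of precomputed table entries.

```python
import numpy as np

def build_tables(prototypes, weight):
    """Precompute prototype-times-weight products (done once, offline).

    prototypes: (G, K, d) array, K prototypes for each of G sub-vector groups.
    weight:     (G * d, out) array, an ordinary dense layer weight.
    Returns a (G, K, out) lookup table.
    """
    G, K, d = prototypes.shape
    w = weight.reshape(G, d, -1)                      # rows of each group
    return np.einsum("gkd,gdo->gko", prototypes, w)

def pq_linear(x, prototypes, tables):
    """Approximate x @ weight using only distance matching and lookups.

    Assumption: L1 distance stands in for the distance-based similarity,
    so inference needs additions and comparisons but no multiplications.
    """
    G, K, d = prototypes.shape
    sub = x.reshape(G, 1, d)                          # split input into groups
    dist = np.abs(sub - prototypes).sum(axis=2)       # (G, K) L1 distances
    idx = dist.argmin(axis=1)                         # nearest prototype per group
    return tables[np.arange(G), idx].sum(axis=0)      # sum the looked-up rows

# Hypothetical sizes: 3 groups, 4 prototypes each, sub-vector length 2, 5 outputs.
rng = np.random.default_rng(0)
prototypes = rng.normal(size=(3, 4, 2))
weight = rng.normal(size=(6, 5))
tables = build_tables(prototypes, weight)
x = rng.normal(size=6)
approx = pq_linear(x, prototypes, tables)             # approximates x @ weight
```

When the input happens to be built exactly from prototypes, the lookup result equals the dense product; otherwise the error depends on how well the prototypes cover the input distribution.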

Related papers

Input Specific Neural Networks [0.0]
The black-box nature of neural networks limits the ability to encode or impose specific relationships between inputs and outputs. This paper presents two Input Specific Neural Networks (ISNNs), along with equations for the first and second derivatives with respect to the inputs. We show how ISNNs can be used to learn structural relationships between inputs and outputs via a binary gating mechanism.
arXiv Detail & Related papers (2025-03-01T00:57:16Z)
Unifying Dimensions: A Linear Adaptive Approach to Lightweight Image Super-Resolution [6.857919231112562]
Window-based transformers have demonstrated outstanding performance in super-resolution tasks. However, they exhibit higher computational complexity and inference latency than convolutional neural networks. We construct a convolution-based Transformer framework named the linear adaptive mixer network (LAMNet).
arXiv Detail & Related papers (2024-09-26T07:24:09Z)
Algorithm-Hardware Co-Design of Distribution-Aware Logarithmic-Posit Encodings for Efficient DNN Inference [4.093167352780157]
We introduce Logarithmic Posits (LP), an adaptive, hardware-friendly data type inspired by posits. We also develop a novel genetic-algorithm based framework, LP Quantization (LPQ), to find optimal layer-wise LP parameters.
arXiv Detail & Related papers (2024-03-08T17:28:49Z)
Heterogenous Memory Augmented Neural Networks [84.29338268789684]
We introduce a novel heterogeneous memory augmentation approach for neural networks. By introducing learnable memory tokens with an attention mechanism, we can effectively boost performance without huge computational overhead. We demonstrate our approach on various image and graph-based tasks under both in-distribution (ID) and out-of-distribution (OOD) conditions.
arXiv Detail & Related papers (2023-10-17T01:05:28Z)
Training Integer-Only Deep Recurrent Neural Networks [3.1829446824051195]
We present a quantization-aware training method for obtaining a highly accurate integer-only recurrent neural network (iRNN). Our approach supports layer normalization, attention, and an adaptive piecewise linear (PWL) approximation of activation functions. The proposed method enables RNN-based language models to run on edge devices with a $2\times$ improvement in runtime.
arXiv Detail & Related papers (2022-12-22T15:22:36Z)
An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculation restrict execution efficiency, especially for Internet of Things (IoT) devices. We propose a new Deep Reinforcement Learning (DRL)-Soft Actor Critic for discrete (SAC-d), which generates the exit point and compressing bits by soft policy iterations. Based on the latency and accuracy aware reward design, such a computation can adapt well to complex environments like dynamic wireless channels and arbitrary processing, and is capable of supporting the 5G URL
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
Quantized Neural Networks via {-1, +1} Encoding Decomposition and Acceleration [83.84684675841167]
We propose a novel encoding scheme using -1, +1 to decompose quantized neural networks (QNNs) into multi-branch binary networks. We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z)
Dynamic Clone Transformer for Efficient Convolutional Neural Networks [0.0]
We introduce a novel concept termed multi-path fully connected pattern (MPFC) to rethink the interdependencies of topology pattern, accuracy and efficiency for ConvNets. Inspired by MPFC, we propose a dual-branch module named dynamic clone transformer (DCT), where one branch generates multiple replicas from the inputs and another branch reforms those clones through a series of difference vectors conditional on the inputs themselves to produce more variants.
arXiv Detail & Related papers (2021-06-12T13:42:28Z)
Joint Deep Reinforcement Learning and Unfolding: Beam Selection and Precoding for mmWave Multiuser MIMO with Lens Arrays [54.43962058166702]
Millimeter wave (mmWave) multiuser multiple-input multiple-output (MU-MIMO) systems with discrete lens arrays (DLA) have received great attention. In this work, we investigate the joint design of a beam precoding matrix for mmWave MU-MIMO systems with DLA.
arXiv Detail & Related papers (2021-01-05T03:55:04Z)
Encoding the latent posterior of Bayesian Neural Networks for uncertainty quantification [10.727102755903616]
We aim for efficient deep BNNs amenable to complex computer vision architectures. We achieve this by leveraging variational autoencoders (VAEs) to learn the interaction and the latent distribution of the parameters at each network layer. Our approach, Latent-Posterior BNN (LP-BNN), is compatible with the recent BatchEnsemble method, leading to highly efficient (in terms of computation and memory during both training and testing) ensembles.
arXiv Detail & Related papers (2020-12-04T19:50:09Z)
Computational optimization of convolutional neural networks using separated filters architecture [69.73393478582027]
We consider a convolutional neural network transformation that reduces computational complexity and thus speeds up neural network processing. Use of convolutional neural networks (CNNs) is the standard approach to image recognition despite the fact that they can be too computationally demanding.
arXiv Detail & Related papers (2020-02-18T17:42:13Z)
Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantization neural networks (QNNs) are very attractive to the industry because of their extremely cheap calculation and storage overhead, but their performance is still worse than that of networks with full-precision parameters. Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques. We address this problem by projecting features in original full-precision networks to high-dimensional quantization features.
arXiv Detail & Related papers (2020-02-03T04:11:13Z)

This list is automatically generated from the titles and abstracts of the papers on this site.