Brain-inspired Cognition in Next Generation Racetrack Memories
- URL: http://arxiv.org/abs/2111.02246v1
- Date: Wed, 3 Nov 2021 14:21:39 GMT
- Title: Brain-inspired Cognition in Next Generation Racetrack Memories
- Authors: Asif Ali Khan, Sebastien Ollivier, Stephen Longofono, Gerald Hempel,
Jeronimo Castrillon and Alex K. Jones
- Abstract summary: Hyperdimensional computing (HDC) is an emerging computational framework inspired by the brain that operates on vectors with thousands of dimensions to emulate cognition.
This paper presents an architecture based on racetrack memory (RTM) to conduct and accelerate the entire HDC framework within the memory.
The proposed solution requires minimal additional CMOS circuitry and uses a read operation across multiple domains in RTMs called transverse read (TR) to realize exclusive-or (XOR) and addition operations.
- Score: 0.6850683267295249
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Hyperdimensional computing (HDC) is an emerging computational framework
inspired by the brain that operates on vectors with thousands of dimensions to
emulate cognition. Unlike conventional computational frameworks that operate on
numbers, HDC, like the brain, uses high-dimensional random vectors and is
capable of one-shot learning. HDC is based on a well-defined set of arithmetic
operations and is highly error-resilient. The core operations of HDC manipulate
HD vectors in a bulk bit-wise fashion, offering many opportunities to leverage
parallelism. Unfortunately, on conventional von Neumann architectures, the
continuous movement of HD vectors between the processor and the memory can make
the cognition task prohibitively slow and energy-intensive. Hardware
accelerators only marginally improve related metrics. In contrast, even a
partial implementation of an HDC framework inside memory, using emerging
memristive devices, has reported considerable performance/energy gains. This
paper presents an architecture based on racetrack memory (RTM) to conduct and
accelerate the entire HDC framework within the memory. The proposed solution
requires minimal additional CMOS circuitry and uses a read operation across
multiple domains in RTMs, called transverse read (TR), to realize exclusive-or
(XOR) and addition operations. To minimize the overhead of this CMOS circuitry,
we propose an RTM nanowire-based counting mechanism that leverages the TR
operation and the standard RTM operations. Using language recognition as the
use case, the proposed system demonstrates 7.8x and 5.3x reductions in overall
runtime and energy consumption, respectively, compared to an FPGA design.
Compared to the state-of-the-art in-memory implementation, the proposed HDC
system reduces energy consumption by 8.6x.
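As a rough illustration of these core operations, the following Python sketch (our own toy model, not the paper's implementation) emulates the transverse read as a column-wise popcount across stacked nanowire domains, and derives XOR binding and majority-based bundling from that single counting primitive:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10_000  # HD vectors have thousands of dimensions

def transverse_read(domains):
    """Emulate a transverse read (TR): for each bit position, count how
    many stacked domains hold a '1'. A real TR senses this count as a
    resistance level; here it is a plain column-wise popcount."""
    return np.sum(domains, axis=0)

def bind(a, b):
    """XOR binding: the TR count over two aligned bits is 1 exactly when
    the bits differ, so count mod 2 is their XOR (parity)."""
    return transverse_read(np.stack([a, b])) % 2

def bundle(vectors):
    """Addition/bundling: the TR count per position is a per-column sum;
    thresholding at the majority yields the bundled binary vector."""
    counts = transverse_read(np.stack(vectors))
    return (counts > len(vectors) / 2).astype(np.uint8)

def hamming_similarity(a, b):
    return 1.0 - np.mean(a != b)

# Toy usage: bundle three random item vectors into a class prototype.
items = [rng.integers(0, 2, D, dtype=np.uint8) for _ in range(3)]
proto = bundle(items)
print(hamming_similarity(items[0], proto))  # ~0.75, well above chance
print(hamming_similarity(bind(items[0], items[1]), proto))  # ~0.5, unrelated
```

The appeal of the TR primitive is that such counts are sensed directly inside the memory, so neither operand ever has to travel to the processor.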
Related papers
- Accelerating Error Correction Code Transformers [56.75773430667148]
We introduce a novel acceleration method for transformer-based decoders.
We achieve a 90% compression ratio and reduce arithmetic operation energy consumption by at least 224 times on modern hardware.
arXiv Detail & Related papers (2024-10-08T11:07:55Z)
- CHIME: Energy-Efficient STT-RAM-based Concurrent Hierarchical In-Memory Processing [1.5566524830295307]
This paper introduces a novel PiC/PiM architecture, Concurrent Hierarchical In-Memory Processing (CHIME).
CHIME strategically incorporates heterogeneous compute units across multiple levels of the memory hierarchy.
Experiments reveal that, compared to the state-of-the-art bit-line computing approaches, CHIME achieves significant speedup and energy savings of 57.95% and 78.23%, respectively.
arXiv Detail & Related papers (2024-07-29T01:17:54Z) - Efficient and accurate neural field reconstruction using resistive memory [52.68088466453264]
Traditional signal reconstruction methods on digital computers face both software and hardware challenges.
We propose a systematic approach with software-hardware co-optimizations for signal reconstruction from sparse inputs.
This work advances the AI-driven signal restoration technology and paves the way for future efficient and robust medical AI and 3D vision applications.
arXiv Detail & Related papers (2024-04-15T09:33:09Z) - Resistive Memory-based Neural Differential Equation Solver for Score-based Diffusion Model [55.116403765330084]
Current AIGC methods, such as score-based diffusion, are still deficient in terms of speed and efficiency.
We propose a time-continuous and analog in-memory neural differential equation solver for score-based diffusion.
We experimentally validate our solution with 180 nm resistive memory in-memory computing macros.
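As a loose digital analogy of what such a solver computes, the toy Python sketch below integrates a variance-exploding probability-flow ODE with an exact closed-form score standing in for the learned network; the integration scheme and score function are our own assumptions, not this paper's design:

```python
import numpy as np

def score(x, t):
    """Hypothetical stand-in for the learned score network: the exact
    score of a standard-normal target under a variance-exploding path,
    where the marginal at time t is N(0, 1 + t**2)."""
    return -x / (1.0 + t**2)

def solve_reverse_ode(x_T, t_grid):
    """Euler integration of the probability-flow ODE
    dx/dt = -t * score(x, t), run backwards from noise x_T to a sample.
    An analog in-memory solver realizes this integration continuously."""
    x = x_T
    for t_hi, t_lo in zip(t_grid[:-1], t_grid[1:]):
        dt = t_hi - t_lo
        x = x + dt * t_hi * score(x, t_hi)
    return x

t_grid = np.linspace(10.0, 1e-3, 200)
x_T = np.random.default_rng(1).normal(0.0, 10.0, size=1000)
samples = solve_reverse_ode(x_T, t_grid)
print(samples.std())  # should approach the target's unit std
```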
arXiv Detail & Related papers (2024-04-08T16:34:35Z)
- INR-Arch: A Dataflow Architecture and Compiler for Arbitrary-Order Gradient Computations in Implicit Neural Representation Processing [66.00729477511219]
Given a function represented as a computation graph, traditional architectures face challenges in efficiently computing its nth-order gradient.
We introduce INR-Arch, a framework that transforms the computation graph of an nth-order gradient into a hardware-optimized dataflow architecture.
We present results that demonstrate 1.8-4.8x and 1.5-3.6x speedup compared to CPU and GPU baselines respectively.
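A hedged sketch of why arbitrary-order gradients are costly on traditional architectures: naively nesting numerical differentiation (our own illustration, unrelated to INR-Arch's actual method) multiplies the work with each extra order:

```python
def nth_derivative(f, n, h=1e-3):
    """Naively lift a scalar function to its nth derivative by nesting
    central differences. Each extra order doubles the number of calls
    to f (2**n evaluations per point), the kind of blow-up a dedicated
    dataflow architecture is designed to avoid."""
    if n == 0:
        return f
    df = nth_derivative(f, n - 1, h)
    return lambda x: (df(x + h) - df(x - h)) / (2 * h)

f = lambda x: x ** 4
print(nth_derivative(f, 2)(1.0))  # ~12.0, the true 2nd derivative of x^4 at 1
```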
arXiv Detail & Related papers (2023-08-11T04:24:39Z)
- HDCC: A Hyperdimensional Computing compiler for classification on embedded systems and high-performance computing [58.720142291102135]
This work introduces HDCC, the first open-source compiler that translates high-level descriptions of HDC classification methods into optimized C code.
HDCC is designed like a modern compiler, featuring an intuitive and descriptive input language, an intermediate representation (IR), and a retargetable backend.
To substantiate these claims, we conducted experiments with HDCC on several of the most popular datasets in the HDC literature.
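For intuition only, the sketch below mimics such a compile flow (description to IR to C) in a few lines of Python; the input format, IR, and emitted code are invented for illustration and are not HDCC's actual language or backend:

```python
# Hypothetical mini-pipeline in the spirit of an HDC-to-C compiler.
desc = {"dimensions": 10_000, "encode": ["bind", "bundle"], "classes": 4}

def lower_to_ir(desc):
    """'Front end': lower the high-level description to a flat op list."""
    return [("alloc", desc["dimensions"])] + [(op,) for op in desc["encode"]]

def emit_c(ir):
    """'Backend': emit C that runs HDC ops as bulk bit-wise word operations."""
    words = ir[0][1] // 64
    lines = [f"uint64_t hv[{words}], acc[{words}];"]
    for (op, *_) in ir[1:]:
        if op == "bind":
            lines.append(f"for (int i = 0; i < {words}; i++) acc[i] ^= hv[i];")
        elif op == "bundle":
            lines.append(f"/* per-bit counters + majority threshold over {words} words */")
    return "\n".join(lines)

print(emit_c(lower_to_ir(desc)))
```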
arXiv Detail & Related papers (2023-04-24T19:16:03Z)
- Pex: Memory-efficient Microcontroller Deep Learning through Partial Execution [11.336229510791481]
We discuss a novel execution paradigm for microcontroller deep learning.
It modifies the execution of neural networks to avoid materialising full buffers in memory.
This is achieved by exploiting the properties of operators, which can consume/produce a fraction of their input/output at a time.
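A minimal Python sketch of this idea (our own illustration, not Pex's implementation): operators are written as generators that consume and produce slices, so the pipeline never holds a full intermediate buffer:

```python
import numpy as np

def relu_stream(chunks):
    """Element-wise operators can process any fraction of their input,
    so they simply pass chunks through without buffering."""
    for c in chunks:
        yield np.maximum(c, 0.0)

def avgpool2_stream(chunks):
    """A stride-2 pooling operator only ever needs two elements at a
    time, so the full input buffer is never materialized in memory."""
    carry = np.empty(0)
    for c in chunks:
        buf = np.concatenate([carry, c])
        n = len(buf) // 2 * 2
        if n:
            yield buf[:n].reshape(-1, 2).mean(axis=1)
        carry = buf[n:]

x = np.arange(10.0) - 5.0
chunks = np.array_split(x, 4)  # stream the input in small slices
out = np.concatenate(list(avgpool2_stream(relu_stream(chunks))))
print(out)  # matches pooling over the fully materialized ReLU output
```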
arXiv Detail & Related papers (2022-11-30T18:47:30Z)
- A Co-design view of Compute in-Memory with Non-Volatile Elements for Neural Networks [12.042322495445196]
We discuss how compute-in-memory can play an important part in the next generation of computing hardware.
A non-volatile memory based cross-bar architecture forms the heart of an engine that uses an analog process to parallelize the matrix vector multiplication operation.
The cross-bar architecture, at times referred to as a neuromorphic approach, can be a key hardware element in future computing machines.
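The standard analog MVM model behind such engines can be sketched in a few lines of numpy (a textbook illustration, not this paper's specific design): weights become conductances, inputs become voltages, and Kirchhoff's current law performs the summation:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))  # weight matrix to offload to the cross-bar
v = rng.normal(size=8)       # input activations applied as voltages

# Conductances must be non-negative, so each weight is mapped onto a
# differential pair of devices: W = G_pos - G_neg (a common scheme).
G_pos, G_neg = np.maximum(W, 0), np.maximum(-W, 0)

# Ohm's law gives per-device currents I = G * V, and Kirchhoff's current
# law sums them along each output wire, so the whole matrix-vector
# product is read out in a single analog step.
i_out = G_pos @ v - G_neg @ v
print(np.allclose(i_out, W @ v))  # True: the output currents equal W @ v
```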
arXiv Detail & Related papers (2022-06-03T15:59:46Z)
- Understanding Hyperdimensional Computing for Parallel Single-Pass Learning [47.82940409267635]
We propose a new class of VSAs, finite group VSAs, which surpass the limits of HDC.
Experimental results show that our RFF method and group VSA can both outperform the state-of-the-art HDC model by up to 7.6% while maintaining hardware efficiency.
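One simple instance of binding in a finite group VSA (our own illustrative choice of the cyclic group Z_q per element; the paper's constructions may differ) generalizes XOR binding while keeping exact unbinding:

```python
import numpy as np

rng = np.random.default_rng(0)
D, q = 10_000, 16   # element-wise group Z_q; q = 2 recovers XOR binding

a = rng.integers(0, q, D)
b = rng.integers(0, q, D)

bound = (a + b) % q          # binding: element-wise group operation
recovered = (bound - a) % q  # unbinding: apply the group inverse of a
print(np.array_equal(recovered, b))  # True: binding is exactly invertible
```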
arXiv Detail & Related papers (2022-02-10T02:38:56Z)
- A 5 µW Standard Cell Memory-based Configurable Hyperdimensional Computing Accelerator for Always-on Smart Sensing [16.589169601764297]
Hyperdimensional computing (HDC) is a brain-inspired computing paradigm based on high-dimensional holistic representations of vectors.
We propose a programmable all-digital CMOS implementation of a fully autonomous HDC accelerator for always-on classification in energy-constrained sensor nodes.
arXiv Detail & Related papers (2021-02-04T17:41:29Z)
- Robust High-dimensional Memory-augmented Neural Networks [13.82206983716435]
Memory-augmented neural networks enhance conventional neural networks with an explicit memory to overcome their limitations.
Access to this explicit memory occurs via soft read and write operations involving every individual memory entry.
We propose a robust architecture that employs a computational memory unit as the explicit memory performing analog in-memory computation on high-dimensional (HD) vectors.
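A minimal numpy sketch of such a soft read (our own illustration, not the proposed architecture): attention weights are computed against every memory entry, which is exactly the all-entries access pattern that in-memory HD computation accelerates:

```python
import numpy as np

rng = np.random.default_rng(0)
memory = rng.normal(size=(128, 64))           # explicit memory: 128 entries
key = memory[7] + 0.1 * rng.normal(size=64)   # noisy query key

def soft_read(memory, key):
    """Soft read: every memory entry contributes, weighted by a softmax
    over its similarity to the key. Computing this touches all entries,
    which analog in-memory computation can do in a single step."""
    sims = memory @ key / np.linalg.norm(memory, axis=1) / np.linalg.norm(key)
    weights = np.exp(sims / 0.05)
    weights /= weights.sum()
    return weights @ memory

print(np.argmax(memory @ soft_read(memory, key)))  # 7: entry 7 dominates
```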
arXiv Detail & Related papers (2020-10-05T12:01:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.