In-Memory Learning Automata Architecture using Y-Flash Cell
- URL: http://arxiv.org/abs/2408.09456v1
- Date: Sun, 18 Aug 2024 12:31:54 GMT
- Title: In-Memory Learning Automata Architecture using Y-Flash Cell
- Authors: Omar Ghazal, Tian Lan, Shalman Ojukwu, Komal Krishnamurthy, Alex Yakovlev, Rishad Shafik
- Abstract summary: In-memory computing, primarily through memristor-based analog computing, offers a promising solution to overcome this von Neumann bottleneck.
Here, we introduce a novel approach that utilizes floating-gate Y-Flash memristive devices manufactured with a standard 180 nm CMOS process.
This paper uses a new machine learning algorithm, the Tsetlin Machine (TM), for the in-memory processing architecture.
- Score: 13.901548326102784
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The modern implementation of machine learning architectures faces significant challenges due to frequent data transfer between memory and processing units. In-memory computing, primarily through memristor-based analog computing, offers a promising solution to overcome this von Neumann bottleneck. In this technology, data processing and storage take place inside the memory itself. Here, we introduce a novel approach that utilizes floating-gate Y-Flash memristive devices manufactured with a standard 180 nm CMOS process. These devices offer attractive features, including analog tunability and moderate device-to-device variation; such characteristics are essential for reliable decision-making in ML applications. This paper uses a new machine learning algorithm, the Tsetlin Machine (TM), for the in-memory processing architecture. The TM's learning element, the Automaton, is mapped onto a single Y-Flash cell, with the Automaton's state range translated into the Y-Flash cell's conductance range. Through comprehensive simulations, the proposed hardware implementation of the learning automata, particularly for Tsetlin Machines, has demonstrated enhanced scalability and on-edge learning capabilities.
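The abstract's central idea, encoding an Automaton's discrete state as a device conductance, can be illustrated with a short sketch. The snippet below is a minimal illustration rather than the authors' implementation: the state count, the conductance window, the linear state-to-conductance mapping, and the midpoint decision threshold are all assumed placeholders.

```python
# Minimal sketch (not the authors' implementation): encoding a Tsetlin Automaton's
# discrete state as a Y-Flash-like conductance. N_STATES, the conductance window,
# the linear mapping, and the midpoint decision threshold are assumed placeholders.

N_STATES = 100                 # assumed number of automaton states
G_MIN, G_MAX = 1e-6, 1e-4      # assumed programmable conductance window, in siemens


def state_to_conductance(state: int) -> float:
    """Linearly map an automaton state in [1, N_STATES] onto [G_MIN, G_MAX]."""
    if not 1 <= state <= N_STATES:
        raise ValueError("state outside the automaton's range")
    frac = (state - 1) / (N_STATES - 1)
    return G_MIN + frac * (G_MAX - G_MIN)


def action(conductance: float) -> str:
    """Read out the include/exclude decision at the midpoint of the conductance window."""
    return "include" if conductance >= (G_MIN + G_MAX) / 2 else "exclude"


if __name__ == "__main__":
    for s in (1, 50, 51, 100):
        g = state_to_conductance(s)
        print(f"state {s:3d} -> G = {g:.2e} S -> {action(g)}")
```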
Related papers
- IMPACT: InMemory ComPuting Architecture Based on Y-FlAsh Technology for Coalesced Tsetlin Machine Inference [3.2006458716397788]
We present IMPACT, an in-memory computing architecture based on Y-Flash technology for coalesced Tsetlin Machine inference.
Y-Flash devices have recently been demonstrated for digital and analog memory applications, offering high yield, non-volatility, and low power consumption.
arXiv Detail & Related papers (2024-12-04T12:22:52Z) - Memory Is All You Need: An Overview of Compute-in-Memory Architectures for Accelerating Large Language Model Inference [2.9302211589186244]
Large language models (LLMs) have transformed natural language processing, enabling machines to generate human-like text and engage in meaningful conversations.
Developments in computing and memory capabilities are lagging behind, a gap exacerbated by the end of Moore's law scaling.
Compute-in-memory (CIM) technologies offer a promising solution for accelerating AI inference by directly performing analog computations in memory.
arXiv Detail & Related papers (2024-06-12T16:57:58Z) - Efficient and accurate neural field reconstruction using resistive memory [52.68088466453264]
Traditional signal reconstruction methods on digital computers face both software and hardware challenges.
We propose a systematic approach with software-hardware co-optimizations for signal reconstruction from sparse inputs.
This work advances the AI-driven signal restoration technology and paves the way for future efficient and robust medical AI and 3D vision applications.
arXiv Detail & Related papers (2024-04-15T09:33:09Z) - Resistive Memory-based Neural Differential Equation Solver for Score-based Diffusion Model [55.116403765330084]
Current AIGC methods, such as score-based diffusion, still fall short in speed and efficiency.
We propose a time-continuous and analog in-memory neural differential equation solver for score-based diffusion.
We experimentally validate our solution with 180 nm resistive memory in-memory computing macros.
arXiv Detail & Related papers (2024-04-08T16:34:35Z) - Random resistive memory-based deep extreme point learning machine for unified visual processing [67.51600474104171]
We propose a novel hardware-software co-design, the random resistive memory-based deep extreme point learning machine (DEPLM).
Our co-design achieves substantial energy efficiency improvements and training cost reductions compared to conventional systems.
arXiv Detail & Related papers (2023-12-14T09:46:16Z) - In Situ Framework for Coupling Simulation and Machine Learning with
Application to CFD [51.04126395480625]
Recent years have seen many successful applications of machine learning (ML) to facilitate fluid dynamic computations.
As simulations grow, generating new training datasets for traditional offline learning creates I/O and storage bottlenecks.
This work offers a solution by simplifying this coupling and enabling in situ training and inference on heterogeneous clusters.
arXiv Detail & Related papers (2023-06-22T14:07:54Z) - FLEdge: Benchmarking Federated Machine Learning Applications in Edge Computing Systems [61.335229621081346]
Federated Learning (FL) has become a viable technique for realizing privacy-enhancing distributed deep learning on the network edge.
In this paper, we propose FLEdge, which complements existing FL benchmarks by enabling a systematic evaluation of client capabilities.
arXiv Detail & Related papers (2023-06-08T13:11:20Z) - IMBUE: In-Memory Boolean-to-CUrrent Inference ArchitecturE for Tsetlin Machines [5.6634493664726495]
In-memory computing for Machine Learning (ML) applications remedies the von Neumann bottleneck by organizing computation to exploit parallelism and locality.
Non-volatile memory devices such as Resistive RAM (ReRAM) offer integrated switching and storage capabilities showing promising performance for ML applications.
This paper proposes an In-Memory Boolean-to-Current Inference Architecture (IMBUE) that uses ReRAM-transistor cells to process Boolean inputs directly in the current domain, removing the need for conversions between the digital and analog domains.
arXiv Detail & Related papers (2023-05-22T10:55:01Z) - In-memory Implementation of On-chip Trainable and Scalable ANN for AI/ML Applications [0.0]
This paper presents an in-memory computing architecture for artificial neural networks (ANNs), enabling artificial intelligence (AI) and machine learning (ML) applications.
Our novel on-chip training and inference in-memory architecture reduces energy cost and enhances throughput by simultaneously accessing multiple rows of the array per precharge cycle.
The proposed architecture was trained and tested on the IRIS dataset and is $46\times$ more energy efficient per MAC (multiply-and-accumulate) operation than earlier classifiers.
arXiv Detail & Related papers (2020-05-19T15:36:39Z) - One-step regression and classification with crosspoint resistive memory arrays [62.997667081978825]
High-speed, low-energy computing machines are in demand to enable real-time artificial intelligence at the edge.
One-step learning is demonstrated by simulations of Boston house-price prediction and the training of a 2-layer neural network for MNIST digit recognition.
Results are all obtained in a single computational step, thanks to the physical, parallel, analog computing within the crosspoint array (a minimal sketch of this analog matrix-vector operation appears after this list).
arXiv Detail & Related papers (2020-05-05T08:00:07Z)
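Several entries above (the compute-in-memory overview, the crosspoint regression work, and the Y-Flash architectures) share one physical primitive: an analog matrix-vector product evaluated in a single step inside the memory array. The snippet below is a minimal numerical sketch of that primitive under assumed array dimensions, conductances, and voltages; it is not code from any of the listed papers.

```python
# Minimal numerical sketch of the analog matrix-vector product a crosspoint
# resistive array computes in one step: each column current is the sum of
# conductance-weighted row voltages (Ohm's law plus Kirchhoff's current law).
# The array size, conductances, and voltages are illustrative assumptions,
# not values taken from any of the papers listed above.
import numpy as np

rng = np.random.default_rng(0)

# Assumed 4x3 array: G[i, j] is the conductance at row i, column j (siemens).
G = rng.uniform(1e-6, 1e-4, size=(4, 3))

# Voltages applied to the rows encode the input vector (volts).
V = np.array([0.1, 0.2, 0.0, 0.3])

# Column readout currents, obtained in a single physical step: I_j = sum_i V_i * G[i, j].
I = V @ G

print("column currents (A):", I)
```

In hardware, the summation happens on the bit lines themselves, so the readout latency does not grow with the number of rows; this is the primitive the one-step crosspoint and in-memory inference papers above exploit.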