Running Conventional Automatic Speech Recognition on Memristor Hardware: A Simulated Approach
- URL: http://arxiv.org/abs/2505.24721v1
- Date: Fri, 30 May 2025 15:42:41 GMT
- Title: Running Conventional Automatic Speech Recognition on Memristor Hardware: A Simulated Approach
- Authors: Nick Rossenbach, Benedikt Hilmes, Leon Brackmann, Moritz Gunz, Ralf Schlüter
- Abstract summary: We show how an ML system with millions of parameters would behave on memristor hardware. We limit the relative degradation in word error rate to 25% when using a 3-bit weight precision to execute linear operations.
- Score: 18.47703842449581
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Memristor-based hardware offers new possibilities for energy-efficient machine learning (ML) by providing analog in-memory matrix multiplication. Current hardware prototypes cannot fit large neural networks, and related literature covers only small ML models for tasks like MNIST or single word recognition. Simulation can be used to explore how hardware properties affect larger models, but existing software assumes simplified hardware. We propose a PyTorch-based library based on "Synaptogen" to simulate neural network execution with accurately captured memristor hardware properties. For the first time, we show how an ML system with millions of parameters would behave on memristor hardware, using a Conformer trained on the speech recognition task TED-LIUMv2 as an example. With adjusted quantization-aware training, we limit the relative degradation in word error rate to 25% when using a 3-bit weight precision to execute linear operations via simulated analog computation.
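The core of the paper's setup (low-bit weight quantization combined with noisy analog matrix operations) can be sketched in a few lines. This is a minimal, dependency-free illustration; the function names, the uniform 3-bit quantizer, and the Gaussian read-noise model are assumptions for exposition, not the authors' Synaptogen-based simulation library.

```python
import random

def quantize_3bit(weights, w_max):
    """Map each weight to the nearest of 8 uniformly spaced levels
    in [-w_max, w_max], i.e. 3-bit precision."""
    step = 2.0 * w_max / (2 ** 3 - 1)  # 7 intervals between 8 levels
    return [max(-w_max, min(w_max, round(w / step) * step)) for w in weights]

def noisy_linear(x, weights, read_noise_std=0.02):
    """Dot product with multiplicative Gaussian read noise on each weight,
    a crude stand-in for memristor conductance variability."""
    return sum(xi * wi * (1.0 + random.gauss(0.0, read_noise_std))
               for xi, wi in zip(x, weights))
```

During quantization-aware training, such a quantizer would be applied in the forward pass (with a straight-through gradient estimate) so the network learns to tolerate the coarse 8-level weight grid before the noisy analog execution.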
Related papers
- Analog Foundation Models [6.589590906512612]
Analog in-memory computing (AIMC) is a promising compute paradigm for improving the speed and power efficiency of neural network computations. AIMC introduces fundamental challenges such as noisy computations and strict constraints on input and output quantization. We introduce a general, scalable method to robustly adapt and execute models on low-precision analog hardware.
arXiv Detail & Related papers (2025-05-14T14:52:22Z)
- An Efficient Quantum Classifier Based on Hamiltonian Representations [50.467930253994155]
Quantum machine learning (QML) is a discipline that seeks to transfer the advantages of quantum computing to data-driven tasks. We propose an efficient approach that circumvents the costs associated with data encoding by mapping inputs to a finite set of Pauli strings. We evaluate our approach on text and image classification tasks against well-established classical and quantum models.
arXiv Detail & Related papers (2025-04-13T11:49:53Z)
- A Hybrid Transformer Architecture with a Quantized Self-Attention Mechanism Applied to Molecular Generation [0.0]
We propose a hybrid quantum-classical self-attention mechanism as part of a transformer decoder. We show that the time complexity of the query-key dot product is reduced from $\mathcal{O}(n^2 d)$ in a classical model to $\mathcal{O}(n^2 \log d)$ in our quantum model. This work provides a promising avenue for quantum-enhanced natural language processing (NLP).
arXiv Detail & Related papers (2025-02-26T15:15:01Z)
- Machine Learning for Arbitrary Single-Qubit Rotations on an Embedded Device [1.3753825907341728]
We present a technique for using machine learning (ML) for single-qubit gate synthesis on field programmable logic.
We first bootstrap a model based on simulation with access to the full statevector for measuring gate fidelity.
We next present an algorithm, named adapted randomized benchmarking (ARB), for fine-tuning the gate on hardware based on measurements.
arXiv Detail & Related papers (2024-11-20T04:59:38Z)
- DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution [114.61347672265076]
Development of MLLMs for real-world robots is challenging due to the typically limited computation and memory capacities available on robotic platforms.
We propose a Dynamic Early-Exit Framework for Robotic Vision-Language-Action Model (DeeR) that automatically adjusts the size of the activated MLLM.
DeeR demonstrates significant reductions in the computational cost of the LLM (by 5.2-6.5x) and in its GPU memory usage (by 2-6x) without compromising performance.
arXiv Detail & Related papers (2024-11-04T18:26:08Z)
- Validating Large-Scale Quantum Machine Learning: Efficient Simulation of Quantum Support Vector Machines Using Tensor Networks [17.80970950814512]
We present an efficient tensor-network-based approach for simulating large-scale quantum circuits. Our simulator successfully handles QSVMs with up to 784 qubits, completing simulations within seconds on a single high-performance GPU.
arXiv Detail & Related papers (2024-05-04T10:37:01Z)
- Quantization of Deep Neural Networks to facilitate self-correction of weights on Phase Change Memory-based analog hardware [0.0]
We develop an algorithm to approximate a set of multiplicative weights.
These weights aim to represent the original network's weights with minimal loss in performance.
Our results demonstrate that, when paired with an on-chip pulse generator, our self-correcting neural network performs comparably to those trained with analog-aware algorithms.
arXiv Detail & Related papers (2023-09-30T10:47:25Z)
- Token Turing Machines [53.22971546637947]
The Token Turing Machine (TTM) is a sequential, autoregressive Transformer model with memory for real-world sequential visual understanding.
Our model is inspired by the seminal Neural Turing Machine, and has an external memory consisting of a set of tokens which summarise the previous history.
arXiv Detail & Related papers (2022-11-16T18:59:18Z)
- Incremental Online Learning Algorithms Comparison for Gesture and Visual Smart Sensors [68.8204255655161]
This paper compares four state-of-the-art algorithms in two real applications: gesture recognition based on accelerometer data and image classification.
Our results confirm these systems' reliability and the feasibility of deploying them in tiny-memory MCUs.
arXiv Detail & Related papers (2022-09-01T17:05:20Z)
- Statistically Meaningful Approximation: a Case Study on Approximating Turing Machines with Transformers [50.85524803885483]
This work proposes a formal definition of statistically meaningful (SM) approximation which requires the approximating network to exhibit good statistical learnability.
We study SM approximation for two function classes: circuits and Turing machines.
arXiv Detail & Related papers (2021-07-28T04:28:55Z)
- One-step regression and classification with crosspoint resistive memory arrays [62.997667081978825]
High speed, low energy computing machines are in demand to enable real-time artificial intelligence at the edge.
One-step learning is supported by simulations of the prediction of the cost of a house in Boston and the training of a 2-layer neural network for MNIST digit recognition.
Results are all obtained in one computational step, thanks to the physical, parallel, and analog computing within the crosspoint array.
arXiv Detail & Related papers (2020-05-05T08:00:07Z)
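The "one computational step" in the crosspoint-array entry above refers to evaluating the closed-form least-squares solution physically, rather than by iterative training. A minimal digital sketch of that closed form for two features (illustrative only; this is the normal-equation solution, not the paper's analog circuit):

```python
def one_step_least_squares(X, y):
    """Closed-form linear regression w = (X^T X)^{-1} X^T y for two
    features, i.e. the solution a crosspoint resistive array evaluates
    physically in a single step."""
    # Entries of the 2x2 normal matrix X^T X, computed explicitly.
    a = sum(r[0] * r[0] for r in X)
    b = sum(r[0] * r[1] for r in X)
    d = sum(r[1] * r[1] for r in X)
    # Entries of the vector X^T y.
    u = sum(r[0] * t for r, t in zip(X, y))
    v = sum(r[1] * t for r, t in zip(X, y))
    # Invert the symmetric 2x2 matrix and apply it to (u, v).
    det = a * d - b * b
    return [(d * u - b * v) / det, (a * v - b * u) / det]
```

The array performs the matrix-vector products and the implicit inversion in the analog domain, which is where the claimed speed and energy advantages come from.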
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.