Dense Associative Memories with Analog Circuits
- URL: http://arxiv.org/abs/2512.15002v1
- Date: Wed, 17 Dec 2025 01:22:44 GMT
- Title: Dense Associative Memories with Analog Circuits
- Authors: Marc Gong Bacvanski, Xincheng You, John Hopfield, Dmitry Krotov
- Abstract summary: We propose a general method for building analog accelerators for DenseAMs. We find that analog DenseAM hardware performs inference in constant time, independent of model size. We estimate lower bounds on the achievable time constants imposed by amplifier specifications, suggesting that even conservative existing analog technology can enable inference times on the order of tens to hundreds of nanoseconds.
- Score: 4.0086293309536405
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The increasing computational demands of modern AI systems have exposed fundamental limitations of digital hardware, driving interest in alternative paradigms for efficient large-scale inference. Dense Associative Memory (DenseAM) is a family of models that offers a flexible framework for representing many contemporary neural architectures, such as transformers and diffusion models, by casting them as dynamical systems evolving on an energy landscape. In this work, we propose a general method for building analog accelerators for DenseAMs and implementing them using electronic RC circuits, crossbar arrays, and amplifiers. We find that our analog DenseAM hardware performs inference in constant time, independent of model size. This result highlights an asymptotic advantage of analog DenseAMs over digital numerical solvers that scale at least linearly with the model size. We consider three settings of progressively increasing complexity: XOR, the Hamming (7,4) code, and a simple language model defined on binary variables. We propose analog implementations of these three models and analyze the scaling of inference time, energy consumption, and hardware requirements. Finally, we estimate lower bounds on the achievable time constants imposed by amplifier specifications, suggesting that even conservative existing analog technology can enable inference times on the order of tens to hundreds of nanoseconds. By harnessing the intrinsic parallelism and continuous-time operation of analog circuits, our DenseAM-based accelerator design offers a new avenue for fast and scalable AI hardware.
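To make the energy-descent picture concrete, here is a minimal numerical sketch (an illustration, not the paper's circuit design) of softmax-based DenseAM retrieval dynamics, tau * dx/dt = -x + Xi^T softmax(beta * Xi x), integrated with explicit Euler steps. In the proposed hardware this relaxation would happen physically in continuous time, with tau set by the RC constant, which is where the constant-time inference claim comes from.

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
K, N = 8, 64                                  # K stored patterns of dimension N
Xi = rng.choice([-1.0, 1.0], size=(K, N))     # memory matrix (rows = patterns)

beta = 4.0      # inverse temperature: sharpness of the energy landscape
tau = 10e-9     # time constant, loosely the RC constant of the circuit (10 ns)
dt = 0.1 * tau  # Euler integration step

# Start from a corrupted version of pattern 0 and let the dynamics clean it up.
x = Xi[0] + 0.8 * rng.standard_normal(N)

for _ in range(200):  # ~200 ns of simulated relaxation
    # tau * dx/dt = -x + Xi.T @ softmax(beta * Xi @ x)
    x = x + (dt / tau) * (-x + Xi.T @ softmax(beta * Xi @ x))

print(np.round(Xi @ x / N, 2))  # overlap with each memory; index 0 should be ~1
```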
Related papers
- Hardware-Aware Model Design and Training of Silicon-based Analog Neural Networks [33.83993649730681]
We show that by retraining the neural network using a physics-informed, hardware-aware model, one can fully recover the inference accuracy of the ideal network model.
This is more promising for scalability and integration density than the default option of improving the fidelity of the analog neural network.
arXiv Detail & Related papers (2025-12-08T10:11:13Z)
- A2R: An Asymmetric Two-Stage Reasoning Framework for Parallel Reasoning [57.727084580884075]
A2R is an Asymmetric Two-Stage Reasoning framework designed to bridge the gap between a model's potential and its actual performance.
A2R-Efficient is a "small-to-big" variant that combines a Qwen3-4B explorer with a Qwen3-8B synthesizer.
Results show A2R is not only a performance-boosting framework but also an efficient and practical solution for real-world applications.
arXiv Detail & Related papers (2025-09-26T08:27:03Z)
- Sequential-Parallel Duality in Prefix Scannable Models [68.39855814099997]
Recent developments have given rise to various models, such as Gated Linear Attention (GLA) and Mamba.
This raises a natural question: can we characterize the full class of neural sequence models that support near-constant-time parallel evaluation and linear-time, constant-space sequential inference?
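As a concrete illustration of that duality (the textbook scalar case, not the paper's formal characterization), the gated linear recurrence h[t] = a[t]*h[t-1] + b[t] can be evaluated either sequentially in O(T) time and O(1) space, or as a parallel scan with O(log T) depth, because the per-step affine transforms compose associatively:

```python
import numpy as np

def sequential_scan(a, b):
    """O(T) time, O(1) state: h[t] = a[t] * h[t-1] + b[t]."""
    h, prev = np.empty_like(b), 0.0
    for t in range(len(b)):
        prev = a[t] * prev + b[t]
        h[t] = prev
    return h

def combine(left, right):
    """Associative composition of two affine steps (a, b)."""
    a1, b1 = left
    a2, b2 = right
    return (a1 * a2, a2 * b1 + b2)

def parallel_scan(a, b):
    """Hillis-Steele inclusive scan: O(log T) depth with enough processors."""
    elems = list(zip(a, b))
    step, n = 1, len(elems)
    while step < n:
        elems = [elems[i] if i < step else combine(elems[i - step], elems[i])
                 for i in range(n)]
        step *= 2
    return np.array([h for _, h in elems])

rng = np.random.default_rng(1)
a, b = rng.uniform(0.5, 0.99, 16), rng.standard_normal(16)
assert np.allclose(sequential_scan(a, b), parallel_scan(a, b))
```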
arXiv Detail & Related papers (2025-06-12T17:32:02Z)
- Learning in Log-Domain: Subthreshold Analog AI Accelerator Based on Stochastic Gradient Descent [5.429033337081392]
We propose a novel analog accelerator architecture for AI/ML training workloads using stochastic gradient descent with L2 regularization (SGDr).
The proposed design achieves significant reductions in transistor area and power consumption compared to digital implementations.
This work paves the way for energy-efficient analog AI hardware with on-chip training capabilities.
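For reference, a minimal software sketch of the idealized SGDr update rule that such an accelerator targets (the subthreshold log-domain circuit realization is not modeled here):

```python
import numpy as np

def sgdr_step(w, x, y, lr=0.05, lam=1e-3):
    """One SGDr step on a linear model.

    Loss: 0.5 * (w @ x - y)**2 + 0.5 * lam * ||w||^2
    """
    grad = (w @ x - y) * x + lam * w   # data gradient + L2 term
    return w - lr * grad

rng = np.random.default_rng(2)
w_true = np.array([1.5, -2.0, 0.5])
w = np.zeros(3)
for _ in range(500):                   # stream of random training samples
    x = rng.standard_normal(3)
    w = sgdr_step(w, x, w_true @ x)
print(np.round(w, 2))                  # near w_true, shrunk slightly by the L2 term
```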
arXiv Detail & Related papers (2025-01-22T19:26:36Z)
- ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation [83.62931466231898]
This paper presents ARLON, a framework that boosts diffusion Transformers with autoregressive models for long video generation.
A latent Vector Quantized Variational Autoencoder (VQ-VAE) compresses the input latent space of the DiT model into compact visual tokens.
An adaptive norm-based semantic injection module integrates the coarse discrete visual units from the AR model into the DiT model.
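As a rough illustration of the quantization step (a generic VQ lookup, not ARLON's trained model), each continuous latent vector is snapped to its nearest codebook entry, producing the compact discrete tokens the AR model consumes:

```python
import numpy as np

def vector_quantize(latents, codebook):
    """Nearest-codebook assignment: (T, D) latents -> (T,) token ids."""
    # Squared Euclidean distance from every latent to every code vector.
    d = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    ids = d.argmin(axis=1)
    return ids, codebook[ids]

rng = np.random.default_rng(3)
codebook = rng.standard_normal((16, 4))   # toy sizes: K=16 codes, D=4 dims
latents = rng.standard_normal((10, 4))
ids, quantized = vector_quantize(latents, codebook)
print(ids)  # discrete visual tokens for the autoregressive stage
```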
arXiv Detail & Related papers (2024-10-27T16:28:28Z)
- Towards training digitally-tied analog blocks via hybrid gradient computation [1.800676987432211]
We introduce Feedforward-tied Energy-based Models (ff-EBMs).
We derive a novel algorithm to compute gradients end-to-end in ff-EBMs by backpropagating and "eq-propagating" through feedforward and energy-based parts respectively.
Our approach offers a principled, scalable, and incremental roadmap to gradually integrate self-trainable analog computational primitives into existing digital accelerators.
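For intuition, here is generic equilibrium propagation on a toy quadratic energy (a standard textbook construction, not the paper's ff-EBM algorithm, which additionally backpropagates through the feedforward blocks): the weight gradient is estimated from two relaxed phases, one free and one weakly nudged toward the target.

```python
def free_state(w, x):
    """Minimizer of E(s) = 0.5*s**2 - w*x*s (free phase)."""
    return w * x

def nudged_state(w, x, y, beta):
    """Minimizer of E(s) + beta * 0.5*(s - y)**2 (weakly clamped phase)."""
    return (w * x + beta * y) / (1.0 + beta)

def eqprop_grad(w, x, y, beta=0.01):
    """Equilibrium-propagation estimate of dC/dw, with C = 0.5*(s - y)**2."""
    dE_dw = lambda s: -x * s                   # dE/dw evaluated at a fixed state s
    return (dE_dw(nudged_state(w, x, y, beta)) - dE_dw(free_state(w, x))) / beta

w, x, y = 0.3, 1.2, 1.0
print(eqprop_grad(w, x, y))   # ~ -0.760, the two-phase estimate
print((w * x - y) * x)        # -0.768, the exact analytic gradient
```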
arXiv Detail & Related papers (2024-09-05T07:22:19Z)
- Comparative Study of State-based Neural Networks for Virtual Analog Audio Effects Modeling [0.0]
We explore the application of recent machine learning advancements for virtual analog modeling.
We compare State-Space models and Linear Recurrent Units against the more common LSTM networks.
Our metrics aim to assess the models' ability to accurately replicate the signal's energy and frequency contents.
arXiv Detail & Related papers (2024-05-07T08:47:40Z)
- TCCT-Net: Two-Stream Network Architecture for Fast and Efficient Engagement Estimation via Behavioral Feature Signals [58.865901821451295]
We present a novel two-stream feature fusion "Tensor-Convolution and Convolution-Transformer Network" (TCCT-Net) architecture.
To better learn the meaningful patterns in the temporal-spatial domain, we design a "CT" stream that integrates a hybrid convolutional-transformer.
In parallel, to efficiently extract rich patterns from the temporal-frequency domain, we introduce a "TC" stream that uses Continuous Wavelet Transform (CWT) to represent information in a 2D tensor form.
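A minimal sketch of that transform, assuming the PyWavelets package is available (the paper's exact preprocessing may differ):

```python
import numpy as np
import pywt  # PyWavelets, assumed here for the CWT

fs = 128
t = np.arange(0, 4, 1 / fs)
# Toy "behavioral" signal: two tones plus noise.
sig = np.sin(2 * np.pi * 3 * t) + 0.5 * np.sin(2 * np.pi * 12 * t)
sig += 0.1 * np.random.default_rng(4).standard_normal(t.size)

# The CWT maps the 1D signal to a 2D (scale x time) tensor, which a
# convolutional stream can then process like an image.
scales = np.arange(1, 64)
coeffs, freqs = pywt.cwt(sig, scales, "morl", sampling_period=1 / fs)
print(coeffs.shape)  # (63, 512): one row per scale, one column per sample
```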
arXiv Detail & Related papers (2024-04-15T06:01:48Z)
- Synaptogen: A cross-domain generative device model for large-scale neuromorphic circuit design [1.704443882665726]
We present a fast generative modeling approach for resistive memories that reproduces the complex statistical properties of real-world devices.
By training on extensive measurement data of integrated 1T1R arrays, an autoregressive process accurately accounts for the cross-correlations between the parameters.
Benchmarks show that the read/write throughput of this statistically comprehensive model exceeds that of even highly simplified and deterministic compact models.
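A toy vector-autoregressive sketch of the idea (illustrative coefficients, not fitted to any measured 1T1R data): successive parameter samples are coupled through a mixing matrix, which produces the cross-correlations that deterministic compact models miss.

```python
import numpy as np

rng = np.random.default_rng(5)
A = np.array([[0.8, 0.1],          # AR(1) mixing matrix coupling two
              [0.2, 0.7]])         # hypothetical device parameters
noise = np.array([0.1, 0.2])       # per-parameter cycle-to-cycle noise

x = np.zeros(2)
samples = np.empty((10_000, 2))
for i in range(len(samples)):
    x = A @ x + noise * rng.standard_normal(2)   # autoregressive update
    samples[i] = x

print(np.round(np.corrcoef(samples.T), 2))  # nonzero off-diagonal correlation
```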
arXiv Detail & Related papers (2024-04-09T14:33:03Z)
- AnalogNAS: A Neural Network Design Framework for Accurate Inference with Analog In-Memory Computing [7.596833322764203]
Inference at the edge requires low-latency, compact, and power-efficient models.
Analog/mixed-signal in-memory computing hardware accelerators can easily transcend the memory wall of von Neumann architectures.
We propose AnalogNAS, a framework for automated Deep Neural Network (DNN) design targeting deployment on analog In-Memory Computing (IMC) inference accelerators.
arXiv Detail & Related papers (2023-05-17T07:39:14Z)
- TMS: A Temporal Multi-scale Backbone Design for Speaker Embedding [60.292702363839716]
Current SOTA backbone networks for speaker embedding are designed to aggregate multi-scale features from an utterance with multi-branch network architectures for speaker representation.
We propose an effective temporal multi-scale (TMS) model where multi-scale branches could be efficiently designed in a speaker embedding network almost without increasing computational costs.
arXiv Detail & Related papers (2022-03-17T05:49:35Z)
- Automated and Formal Synthesis of Neural Barrier Certificates for Dynamical Models [70.70479436076238]
We introduce an automated, formal, counterexample-based approach to synthesise Barrier Certificates (BC).
The approach is underpinned by an inductive framework, which manipulates a candidate BC structured as a neural network, and a sound verifier, which either certifies the candidate's validity or generates counter-examples.
The outcomes show that we can synthesise sound BCs up to two orders of magnitude faster, with, in particular, a stark speedup in the verification engine.
arXiv Detail & Related papers (2020-07-07T07:39:42Z)