Related papers: Dopant Network Processing Units: Towards Efficient Neural-network Emulators with High-capacity Nanoelectronic Nodes

Dopant Network Processing Units: Towards Efficient Neural-network Emulators with High-capacity Nanoelectronic Nodes

URL: http://arxiv.org/abs/2007.12371v2
Date: Tue, 3 Aug 2021 10:03:11 GMT
Title: Dopant Network Processing Units: Towards Efficient Neural-network Emulators with High-capacity Nanoelectronic Nodes
Authors: Hans-Christian Ruiz-Euler, Unai Alegre-Ibarra, Bram van de Ven, Hajo Broersma, Peter A. Bobbert, Wilfred G. van der Wiel
Abstract summary: "Dopant Network Processing Units" (DNPUs) are highly energy-efficient and have potentially very high throughput. We introduce DNPUs as high-capacity neurons and move from a single to a multi-neuron framework. We show that feed-forward DNPU networks improve the performance of a single DNPU from 77% to 94% test accuracy.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The rapidly growing computational demands of deep neural networks require novel hardware designs. Recently, tunable nanoelectronic devices were developed based on hopping electrons through a network of dopant atoms in silicon. These "Dopant Network Processing Units" (DNPUs) are highly energy-efficient and have potentially very high throughput. By adapting the control voltages applied to its terminals, a single DNPU can solve a variety of linearly non-separable classification problems. However, using a single device has limitations due to the implicit single-node architecture. This paper presents a promising novel approach to neural information processing by introducing DNPUs as high-capacity neurons and moving from a single to a multi-neuron framework. By implementing and testing a small multi-DNPU classifier in hardware, we show that feed-forward DNPU networks improve the performance of a single DNPU from 77% to 94% test accuracy on a binary classification task with concentric classes on a plane. Furthermore, motivated by the integration of DNPUs with memristor arrays, we study the potential of using DNPUs in combination with linear layers. We show by simulation that a single-layer MNIST classifier with only 10 DNPUs achieves over 96% test accuracy. Our results pave the road towards hardware neural-network emulators that offer atomic-scale information processing with low latency and energy consumption.

Related papers

CMOS Implementation of Field Programmable Spiking Neural Network for Hardware Reservoir Computing [0.0]
Large-scale neural networks, such as Deep Neural Networks (DNNs) and Large Language Models (LLMs), challenge their practical deployment in edge applications due to high power consumption, area requirements, and privacy concerns.<n>This work presents a novel CMOS-implemented field-programmable neural network architecture for hardware reservoir computing.
arXiv Detail & Related papers (2025-09-22T05:19:46Z)
NNTile: a machine learning framework capable of training extremely large GPT language models on a single node [83.9328245724548]
NNTile is based on a StarPU library, which implements task-based parallelism and schedules all provided tasks onto all available processing units. It means that a particular operation, necessary to train a large neural network, can be performed on any of the CPU cores or GPU devices.
arXiv Detail & Related papers (2025-04-17T16:22:32Z)
Neuromorphic Wireless Split Computing with Multi-Level Spikes [69.73249913506042]
In neuromorphic computing, spiking neural networks (SNNs) perform inference tasks, offering significant efficiency gains for workloads involving sequential data. Recent advances in hardware and software have demonstrated that embedding a few bits of payload in each spike exchanged between the spiking neurons can further enhance inference accuracy. This paper investigates a wireless neuromorphic split computing architecture employing multi-level SNNs.
arXiv Detail & Related papers (2024-11-07T14:08:35Z)
NeuraLUT: Hiding Neural Network Density in Boolean Synthesizable Functions [2.7086888205833968]
Field-Programmable Gate Array (FPGA) accelerators have proven successful in handling latency- and resource-critical deep neural network (DNN) inference tasks. We propose relaxing the boundaries of neurons and mapping entire sub-networks to a single LUT. We validate our proposed method on a known latency-critical task, jet substructure tagging, and on the classical computer vision task, digit classification using MNIST.
arXiv Detail & Related papers (2024-02-29T16:10:21Z)
Quantization-aware Neural Architectural Search for Intrusion Detection [5.010685611319813]
We present a design methodology that automatically trains and evolves quantized neural network (NN) models that are a thousand times smaller than state-of-the-art NNs. The number of LUTs utilized by this network when deployed to an FPGA is between 2.3x and 8.5x smaller with performance comparable to prior work.
arXiv Detail & Related papers (2023-11-07T18:35:29Z)
Low Precision Quantization-aware Training in Spiking Neural Networks with Differentiable Quantization Function [0.5046831208137847]
This work aims to bridge the gap between recent progress in quantized neural networks and spiking neural networks. It presents an extensive study on the performance of the quantization function, represented as a linear combination of sigmoid functions. The presented quantization function demonstrates the state-of-the-art performance on four popular benchmarks.
arXiv Detail & Related papers (2023-05-30T09:42:05Z)
Vecchia Gaussian Process Ensembles on Internal Representations of Deep Neural Networks [2.186901738997927]
For regression tasks, standard Gaussian processes (GPs) and deep neural networks (DNNs) provide natural uncertainty quantification (UQ) We propose an alternative solution, the deep Vecchia ensemble (DVE), which allows deterministic UQ to work in the presence of feature collapse. DVE is compatible with pretrained networks and incurs low computational overhead.
arXiv Detail & Related papers (2023-05-26T16:19:26Z)
Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency. We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
Training Spiking Neural Networks with Local Tandem Learning [96.32026780517097]
Spiking neural networks (SNNs) are shown to be more biologically plausible and energy efficient than their predecessors. In this paper, we put forward a generalized learning rule, termed Local Tandem Learning (LTL) We demonstrate rapid network convergence within five training epochs on the CIFAR-10 dataset while having low computational complexity.
arXiv Detail & Related papers (2022-10-10T10:05:00Z)
Signal Detection in MIMO Systems with Hardware Imperfections: Message Passing on Neural Networks [101.59367762974371]
In this paper, we investigate signal detection in multiple-input-multiple-output (MIMO) communication systems with hardware impairments. It is difficult to train a deep neural network (DNN) with limited pilot signals, hindering its practical applications. We design an efficient message passing based Bayesian signal detector, leveraging the unitary approximate message passing (UAMP) algorithm.
arXiv Detail & Related papers (2022-10-08T04:32:58Z)
Fluid Batching: Exit-Aware Preemptive Serving of Early-Exit Neural Networks on Edge NPUs [74.83613252825754]
"smart ecosystems" are being formed where sensing happens concurrently rather than standalone. This is shifting the on-device inference paradigm towards deploying neural processing units (NPUs) at the edge. We propose a novel early-exit scheduling that allows preemption at run time to account for the dynamicity introduced by the arrival and exiting processes.
arXiv Detail & Related papers (2022-09-27T15:04:01Z)
Extrapolation and Spectral Bias of Neural Nets with Hadamard Product: a Polynomial Net Study [55.12108376616355]
The study on NTK has been devoted to typical neural network architectures, but is incomplete for neural networks with Hadamard products (NNs-Hp) In this work, we derive the finite-width-K formulation for a special class of NNs-Hp, i.e., neural networks. We prove their equivalence to the kernel regression predictor with the associated NTK, which expands the application scope of NTK.
arXiv Detail & Related papers (2022-09-16T06:36:06Z)
LogicNets: Co-Designed Neural Networks and Circuits for Extreme-Throughput Applications [6.9276012494882835]
We present a novel method for designing neural network topologies that directly map to a highly efficient FPGA implementation. We show that the combination of sparsity and low-bit activation quantization results in high-speed circuits with small logic depth and low LUT cost.
arXiv Detail & Related papers (2020-04-06T22:15:41Z)

This list is automatically generated from the titles and abstracts of the papers in this site.