Related papers: Differentiable Weightless Neural Networks

Differentiable Weightless Neural Networks

URL: http://arxiv.org/abs/2410.11112v2
Date: Thu, 21 Nov 2024 18:00:19 GMT
Title: Differentiable Weightless Neural Networks
Authors: Alan T. L. Bacellar, Zachary Susskind, Mauricio Breternitz Jr., Eugene John, Lizy K. John, Priscila M. V. Lima, Felipe M. G. França,
Abstract summary: We introduce the Differentiable Weightless Neural Network (DWN), a model based on interconnected lookup tables. We propose Learnable Mapping, Learnable Reduction, and Spectral Regularization to further improve the accuracy and efficiency of these models.
Score: 1.453874055504475
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We introduce the Differentiable Weightless Neural Network (DWN), a model based on interconnected lookup tables. Training of DWNs is enabled by a novel Extended Finite Difference technique for approximate differentiation of binary values. We propose Learnable Mapping, Learnable Reduction, and Spectral Regularization to further improve the accuracy and efficiency of these models. We evaluate DWNs in three edge computing contexts: (1) an FPGA-based hardware accelerator, where they demonstrate superior latency, throughput, energy efficiency, and model area compared to state-of-the-art solutions, (2) a low-power microcontroller, where they achieve preferable accuracy to XGBoost while subject to stringent memory constraints, and (3) ultra-low-cost chips, where they consistently outperform small models in both accuracy and projected hardware area. DWNs also compare favorably against leading approaches for tabular datasets, with higher average rank. Overall, our work positions DWNs as a pioneering solution for edge-compatible high-throughput neural networks.

Related papers

Fast State-Augmented Learning for Wireless Resource Allocation with Dual Variable Regression [83.27791109672927]
We show how a state-augmented graph neural network (GNN) parametrization for the resource allocation policy circumvents the drawbacks of the ubiquitous dual subgradient methods.<n>Lagrangian maximizing state-augmented policies are learned during the offline training phase.<n>We prove a convergence result and an exponential probability bound on the excursions of the dual function (iterate) optimality gaps.
arXiv Detail & Related papers (2025-06-23T15:20:58Z)
Model-driven deep neural network for enhanced direction finding with commodity 5G gNodeB [30.94668439883861]
Current wireless networks heavily rely on pure model-driven techniques to achieve positioning functionality. Here we reformulate the direction finding or angle-of-arrival (AoA) estimation problem as an image recovery task of the spatial spectrum. We show that the proposed MoD-DNN framework enables effective spectrum calibration and accurate AoA estimation.
arXiv Detail & Related papers (2024-12-14T02:09:36Z)
Inference-to-complete: A High-performance and Programmable Data-plane Co-processor for Neural-network-driven Traffic Analysis [18.75879653408466]
NN-driven intelligent data-plane (NN-driven IDP) is becoming an emerging topic for excellent accuracy and high performance. Kaleidoscope is a flexible and high-performance co-processor located at the bypass of the data-plane. Kaleidoscope reaches 256-352 ns inference latency and 100 Gbps throughput with negligible influence on the data-plane.
arXiv Detail & Related papers (2024-11-01T07:10:08Z)
Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge. Existing methods struggle to balance high model performance with low resource consumption. We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression [55.992528247880685]
Decentralized training faces significant challenges regarding system design and efficiency. We present FusionLLM, a decentralized training system designed and implemented for training large deep neural networks (DNNs) We show that our system and method can achieve 1.45 - 9.39x speedup compared to baseline methods while ensuring convergence.
arXiv Detail & Related papers (2024-10-16T16:13:19Z)
Degree-Conscious Spiking Graph for Cross-Domain Adaptation [51.58506501415558]
Spiking Graph Networks (SGNs) have demonstrated significant potential in graph classification.<n>We introduce a novel framework named Degree-Consicious Spiking Graph for Cross-Domain Adaptation (DeSGraDA)<n>DeSGraDA enhances generalization across domains with three key components.
arXiv Detail & Related papers (2024-10-09T13:45:54Z)
DiTMoS: Delving into Diverse Tiny-Model Selection on Microcontrollers [34.282971510732736]
We introduce DiTMoS, a novel DNN training and inference framework with a selector-classifiers architecture. A composition of weak models can exhibit high diversity and the union of them can significantly boost the accuracy upper bound. We deploy DiTMoS on the Neucleo STM32F767ZI board and evaluate it based on three time-series datasets for human activity recognition, keywords spotting, and emotion recognition.
arXiv Detail & Related papers (2024-03-14T02:11:38Z)
Precision-aware Latency and Energy Balancing on Multi-Accelerator Platforms for DNN Inference [22.9834921448069]
We propose ODiMO, a hardware-aware tool that performs a fine-grain mapping across different accelerators on-chip. We show that ODiMO reduces energy/latency by up to 33%/31% with limited accuracy drop (-0.53%/-0.32%) compared to manual mappings.
arXiv Detail & Related papers (2023-06-08T09:23:46Z)
Benchmarking Test-Time Unsupervised Deep Neural Network Adaptation on Edge Devices [19.335535517714703]
The prediction accuracy of the deep neural networks (DNNs) after deployment at the edge can suffer with time due to shifts in the distribution of the new data. Recent prediction-time unsupervised DNN adaptation techniques have been introduced that improve prediction accuracy of the models for noisy data by re-tuning the batch normalization parameters. This paper, for the first time, performs a comprehensive measurement study of such techniques to quantify their performance and energy on various edge devices.
arXiv Detail & Related papers (2022-03-21T19:10:40Z)
An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimension parameter model and large-scale mathematical calculation restrict execution efficiency, especially for Internet of Things (IoT) devices. We propose a new Deep Reinforcement Learning (DRL)-Soft Actor Critic for discrete (SAC-d), which generates the emphexit point, emphexit point, and emphcompressing bits by soft policy iterations. Based on the latency and accuracy aware reward design, such an computation can well adapt to the complex environment like dynamic wireless channel and arbitrary processing, and is capable of supporting the 5G URL
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
Load-balanced Gather-scatter Patterns for Sparse Deep Neural Networks [20.374784902476318]
Pruning, as a method to introduce zeros to model weights, has shown to be an effective method to provide good trade-offs between model accuracy and computation efficiency. Some modern processors are equipped with fast on-chip scratchpad memories and gather/scatter engines that perform indirect load and store operations on such memories. In this work, we propose a set of novel sparse patterns, named gather-scatter (GS) patterns, to utilize the scratchpad memories and gather/scatter engines to speed up neural network inferences.
arXiv Detail & Related papers (2021-12-20T22:55:45Z)
PC-RGNN: Point Cloud Completion and Graph Neural Network for 3D Object Detection [57.49788100647103]
LiDAR-based 3D object detection is an important task for autonomous driving. Current approaches suffer from sparse and partial point clouds of distant and occluded objects. In this paper, we propose a novel two-stage approach, namely PC-RGNN, dealing with such challenges by two specific solutions.
arXiv Detail & Related papers (2020-12-18T18:06:43Z)
Binarized Graph Neural Network [65.20589262811677]
We develop a binarized graph neural network to learn the binary representations of the nodes with binary network parameters. Our proposed method can be seamlessly integrated into the existing GNN-based embedding approaches. Experiments indicate that the proposed binarized graph neural network, namely BGN, is orders of magnitude more efficient in terms of both time and space.
arXiv Detail & Related papers (2020-04-19T09:43:14Z)
An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices [58.62801151916888]
We introduce a new sparsity dimension, namely pattern-based sparsity that comprises pattern and connectivity sparsity, and becoming both highly accurate and hardware friendly. Our approach on the new pattern-based sparsity naturally fits into compiler optimization for highly efficient DNN execution on mobile platforms.
arXiv Detail & Related papers (2020-01-20T16:17:36Z)

This list is automatically generated from the titles and abstracts of the papers in this site.