Tiny Classifier Circuits: Evolving Accelerators for Tabular Data
- URL: http://arxiv.org/abs/2303.00031v2
- Date: Thu, 28 Sep 2023 12:57:35 GMT
- Title: Tiny Classifier Circuits: Evolving Accelerators for Tabular Data
- Authors: Konstantinos Iordanou, Timothy Atkinson, Emre Ozer, Jedrzej Kufel,
John Biggs, Gavin Brown and Mikel Lujan
- Abstract summary: This paper proposes a methodology for automatically generating predictor circuits for classification of tabular data, with prediction performance comparable to conventional machine learning techniques.
The resulting classifier circuits are so small (i.e., consisting of no more than 300 logic gates) that they are called "Tiny Classifier" circuits.
- Score: 0.8936201690845327
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A typical machine learning (ML) development cycle for edge computing is to
maximise the performance during model training and then minimise the
memory/area footprint of the trained model for deployment on edge devices
targeting CPUs, GPUs, microcontrollers, or custom hardware accelerators. This
paper proposes a methodology for automatically generating predictor circuits
for classification of tabular data with comparable prediction performance to
conventional ML techniques while using substantially fewer hardware resources
and power. The proposed methodology uses an evolutionary algorithm to search
over the space of logic gates and automatically generates a classifier circuit
with maximised training prediction accuracy. Classifier circuits are so tiny
(i.e., consisting of no more than 300 logic gates) that they are called "Tiny
Classifier" circuits, and can efficiently be implemented in ASIC or on an FPGA.
We empirically evaluate the automatic Tiny Classifier circuit generation
methodology or "Auto Tiny Classifiers" on a wide range of tabular datasets, and
compare it against conventional ML techniques such as Amazon's AutoGluon,
Google's TabNet and a neural search over Multi-Layer Perceptrons. Despite Tiny
Classifiers being constrained to a few hundred logic gates, we observe no
statistically significant difference in prediction performance in comparison to
the best-performing ML baseline. When synthesised as a silicon chip, Tiny
Classifiers use 8-18x less area and 4-8x less power. When implemented as an
ultra-low cost chip on a flexible substrate (i.e., FlexIC), they occupy 10-75x
less area and consume 13-75x less power compared to the most hardware-efficient
ML baseline. On an FPGA, Tiny Classifiers consume 3-11x fewer resources.
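The search described in the abstract can be pictured with a short sketch: a (1+lambda) evolutionary loop mutates a fixed-budget netlist of two-input gates over binarized features and keeps whichever variant scores best on the training set. The gate set, genome encoding, and hyperparameters below are illustrative assumptions, not the authors' exact search procedure.

```python
# Illustrative sketch of evolving a tiny logic-gate classifier with a (1+lambda)
# evolutionary strategy. The gate set, genome encoding, and hyperparameters are
# assumptions for demonstration; the paper's actual search may differ.
import random

GATES = {
    "AND":  lambda a, b: a & b,
    "OR":   lambda a, b: a | b,
    "XOR":  lambda a, b: a ^ b,
    "NAND": lambda a, b: 1 - (a & b),
}

def random_genome(n_inputs, n_gates):
    """Each gene: (gate_type, src_a, src_b); sources index inputs or earlier gates."""
    genome = []
    for g in range(n_gates):
        n_sources = n_inputs + g
        genome.append((random.choice(list(GATES)),
                       random.randrange(n_sources),
                       random.randrange(n_sources)))
    return genome

def evaluate(genome, bits):
    """Feed binarized features through the gate list; the last gate's output is the class."""
    wires = list(bits)
    for gate, a, b in genome:
        wires.append(GATES[gate](wires[a], wires[b]))
    return wires[-1]

def fitness(genome, X_bits, y):
    return sum(evaluate(genome, x) == t for x, t in zip(X_bits, y)) / len(y)

def mutate(genome, n_inputs, rate=0.05):
    child = []
    for g, (gate, a, b) in enumerate(genome):
        if random.random() < rate:
            n_sources = n_inputs + g
            gate = random.choice(list(GATES))
            a, b = random.randrange(n_sources), random.randrange(n_sources)
        child.append((gate, a, b))
    return child

def evolve(X_bits, y, n_gates=300, generations=2000, lam=4):
    """(1+lambda) search: keep the best of the parent and lam mutated children per generation."""
    n_inputs = len(X_bits[0])
    parent = random_genome(n_inputs, n_gates)
    best = fitness(parent, X_bits, y)
    for _ in range(generations):
        for _ in range(lam):
            child = mutate(parent, n_inputs)
            f = fitness(child, X_bits, y)
            if f >= best:   # accepting ties allows neutral drift through the search space
                parent, best = child, f
    return parent, best
```

Because the genome is just a feed-forward list of two-input gates, a trained classifier maps directly onto a gate-level netlist, which is what keeps the ASIC/FPGA implementation small.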
Related papers
- MATADOR: Automated System-on-Chip Tsetlin Machine Design Generation for Edge Applications [0.2663045001864042]
This paper presents MATADOR, an automated model-to-silicon tool with a GUI interface, capable of producing optimized Tsetlin machine accelerator designs for inference at the edge.
It offers automation of the full development pipeline: model training, system level design generation, design verification and deployment.
MATADOR accelerator designs are shown to be up to 13.4x faster, up to 7x more resource frugal and up to 2x more power efficient when compared to state-of-the-art Quantized and Binary Deep Neural Network implementations.
arXiv Detail & Related papers (2024-03-03T10:31:46Z) - Minimally Supervised Learning using Topological Projections in
Self-Organizing Maps [55.31182147885694]
We introduce a semi-supervised learning approach based on topological projections in self-organizing maps (SOMs).
Our proposed method first trains SOMs on unlabeled data and then assigns a minimal number of available labeled data points to key best matching units (BMUs).
Our results indicate that the proposed minimally supervised model significantly outperforms traditional regression techniques.
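As a rough illustration of the workflow summarised above, the sketch below trains a small SOM on unlabeled data, attaches the few available labels to their best matching units, and predicts for a new sample via the nearest labelled unit on the map grid. The grid size, decay schedules, and nearest-labelled-unit readout are assumptions for illustration, not the paper's exact procedure.

```python
# Minimal self-organizing map (SOM) plus BMU-based label assignment, as a sketch of
# minimally supervised learning with topological projections. Hyperparameters and the
# nearest-labelled-unit prediction rule are illustrative assumptions.
import numpy as np

def train_som(X, grid=(8, 8), iters=5000, lr0=0.5, sigma0=3.0, seed=0):
    rng = np.random.default_rng(seed)
    h, w = grid
    weights = rng.normal(size=(h, w, X.shape[1]))
    coords = np.stack(np.meshgrid(np.arange(h), np.arange(w), indexing="ij"), axis=-1)
    for t in range(iters):
        x = X[rng.integers(len(X))]
        d = np.linalg.norm(weights - x, axis=-1)
        bmu = np.unravel_index(np.argmin(d), d.shape)          # best matching unit
        lr = lr0 * np.exp(-t / iters)
        sigma = sigma0 * np.exp(-t / iters)
        dist2 = ((coords - np.array(bmu)) ** 2).sum(axis=-1)   # grid distance to the BMU
        nbh = np.exp(-dist2 / (2 * sigma ** 2))[..., None]     # neighbourhood function
        weights += lr * nbh * (x - weights)
    return weights

def label_bmus(weights, X_labelled, y_labelled):
    """Attach each labelled sample's target to its BMU on the trained map."""
    labels = {}
    for x, y in zip(X_labelled, y_labelled):
        d = np.linalg.norm(weights - x, axis=-1)
        labels[np.unravel_index(np.argmin(d), d.shape)] = y
    return labels

def predict(weights, bmu_labels, x):
    """Project a new sample onto the map, then read the label of the nearest labelled unit."""
    d = np.linalg.norm(weights - x, axis=-1)
    bmu = np.array(np.unravel_index(np.argmin(d), d.shape))
    units = np.array(list(bmu_labels))
    nearest = units[np.argmin(((units - bmu) ** 2).sum(axis=1))]
    return bmu_labels[tuple(nearest)]
```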
arXiv Detail & Related papers (2024-01-12T22:51:48Z) - Incremental Online Learning Algorithms Comparison for Gesture and Visual
Smart Sensors [68.8204255655161]
This paper compares four state-of-the-art algorithms in two real applications: gesture recognition based on accelerometer data and image classification.
Our results confirm these systems' reliability and the feasibility of deploying them in tiny-memory MCUs.
arXiv Detail & Related papers (2022-09-01T17:05:20Z) - T-RECX: Tiny-Resource Efficient Convolutional neural networks with
early-eXit [0.0]
We show how a baseline tiny-CNN can be enhanced by the addition of an early exit intermediate classifier.
Our technique is optimized specifically for tiny-CNN sized models.
Our results show that T-RECX 1) improves the accuracy of the baseline network, and 2) achieves a 31.58% average reduction in FLOPS in exchange for one percent accuracy, across all evaluated models.
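A minimal sketch of the early-exit idea, assuming a toy two-block CNN and a softmax-confidence exit rule; T-RECX's actual architecture and exit criterion may differ.

```python
# Toy CNN with an early-exit intermediate classifier: at inference time, if the early
# head is confident enough, the second block is skipped and its FLOPs are saved.
# Layer sizes and the confidence threshold are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyCNNWithEarlyExit(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.block1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.early_head = nn.Linear(16, num_classes)   # early-exit intermediate classifier
        self.block2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.final_head = nn.Linear(32, num_classes)   # final classifier

    def forward(self, x, exit_threshold=0.9):
        h = self.block1(x)
        early_logits = self.early_head(h.mean(dim=(2, 3)))      # global average pooling
        if not self.training:
            conf = F.softmax(early_logits, dim=1).max(dim=1).values
            if bool((conf > exit_threshold).all()):              # confident: exit early
                return early_logits
        final_logits = self.final_head(self.block2(h).mean(dim=(2, 3)))
        if self.training:                                        # train both heads jointly
            return early_logits, final_logits
        return final_logits
```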
arXiv Detail & Related papers (2022-07-14T02:05:43Z) - Pretraining Graph Neural Networks for few-shot Analog Circuit Modeling
and Design [68.1682448368636]
We present a supervised pretraining approach to learn circuit representations that can be adapted to new unseen topologies or unseen prediction tasks.
To cope with the variable topological structure of different circuits, we describe each circuit as a graph and use graph neural networks (GNNs) to learn node embeddings.
We show that pretraining GNNs on prediction of output node voltages can encourage learning representations that can be adapted to new unseen topologies or prediction of new circuit level properties.
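A minimal sketch of this pretraining setup, assuming a dense normalised adjacency matrix and a single node-voltage regression target; the paper's GNN architecture and task setup may differ.

```python
# Tiny message-passing network over a circuit graph, pretrained to regress per-node
# output voltages so the learned node embeddings can be reused on new topologies or
# new circuit-level prediction tasks. Sizes and normalisation are illustrative.
import torch
import torch.nn as nn

class TinyCircuitGNN(nn.Module):
    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hidden)
        self.lin2 = nn.Linear(hidden, hidden)
        self.voltage_head = nn.Linear(hidden, 1)            # pretraining target: node voltage

    def forward(self, node_feats, adj_norm):
        h = torch.relu(adj_norm @ self.lin1(node_feats))    # one round of neighbour aggregation
        h = torch.relu(adj_norm @ self.lin2(h))             # h: reusable node embeddings
        return self.voltage_head(h).squeeze(-1), h

def pretrain_step(model, optimiser, node_feats, adj_norm, target_voltages):
    """One regression step on simulated node voltages (assumed available as targets)."""
    pred, _ = model(node_feats, adj_norm)
    loss = nn.functional.mse_loss(pred, target_voltages)
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
    return loss.item()
```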
arXiv Detail & Related papers (2022-03-29T21:18:47Z) - BSC: Block-based Stochastic Computing to Enable Accurate and Efficient
TinyML [10.294484356351152]
Machine learning (ML) has been successfully applied to edge applications, such as smart phones and automated driving.
Today, more applications require ML on tiny devices with extremely limited resources, such as the implantable cardioverter defibrillator (ICD); this setting is known as TinyML.
Unlike ML on the edge, TinyML with a limited energy supply has higher demands on low-power execution.
arXiv Detail & Related papers (2021-11-12T12:28:05Z) - A TinyML Platform for On-Device Continual Learning with Quantized Latent
Replays [66.62377866022221]
Latent Replay-based Continual Learning (CL) techniques enable online, serverless adaptation in principle.
We introduce a HW/SW platform for end-to-end CL based on a 10-core FP32-enabled parallel ultra-low-power processor.
Our results show that by combining these techniques, continual learning can be achieved in practice using less than 64MB of memory.
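A rough sketch of the quantized latent replay mechanism referenced above, assuming 8-bit quantization of stored activations and a fixed-capacity buffer; the platform's actual memory layout and training kernels differ.

```python
# Quantized latent replay sketch: a frozen front of the network produces latent
# activations, old latents are kept 8-bit quantized in a small buffer, and only the
# layers after the replay point are trained on a mix of new and replayed latents.
# Bit-width, buffer policy, and split point are illustrative assumptions.
import torch
import torch.nn as nn

class LatentReplayBuffer:
    def __init__(self, capacity):
        self.capacity, self.items = capacity, []

    def add(self, latent, label):
        scale = latent.abs().max() / 127 + 1e-8
        q = torch.clamp((latent / scale).round(), -128, 127).to(torch.int8)  # 8-bit latent
        self.items.append((q, scale, label))
        self.items = self.items[-self.capacity:]         # keep only the newest entries

    def sample(self, n):
        idx = torch.randperm(len(self.items))[:n].tolist()
        latents = torch.stack([self.items[i][0].float() * self.items[i][1] for i in idx])
        labels = torch.stack([self.items[i][2] for i in idx])
        return latents, labels

def adaptation_step(frozen_front, trainable_back, optimiser, buffer, x_new, y_new):
    with torch.no_grad():
        latent_new = frozen_front(x_new)                 # latents for the new mini-batch
    latent_old, y_old = buffer.sample(len(x_new))        # dequantized replayed latents
    logits = trainable_back(torch.cat([latent_new, latent_old]))
    loss = nn.functional.cross_entropy(logits, torch.cat([y_new, y_old]))
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
    return loss.item()
```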
arXiv Detail & Related papers (2021-10-20T11:01:23Z) - Generalized Learning Vector Quantization for Classification in
Randomized Neural Networks and Hyperdimensional Computing [4.4886210896619945]
We propose a modified RVFL network that avoids computationally expensive matrix operations during training.
The proposed approach achieved state-of-the-art accuracy on a collection of datasets from the UCI Machine Learning Repository.
arXiv Detail & Related papers (2021-06-17T21:17:17Z) - VersaGNN: a Versatile accelerator for Graph neural networks [81.1667080640009]
We propose VersaGNN, an ultra-efficient, systolic-array-based versatile hardware accelerator.
VersaGNN achieves on average a 3712x speedup with 1301.25x energy reduction over CPU, and a 35.4x speedup with 17.66x energy reduction over GPU.
arXiv Detail & Related papers (2021-05-04T04:10:48Z) - Predictive Coding Approximates Backprop along Arbitrary Computation
Graphs [68.8204255655161]
We develop a strategy to translate core machine learning architectures into their predictive coding equivalents.
Our models perform equivalently to backprop on challenging machine learning benchmarks.
Our method raises the potential that standard machine learning algorithms could in principle be directly implemented in neural circuitry.
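A hedged sketch of the predictive coding training step this summary alludes to: clamp the input and target, relax the hidden activities to minimise local prediction errors, then apply purely local weight updates. Step sizes, iteration counts, and the tanh nonlinearity are illustrative assumptions.

```python
# One predictive-coding training step for a small MLP (list of weight matrices).
# Inference relaxes hidden activities; learning uses only local error signals,
# which approximates the backprop update for the same network.
import numpy as np

def f(x):  return np.tanh(x)
def fp(x): return 1.0 - np.tanh(x) ** 2

def pc_step(weights, x_in, y_target, infer_iters=50, gamma=0.1, lr=0.01):
    L = len(weights)
    x = [x_in]
    for W in weights:                                   # initialise activities feed-forward
        x.append(W @ f(x[-1]))
    x[L] = y_target                                     # clamp the output layer to the target
    for _ in range(infer_iters):                        # inference: relax hidden activities
        eps = [x[l] - weights[l - 1] @ f(x[l - 1]) for l in range(1, L + 1)]
        for l in range(1, L):                           # input and output stay clamped
            x[l] += gamma * (-eps[l - 1] + fp(x[l]) * (weights[l].T @ eps[l]))
    eps = [x[l] - weights[l - 1] @ f(x[l - 1]) for l in range(1, L + 1)]
    for l in range(L):                                  # learning: local, Hebbian-like updates
        weights[l] += lr * np.outer(eps[l], f(x[l]))
    return weights
```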
arXiv Detail & Related papers (2020-06-07T15:35:47Z)