Tiny Classifier Circuits: Evolving Accelerators for Tabular Data
- URL: http://arxiv.org/abs/2303.00031v2
- Date: Thu, 28 Sep 2023 12:57:35 GMT
- Title: Tiny Classifier Circuits: Evolving Accelerators for Tabular Data
- Authors: Konstantinos Iordanou, Timothy Atkinson, Emre Ozer, Jedrzej Kufel,
John Biggs, Gavin Brown and Mikel Lujan
- Abstract summary: This paper proposes a methodology for automatically generating predictor circuits for classification of tabular data, with prediction performance comparable to conventional machine learning techniques.
The resulting classifier circuits are so small (i.e., consisting of no more than 300 logic gates) that they are called "Tiny Classifier" circuits.
- Score: 0.8936201690845327
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A typical machine learning (ML) development cycle for edge computing is to
maximise the performance during model training and then minimise the
memory/area footprint of the trained model for deployment on edge devices
targeting CPUs, GPUs, microcontrollers, or custom hardware accelerators. This
paper proposes a methodology for automatically generating predictor circuits
for classification of tabular data with comparable prediction performance to
conventional ML techniques while using substantially fewer hardware resources
and power. The proposed methodology uses an evolutionary algorithm to search
over the space of logic gates and automatically generates a classifier circuit
with maximised training prediction accuracy. Classifier circuits are so tiny
(i.e., consisting of no more than 300 logic gates) that they are called "Tiny
Classifier" circuits, and can efficiently be implemented in ASIC or on an FPGA.
We empirically evaluate the automatic Tiny Classifier circuit generation
methodology or "Auto Tiny Classifiers" on a wide range of tabular datasets, and
compare it against conventional ML techniques such as Amazon's AutoGluon,
Google's TabNet and a neural search over Multi-Layer Perceptrons. Despite Tiny
Classifiers being constrained to a few hundred logic gates, we observe no
statistically significant difference in prediction performance in comparison to
the best-performing ML baseline. When synthesised as a silicon chip, Tiny
Classifiers use 8-18x less area and 4-8x less power. When implemented as an
ultra-low cost chip on a flexible substrate (i.e., FlexIC), they occupy 10-75x
less area and consume 13-75x less power compared to the most hardware-efficient
ML baseline. On an FPGA, Tiny Classifiers consume 3-11x fewer resources.
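The search described in the abstract can be pictured with a short sketch: a (1+lambda) evolutionary loop mutates a fixed-budget netlist of two-input gates over binarized features and keeps whichever variant scores best on the training set. The gate set, genome encoding, and hyperparameters below are illustrative assumptions, not the authors' exact search procedure.

```python
# Illustrative sketch of evolving a tiny logic-gate classifier with a (1+lambda)
# evolutionary strategy. The gate set, genome encoding, and hyperparameters are
# assumptions for demonstration; the paper's actual search may differ.
import random

GATES = {
    "AND":  lambda a, b: a & b,
    "OR":   lambda a, b: a | b,
    "XOR":  lambda a, b: a ^ b,
    "NAND": lambda a, b: 1 - (a & b),
}

def random_genome(n_inputs, n_gates):
    """Each gene: (gate_type, src_a, src_b); sources index inputs or earlier gates."""
    genome = []
    for g in range(n_gates):
        n_sources = n_inputs + g
        genome.append((random.choice(list(GATES)),
                       random.randrange(n_sources),
                       random.randrange(n_sources)))
    return genome

def evaluate(genome, bits):
    """Feed binarized features through the gate list; the last gate's output is the class."""
    wires = list(bits)
    for gate, a, b in genome:
        wires.append(GATES[gate](wires[a], wires[b]))
    return wires[-1]

def fitness(genome, X_bits, y):
    return sum(evaluate(genome, x) == t for x, t in zip(X_bits, y)) / len(y)

def mutate(genome, n_inputs, rate=0.05):
    child = []
    for g, (gate, a, b) in enumerate(genome):
        if random.random() < rate:
            n_sources = n_inputs + g
            gate = random.choice(list(GATES))
            a, b = random.randrange(n_sources), random.randrange(n_sources)
        child.append((gate, a, b))
    return child

def evolve(X_bits, y, n_gates=300, generations=2000, lam=4):
    """(1+lambda) search: keep the best of the parent and lam mutated children per generation."""
    n_inputs = len(X_bits[0])
    parent = random_genome(n_inputs, n_gates)
    best = fitness(parent, X_bits, y)
    for _ in range(generations):
        for _ in range(lam):
            child = mutate(parent, n_inputs)
            f = fitness(child, X_bits, y)
            if f >= best:   # accepting ties allows neutral drift through the search space
                parent, best = child, f
    return parent, best
```

Because the genome is just a feed-forward list of two-input gates, a trained classifier maps directly onto a gate-level netlist, which is what keeps the ASIC/FPGA implementation small.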
Related papers
- MATADOR: Automated System-on-Chip Tsetlin Machine Design Generation for Edge Applications [0.2663045001864042]
This paper presents MATADOR, an automated model-to-silicon tool with a GUI interface, capable of producing optimized Tsetlin machine accelerator designs for inference at the edge.
It offers automation of the full development pipeline: model training, system level design generation, design verification and deployment.
MATADOR accelerator designs are shown to be up to 13.4x faster, up to 7x more resource frugal and up to 2x more power efficient when compared to state-of-the-art Quantized and Binary Deep Neural Network implementations.
arXiv Detail & Related papers (2024-03-03T10:31:46Z) - Minimally Supervised Learning using Topological Projections in
Self-Organizing Maps [55.31182147885694]
We introduce a semi-supervised learning approach based on topological projections in self-organizing maps (SOMs).
Our proposed method first trains SOMs on unlabeled data and then assigns a minimal number of available labeled data points to key best matching units (BMUs).
Our results indicate that the proposed minimally supervised model significantly outperforms traditional regression techniques.
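As a rough illustration of the workflow summarised above, the sketch below trains a small SOM on unlabeled data, attaches the few available labels to their best matching units, and predicts for a new sample via the nearest labelled unit on the map grid. The grid size, decay schedules, and nearest-labelled-unit readout are assumptions for illustration, not the paper's exact procedure.

```python
# Minimal self-organizing map (SOM) plus BMU-based label assignment, as a sketch of
# minimally supervised learning with topological projections. Hyperparameters and the
# nearest-labelled-unit prediction rule are illustrative assumptions.
import numpy as np

def train_som(X, grid=(8, 8), iters=5000, lr0=0.5, sigma0=3.0, seed=0):
    rng = np.random.default_rng(seed)
    h, w = grid
    weights = rng.normal(size=(h, w, X.shape[1]))
    coords = np.stack(np.meshgrid(np.arange(h), np.arange(w), indexing="ij"), axis=-1)
    for t in range(iters):
        x = X[rng.integers(len(X))]
        d = np.linalg.norm(weights - x, axis=-1)
        bmu = np.unravel_index(np.argmin(d), d.shape)          # best matching unit
        lr = lr0 * np.exp(-t / iters)
        sigma = sigma0 * np.exp(-t / iters)
        dist2 = ((coords - np.array(bmu)) ** 2).sum(axis=-1)   # grid distance to the BMU
        nbh = np.exp(-dist2 / (2 * sigma ** 2))[..., None]     # neighbourhood function
        weights += lr * nbh * (x - weights)
    return weights

def label_bmus(weights, X_labelled, y_labelled):
    """Attach each labelled sample's target to its BMU on the trained map."""
    labels = {}
    for x, y in zip(X_labelled, y_labelled):
        d = np.linalg.norm(weights - x, axis=-1)
        labels[np.unravel_index(np.argmin(d), d.shape)] = y
    return labels

def predict(weights, bmu_labels, x):
    """Project a new sample onto the map, then read the label of the nearest labelled unit."""
    d = np.linalg.norm(weights - x, axis=-1)
    bmu = np.array(np.unravel_index(np.argmin(d), d.shape))
    units = np.array(list(bmu_labels))
    nearest = units[np.argmin(((units - bmu) ** 2).sum(axis=1))]
    return bmu_labels[tuple(nearest)]
```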
arXiv Detail & Related papers (2024-01-12T22:51:48Z) - Incremental Online Learning Algorithms Comparison for Gesture and Visual
Smart Sensors [68.8204255655161]
This paper compares four state-of-the-art algorithms in two real applications: gesture recognition based on accelerometer data and image classification.
Our results confirm these systems' reliability and the feasibility of deploying them in tiny-memory MCUs.
arXiv Detail & Related papers (2022-09-01T17:05:20Z) - T-RECX: Tiny-Resource Efficient Convolutional neural networks with
early-eXit [0.0]
We show how a baseline tiny-CNN can be enhanced by the addition of an early exit intermediate classifier.
Our technique is optimized specifically for tiny-CNN sized models.
Our results show that T-RECX 1) improves the accuracy of the baseline network, and 2) achieves a 31.58% average reduction in FLOPS in exchange for one percent accuracy, across all evaluated models.
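A minimal sketch of the early-exit idea, assuming a toy two-block CNN and a softmax-confidence exit rule; T-RECX's actual architecture and exit criterion may differ.

```python
# Toy CNN with an early-exit intermediate classifier: at inference time, if the early
# head is confident enough, the second block is skipped and its FLOPs are saved.
# Layer sizes and the confidence threshold are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyCNNWithEarlyExit(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.block1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.early_head = nn.Linear(16, num_classes)   # early-exit intermediate classifier
        self.block2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.final_head = nn.Linear(32, num_classes)   # final classifier

    def forward(self, x, exit_threshold=0.9):
        h = self.block1(x)
        early_logits = self.early_head(h.mean(dim=(2, 3)))      # global average pooling
        if not self.training:
            conf = F.softmax(early_logits, dim=1).max(dim=1).values
            if bool((conf > exit_threshold).all()):              # confident: exit early
                return early_logits
        final_logits = self.final_head(self.block2(h).mean(dim=(2, 3)))
        if self.training:                                        # train both heads jointly
            return early_logits, final_logits
        return final_logits
```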
arXiv Detail & Related papers (2022-07-14T02:05:43Z) - Pretraining Graph Neural Networks for few-shot Analog Circuit Modeling
and Design [68.1682448368636]
We present a supervised pretraining approach to learn circuit representations that can be adapted to new unseen topologies or unseen prediction tasks.
To cope with the variable topological structure of different circuits, we describe each circuit as a graph and use graph neural networks (GNNs) to learn node embeddings.
We show that pretraining GNNs on prediction of output node voltages can encourage learning representations that can be adapted to new unseen topologies or prediction of new circuit level properties.
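A minimal sketch of this pretraining setup, assuming a dense normalised adjacency matrix and a single node-voltage regression target; the paper's GNN architecture and task setup may differ.

```python
# Tiny message-passing network over a circuit graph, pretrained to regress per-node
# output voltages so the learned node embeddings can be reused on new topologies or
# new circuit-level prediction tasks. Sizes and normalisation are illustrative.
import torch
import torch.nn as nn

class TinyCircuitGNN(nn.Module):
    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hidden)
        self.lin2 = nn.Linear(hidden, hidden)
        self.voltage_head = nn.Linear(hidden, 1)            # pretraining target: node voltage

    def forward(self, node_feats, adj_norm):
        h = torch.relu(adj_norm @ self.lin1(node_feats))    # one round of neighbour aggregation
        h = torch.relu(adj_norm @ self.lin2(h))             # h: reusable node embeddings
        return self.voltage_head(h).squeeze(-1), h

def pretrain_step(model, optimiser, node_feats, adj_norm, target_voltages):
    """One regression step on simulated node voltages (assumed available as targets)."""
    pred, _ = model(node_feats, adj_norm)
    loss = nn.functional.mse_loss(pred, target_voltages)
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
    return loss.item()
```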
arXiv Detail & Related papers (2022-03-29T21:18:47Z) - BSC: Block-based Stochastic Computing to Enable Accurate and Efficient
TinyML [10.294484356351152]
Machine learning (ML) has been successfully applied to edge applications, such as smart phones and automated driving.
Today, more applications require ML on tiny devices with extremely limited resources, such as the implantable cardioverter defibrillator (ICD); this setting is known as TinyML.
Unlike ML on the edge, TinyML with a limited energy supply has higher demands on low-power execution.
arXiv Detail & Related papers (2021-11-12T12:28:05Z) - A TinyML Platform for On-Device Continual Learning with Quantized Latent
Replays [66.62377866022221]
Latent Replay-based Continual Learning (CL) techniques enable online, serverless adaptation in principle.
We introduce a HW/SW platform for end-to-end CL based on a 10-core FP32-enabled parallel ultra-low-power processor.
Our results show that by combining these techniques, continual learning can be achieved in practice using less than 64MB of memory.
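A rough sketch of the quantized latent replay mechanism referenced above, assuming 8-bit quantization of stored activations and a fixed-capacity buffer; the platform's actual memory layout and training kernels differ.

```python
# Quantized latent replay sketch: a frozen front of the network produces latent
# activations, old latents are kept 8-bit quantized in a small buffer, and only the
# layers after the replay point are trained on a mix of new and replayed latents.
# Bit-width, buffer policy, and split point are illustrative assumptions.
import torch
import torch.nn as nn

class LatentReplayBuffer:
    def __init__(self, capacity):
        self.capacity, self.items = capacity, []

    def add(self, latent, label):
        scale = latent.abs().max() / 127 + 1e-8
        q = torch.clamp((latent / scale).round(), -128, 127).to(torch.int8)  # 8-bit latent
        self.items.append((q, scale, label))
        self.items = self.items[-self.capacity:]         # keep only the newest entries

    def sample(self, n):
        idx = torch.randperm(len(self.items))[:n].tolist()
        latents = torch.stack([self.items[i][0].float() * self.items[i][1] for i in idx])
        labels = torch.stack([self.items[i][2] for i in idx])
        return latents, labels

def adaptation_step(frozen_front, trainable_back, optimiser, buffer, x_new, y_new):
    with torch.no_grad():
        latent_new = frozen_front(x_new)                 # latents for the new mini-batch
    latent_old, y_old = buffer.sample(len(x_new))        # dequantized replayed latents
    logits = trainable_back(torch.cat([latent_new, latent_old]))
    loss = nn.functional.cross_entropy(logits, torch.cat([y_new, y_old]))
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
    return loss.item()
```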
arXiv Detail & Related papers (2021-10-20T11:01:23Z) - Generalized Learning Vector Quantization for Classification in
Randomized Neural Networks and Hyperdimensional Computing [4.4886210896619945]
We propose a modified RVFL network that avoids computationally expensive matrix operations during training.
The proposed approach achieved state-of-the-art accuracy on a collection of datasets from the UCI Machine Learning Repository.
arXiv Detail & Related papers (2021-06-17T21:17:17Z) - VersaGNN: a Versatile accelerator for Graph neural networks [81.1667080640009]
We propose VersaGNN, an ultra-efficient, systolic-array-based versatile hardware accelerator.
VersaGNN achieves on average a 3712x speedup with 1301.25x energy reduction over CPU, and a 35.4x speedup with 17.66x energy reduction over GPU.
arXiv Detail & Related papers (2021-05-04T04:10:48Z) - Predictive Coding Approximates Backprop along Arbitrary Computation
Graphs [68.8204255655161]
We develop a strategy to translate core machine learning architectures into their predictive coding equivalents.
Our models perform equivalently to backprop on challenging machine learning benchmarks.
Our method raises the potential that standard machine learning algorithms could in principle be directly implemented in neural circuitry.
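A hedged sketch of the predictive coding training step this summary alludes to: clamp the input and target, relax the hidden activities to minimise local prediction errors, then apply purely local weight updates. Step sizes, iteration counts, and the tanh nonlinearity are illustrative assumptions.

```python
# One predictive-coding training step for a small MLP (list of weight matrices).
# Inference relaxes hidden activities; learning uses only local error signals,
# which approximates the backprop update for the same network.
import numpy as np

def f(x):  return np.tanh(x)
def fp(x): return 1.0 - np.tanh(x) ** 2

def pc_step(weights, x_in, y_target, infer_iters=50, gamma=0.1, lr=0.01):
    L = len(weights)
    x = [x_in]
    for W in weights:                                   # initialise activities feed-forward
        x.append(W @ f(x[-1]))
    x[L] = y_target                                     # clamp the output layer to the target
    for _ in range(infer_iters):                        # inference: relax hidden activities
        eps = [x[l] - weights[l - 1] @ f(x[l - 1]) for l in range(1, L + 1)]
        for l in range(1, L):                           # input and output stay clamped
            x[l] += gamma * (-eps[l - 1] + fp(x[l]) * (weights[l].T @ eps[l]))
    eps = [x[l] - weights[l - 1] @ f(x[l - 1]) for l in range(1, L + 1)]
    for l in range(L):                                  # learning: local, Hebbian-like updates
        weights[l] += lr * np.outer(eps[l], f(x[l]))
    return weights
```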
arXiv Detail & Related papers (2020-06-07T15:35:47Z)