Related papers: Realtime Facial Expression Recognition: Neuromorphic Hardware vs. Edge AI Accelerators

URL: http://arxiv.org/abs/2403.08792v1
Date: Tue, 30 Jan 2024 16:12:20 GMT
Title: Realtime Facial Expression Recognition: Neuromorphic Hardware vs. Edge AI Accelerators
Authors: Heath Smith, James Seekings, Mohammadreza Mohammadi, Ramtin Zand
Abstract summary: The paper focuses on real-time facial expression recognition (FER) systems as an important component in various real-world applications such as social robotics. We investigate two hardware options for the deployment of FER machine learning (ML) models at the edge: neuromorphic hardware versus edge AI accelerators.
Score: 0.5492530316344587
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The paper focuses on real-time facial expression recognition (FER) systems as an important component in various real-world applications such as social robotics. We investigate two hardware options for the deployment of FER machine learning (ML) models at the edge: neuromorphic hardware versus edge AI accelerators. Our study includes exhaustive experiments providing comparative analyses between the Intel Loihi neuromorphic processor and four distinct edge platforms: Raspberry Pi-4, Intel Neural Compute Stick (NCS), Jetson Nano, and Coral TPU. The results show that Loihi can achieve approximately two orders of magnitude reduction in power dissipation and one order of magnitude energy savings compared to Coral TPU, which is the least power-intensive and energy-consuming of the edge AI accelerators. These reductions in power and energy are achieved while the neuromorphic solution maintains a level of accuracy comparable to the edge accelerators, all within the real-time latency requirements.
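The abstract reports a larger reduction in power (about two orders of magnitude) than in energy (about one order of magnitude). A minimal sketch of what that gap implies, assuming energy per inference is simply average power times latency; the function name and the round figures 100 and 10 are illustrative assumptions read off the abstract's "approximately", not measurements from the paper:

```python
def implied_latency_ratio(power_reduction: float, energy_reduction: float) -> float:
    """Return how many times longer one inference takes on the lower-power device.

    Assumes energy = average power x latency on both devices, so
    latency_low / latency_ref = power_reduction / energy_reduction.
    """
    if power_reduction <= 0 or energy_reduction <= 0:
        raise ValueError("reduction factors must be positive")
    return power_reduction / energy_reduction


# Illustrative round numbers: ~100x lower power, ~10x lower energy.
ratio = implied_latency_ratio(power_reduction=100.0, energy_reduction=10.0)
print(f"Implied latency ratio: about {ratio:.0f}x")
```

Under these assumptions, the lower-power device would take roughly ten times longer per inference, which is consistent with the abstract's remark that the results still fall within real-time latency requirements.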

Related papers

Embedded FPGA Acceleration of Brain-Like Neural Networks: Online Learning to Scalable Inference [0.0]
We present the first embedded FPGA accelerator for BCPNN on a Zynq UltraScale+ system using High-Level Synthesis. Our accelerator achieves up to 17.5x latency and 94% energy savings over ARM baselines, without sacrificing accuracy. This work enables practical neuromorphic computing on edge devices, bridging the gap between brain-like learning and real-world deployment.
arXiv Detail & Related papers (2025-06-23T11:35:20Z)
Energy-Efficient Spiking Recurrent Neural Network for Gesture Recognition on Embedded GPUs [1.37621344207686]
This research explores the deployment of a spiking recurrent neural network (SRNN) with liquid time constant neurons for gesture recognition. We focus on the energy efficiency and computational efficacy of NVIDIA Jetson Nano embedded GPU platforms.
arXiv Detail & Related papers (2024-08-23T10:50:29Z)
Single Neuromorphic Memristor closely Emulates Multiple Synaptic Mechanisms for Energy Efficient Neural Networks [71.79257685917058]
We demonstrate memristive nano-devices based on SrTiO3 that inherently emulate all these synaptic functions. These memristors operate in a non-filamentary, low conductance regime, which enables stable and energy efficient operation.
arXiv Detail & Related papers (2024-02-26T15:01:54Z)
Using the Abstract Computer Architecture Description Language to Model AI Hardware Accelerators [77.89070422157178]
Manufacturers of AI-integrated products face a critical challenge: selecting an accelerator that aligns with their product's performance requirements. The Abstract Computer Architecture Description Language (ACADL) is a concise formalization of computer architecture block diagrams. In this paper, we demonstrate how to use the ACADL to model AI hardware accelerators, use their ACADL description to map DNNs onto them, and explain the timing simulation semantics to gather performance results.
arXiv Detail & Related papers (2024-01-30T19:27:16Z)
Random resistive memory-based deep extreme point learning machine for unified visual processing [67.51600474104171]
We propose a novel hardware-software co-design, random resistive memory-based deep extreme point learning machine (DEPLM). Our co-design system achieves huge energy efficiency improvements and training cost reduction when compared to conventional systems.
arXiv Detail & Related papers (2023-12-14T09:46:16Z)
Neural Network Methods for Radiation Detectors and Imaging [1.6395318070400589]
Recent advances in machine learning and especially deep neural networks (DNNs) allow for new optimization and performance-enhancement schemes for radiation detectors and imaging hardware. We give an overview of data generation at photon sources, deep learning-based methods for image processing tasks, and hardware solutions for deep learning acceleration.
arXiv Detail & Related papers (2023-11-09T20:21:51Z)
Facial Expression Recognition at the Edge: CPU vs GPU vs VPU vs TPU [0.0]
We present a hierarchical framework for developing and optimizing hardware-aware CNNs tuned for deployment at the edge. We achieved a peak accuracy of 99.49% when testing on the CK+ facial expression recognition dataset.
arXiv Detail & Related papers (2023-05-17T03:19:06Z)
ETLP: Event-based Three-factor Local Plasticity for online learning with neuromorphic hardware [105.54048699217668]
We show competitive accuracy with a clear advantage in computational complexity for Event-Based Three-factor Local Plasticity (ETLP). We also show that when using local plasticity, threshold adaptation in spiking neurons and a recurrent topology are necessary to learn temporal patterns with a rich temporal structure.
arXiv Detail & Related papers (2023-01-19T19:45:42Z)
Energy-Efficient Deployment of Machine Learning Workloads on Neuromorphic Hardware [0.11744028458220425]
Several edge deep learning hardware accelerators have been released that specifically focus on reducing the power and area consumed by deep neural networks (DNNs). Spiking neural networks (SNNs), which operate on discrete time-series data, have been shown to achieve substantial power reductions when deployed on specialized neuromorphic event-based/asynchronous hardware. In this work, we provide a general guide to converting pre-trained DNNs into SNNs while also presenting techniques to improve the deployment of converted SNNs on neuromorphic hardware.
arXiv Detail & Related papers (2022-10-10T20:27:19Z)
Braille Letter Reading: A Benchmark for Spatio-Temporal Pattern Recognition on Neuromorphic Hardware [50.380319968947035]
Recent deep learning approaches have reached accuracy in such tasks, but their implementation on conventional embedded solutions is still very computationally and energy expensive. We propose a new benchmark for computing tactile pattern recognition at the edge through letter reading. We trained and compared feed-forward and recurrent spiking neural networks (SNNs) offline using back-propagation through time with surrogate gradients, then we deployed them on the Intel Loihi neuromorphic chip for efficient inference. Our results show that the LSTM outperforms the recurrent SNN in terms of accuracy by 14%. However, the recurrent SNN on Loihi is 237 times more energy
arXiv Detail & Related papers (2022-05-30T14:30:45Z)
FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task. The design targets a Xilinx Artix-7 FPGA, using around 40% of the available hardware resources in total. It reduces the classification time by three orders of magnitude, with a small 4.5% impact on accuracy, compared to its full-precision software counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)

This list is automatically generated from the titles and abstracts of the papers on this site.