TinyM$^2$Net-V3: Memory-Aware Compressed Multimodal Deep Neural Networks for Sustainable Edge Deployment
- URL: http://arxiv.org/abs/2405.12353v1
- Date: Mon, 20 May 2024 20:03:51 GMT
- Title: TinyM$^2$Net-V3: Memory-Aware Compressed Multimodal Deep Neural Networks for Sustainable Edge Deployment
- Authors: Hasib-Al Rashid, Tinoosh Mohsenin
- Abstract summary: This work introduces TinyM$^2$Net-V3, a system that processes different modalities of complementary data, designs deep neural network (DNN) models, and employs model compression techniques.
Our tiny machine learning models, deployed on resource limited hardware, demonstrated low latencies within milliseconds and very high power efficiency.
- Score: 0.5893124686141782
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The advancement of sophisticated artificial intelligence (AI) algorithms has led to a notable increase in energy usage and carbon dioxide emissions, intensifying concerns about climate change. This growing problem has brought the environmental sustainability of AI technologies to the forefront, especially as they expand across various sectors. In response to these challenges, there is an urgent need for the development of sustainable AI solutions. These solutions must focus on energy-efficient embedded systems that are capable of handling diverse data types even in environments with limited resources, thereby ensuring both technological progress and environmental responsibility. Integrating complementary multimodal data into tiny machine learning models for edge devices is challenging due to increased complexity, latency, and power consumption. This work introduces TinyM$^2$Net-V3, a system that processes different modalities of complementary data, designs deep neural network (DNN) models, and employs model compression techniques including knowledge distillation and low bit-width quantization with memory-aware considerations to fit models within lower memory hierarchy levels, reducing latency and enhancing energy efficiency on resource-constrained devices. We evaluated TinyM$^2$Net-V3 in two multimodal case studies: COVID-19 detection using cough, speech, and breathing audios, and pose classification from depth and thermal images. With tiny inference models (6 KB and 58 KB), we achieved 92.95% and 90.7% accuracies, respectively. Our tiny machine learning models, deployed on resource limited hardware, demonstrated low latencies within milliseconds and very high power efficiency.
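The abstract's idea of choosing a bit-width so that a model fits within a lower level of the memory hierarchy can be illustrated with simple arithmetic. The sketch below is not the authors' method: the parameter count, the memory levels, and their capacities are hypothetical values chosen only for illustration, and it counts weight storage alone (no activations or runtime overhead).

```python
def model_size_bytes(num_params: int, bits_per_weight: int) -> int:
    """Storage needed for the weights alone, rounded up to whole bytes."""
    return (num_params * bits_per_weight + 7) // 8


def smallest_fitting_level(size_bytes, levels):
    """Return the name of the first (fastest) level whose capacity holds the model."""
    for name, capacity in levels:
        if size_bytes <= capacity:
            return name
    return None  # the model does not fit anywhere


# Hypothetical memory hierarchy, ordered fastest to slowest (capacities in bytes).
LEVELS = [
    ("L1 cache", 32 * 1024),
    ("L2 cache", 256 * 1024),
    ("DRAM", 512 * 1024 * 1024),
]

# Hypothetical model with 100,000 weights.
NUM_PARAMS = 100_000

for bits in (32, 8, 4, 2):
    size = model_size_bytes(NUM_PARAMS, bits)
    level = smallest_fitting_level(size, LEVELS)
    print(f"{bits:>2}-bit weights: {size:>7} bytes -> {level}")
```

Under these assumed capacities, lowering the bit-width moves the same model from DRAM (32-bit) to the L2 cache (8-bit and 4-bit) and then to the L1 cache (2-bit), which is the kind of effect the abstract associates with reduced latency and energy use.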
Related papers
- SCAR: Scheduling Multi-Model AI Workloads on Heterogeneous Multi-Chiplet Module Accelerators [12.416683044819955]
Multi-model workloads with heavy models like recent large language models significantly increased the compute and memory demands on hardware.
To address such increasing demands, designing a scalable hardware architecture became a key problem.
We develop a set of schedulers to navigate the huge scheduling space and codify them into a scheduler with advanced techniques such as inter-chiplet pipelining.
arXiv Detail & Related papers (2024-05-01T18:02:25Z) - Random resistive memory-based deep extreme point learning machine for unified visual processing [67.51600474104171]
We propose a novel hardware-software co-design, the random resistive memory-based deep extreme point learning machine (DEPLM).
Our co-design system achieves huge energy efficiency improvements and training cost reduction when compared to conventional systems.
arXiv Detail & Related papers (2023-12-14T09:46:16Z) - Green Edge AI: A Contemporary Survey [49.47249665895926]
We present a contemporary survey on green edge AI.
Despite its potential, edge AI faces substantial challenges, mostly due to the dichotomy between the resource limitations of wireless edge networks and the resource-intensive nature of deep learning (DL).
We explore energy-efficient design methodologies for the three critical tasks in edge AI systems, including training data acquisition, edge training, and edge inference.
arXiv Detail & Related papers (2023-12-01T04:04:37Z) - Power Hungry Processing: Watts Driving the Cost of AI Deployment? [74.19749699665216]
Generative, multi-purpose AI systems promise a unified approach to building machine learning (ML) models into technology.
This ambition of "generality" comes at a steep cost to the environment, given the amount of energy these systems require and the amount of carbon that they emit.
We measure deployment cost as the amount of energy and carbon required to perform 1,000 inferences on a representative benchmark dataset using these models.
We conclude with a discussion around the current trend of deploying multi-purpose generative ML systems, and caution that their utility should be more intentionally weighed against increased costs in terms of energy and emissions.
arXiv Detail & Related papers (2023-11-28T15:09:36Z) - EVE: Environmental Adaptive Neural Network Models for Low-power Energy Harvesting System [8.16411986220709]
Energy harvesting technology, which draws energy from the ambient environment, is a promising alternative to batteries for powering those devices.
This paper proposes EVE, an automated machine learning framework to search for desired multi-models with shared weights for energy harvesting IoT devices.
Experimental results show that the neural network models generated by EVE are on average 2.5X faster than the baseline models without pruning and shared weights.
arXiv Detail & Related papers (2022-07-14T20:53:46Z) - Energy-efficient Deployment of Deep Learning Applications on Cortex-M based Microcontrollers using Deep Compression [1.4050836886292872]
This paper investigates the efficient deployment of deep learning models on resource-constrained microcontrollers.
We present a methodology for the systematic exploration of different DNN pruning, quantization, and deployment strategies.
We show that we can compress them to below 10% of their original parameter count before their predictive quality decreases.
arXiv Detail & Related papers (2022-05-20T10:55:42Z) - Pervasive Machine Learning for Smart Radio Environments Enabled by Reconfigurable Intelligent Surfaces [56.35676570414731]
The emerging technology of Reconfigurable Intelligent Surfaces (RISs) is provisioned as an enabler of smart wireless environments.
RISs offer a highly scalable, low-cost, hardware-efficient, and almost energy-neutral solution for dynamic control of the propagation of electromagnetic signals over the wireless medium.
One of the major challenges with the envisioned dense deployment of RISs in such reconfigurable radio environments is the efficient configuration of multiple metasurfaces.
arXiv Detail & Related papers (2022-05-08T06:21:33Z) - YONO: Modeling Multiple Heterogeneous Neural Networks on Microcontrollers [10.420617367363047]
YONO is a product quantization (PQ) based approach that compresses multiple heterogeneous models and enables in-memory model execution and switching.
YONO shows remarkable performance, as it can compress multiple heterogeneous models by up to 12.37$\times$ with negligible or no loss of accuracy.
arXiv Detail & Related papers (2022-03-08T01:24:36Z) - FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using around 40% of the available hardware resources in total.
It reduces the classification time by three orders of magnitude, with a small 4.5% impact on accuracy, compared to its full-precision software counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z) - Prune2Edge: A Multi-Phase Pruning Pipelines to Deep Ensemble Learning in IIoT [0.0]
We propose novel edge-based multi-phase pruning pipelines for ensemble learning on IIoT devices.
In the first phase, we generate a diverse ensemble of pruned models; we then apply integer quantisation; finally, we prune the generated ensemble using a clustering-based technique.
Our proposed approach was able to outperform the predictability levels of a baseline model.
arXiv Detail & Related papers (2020-04-09T17:44:34Z) - Risk-Aware Energy Scheduling for Edge Computing with Microgrid: A Multi-Agent Deep Reinforcement Learning Approach [82.6692222294594]
We study a risk-aware energy scheduling problem for a microgrid-powered MEC network.
We derive the solution by applying a multi-agent deep reinforcement learning (MADRL)-based asynchronous advantage actor-critic (A3C) algorithm with shared neural networks.
arXiv Detail & Related papers (2020-02-21T02:14:38Z)