MicroBi-ConvLSTM: An Ultra-Lightweight Efficient Model for Human Activity Recognition on Resource Constrained Devices
- URL: http://arxiv.org/abs/2602.06523v1
- Date: Fri, 06 Feb 2026 09:26:29 GMT
- Title: MicroBi-ConvLSTM: An Ultra-Lightweight Efficient Model for Human Activity Recognition on Resource Constrained Devices
- Authors: Mridankan Mandal
- Abstract summary: Human Activity Recognition (HAR) on resource-constrained wearables requires models that balance accuracy against strict memory and computational budgets. We present MicroBi-ConvLSTM, an ultra-lightweight convolutional-recurrent architecture achieving 11.4K parameters on average.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Human Activity Recognition (HAR) on resource-constrained wearables requires models that balance accuracy against strict memory and computational budgets. State-of-the-art lightweight architectures such as TinierHAR (34K parameters) and TinyHAR (55K parameters) achieve strong accuracy, but exceed the memory budgets of microcontrollers with limited SRAM once operating-system overhead is considered. We present MicroBi-ConvLSTM, an ultra-lightweight convolutional-recurrent architecture achieving 11.4K parameters on average through two-stage convolutional feature extraction with 4x temporal pooling and a single bidirectional LSTM layer. This represents a 2.9x parameter reduction versus TinierHAR and 11.9x versus DeepConvLSTM while preserving linear O(N) complexity. Evaluation across eight diverse HAR benchmarks shows that MicroBi-ConvLSTM maintains competitive performance within the ultra-lightweight regime: 93.41% macro F1 on UCI-HAR, 94.46% on SKODA assembly gestures, and 88.98% on Daphnet gait-freeze detection. Systematic ablation reveals task-dependent component contributions: bidirectionality benefits episodic event detection but provides marginal gains on periodic locomotion. INT8 post-training quantization incurs only 0.21% average F1-score degradation, yielding a 23.0 KB average deployment footprint suitable for memory-constrained edge devices.
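As a rough illustration of how such a parameter budget decomposes, the sketch below counts parameters for a hypothetical two-conv-stage + BiLSTM configuration. The channel widths, kernel sizes, hidden size, and class count here are assumptions for illustration only, not values taken from the paper:

```python
def conv1d_params(c_in, c_out, k):
    # weights (c_in * c_out * k) plus one bias per output channel
    return c_in * c_out * k + c_out

def lstm_params(d_in, hidden, bidirectional=True):
    # 4 gates, each with input, recurrent, and two bias terms per direction
    # (the two-bias-vector convention follows common framework layouts)
    per_direction = 4 * (d_in * hidden + hidden * hidden + 2 * hidden)
    return per_direction * (2 if bidirectional else 1)

def linear_params(d_in, d_out):
    return d_in * d_out + d_out

# Hypothetical config: 9 sensor channels, two conv stages, 4x temporal
# pooling (parameter-free), one BiLSTM layer, and a 6-class head.
total = (
    conv1d_params(9, 16, 5)       # stage 1:  736
    + conv1d_params(16, 24, 5)    # stage 2: 1944
    + lstm_params(24, 16)         # BiLSTM:  5376
    + linear_params(2 * 16, 6)    # head:     198
)
print(total)  # 8254 for this illustrative config
```

The actual widths in the paper will differ, but the shape of the budget is the point: the conv stages and the single BiLSTM dominate, and INT8 quantization then shrinks each parameter to one byte, with the rest of the reported 23.0 KB footprint going to activations and runtime buffers.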
Related papers
- BabyMamba-HAR: Lightweight Selective State Space Models for Efficient Human Activity Recognition on Resource Constrained Devices [0.0]
Human activity recognition (HAR) on wearable and mobile devices is constrained by memory footprint and computational budget. Selective state space models (SSMs) offer linear-time sequence processing with input-dependent gating. BabyMamba-HAR is introduced, a framework comprising two novel lightweight Mamba-inspired architectures for resource-constrained HAR.
arXiv Detail & Related papers (2026-02-10T15:16:32Z)
- Real-Time Human Activity Recognition on Edge Microcontrollers: Dynamic Hierarchical Inference with Multi-Spectral Sensor Fusion [7.184610830886172]
We propose a resource-aware hierarchical network based on multi-spectral fusion and interpretable modules. Deployed on an ARM Cortex-M4 microcontroller for low-power real-time inference, HPPI-Net achieves 96.70% accuracy. Compared with MobileNetV3, HPPI-Net improves accuracy by 1.22% and reduces RAM usage by 71.2% and ROM usage by 42.1%.
arXiv Detail & Related papers (2026-01-29T15:21:45Z)
- DS-CIM: Digital Stochastic Computing-In-Memory Featuring Accurate OR-Accumulation via Sample Region Remapping for Edge AI Models [8.92683306412944]
This paper introduces a digital stochastic computing-in-memory (DS-CIM) architecture that achieves both high accuracy and efficiency. We implement multiply-accumulation (MAC) in a compact, unsigned OR-based circuit by modifying the data representation. Our core strategy, a shared pseudo-random number generator (PRNG) with 2D sample region remapping, enables single-cycle mutually exclusive activation to eliminate OR-gate collisions.
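The collision problem this summary alludes to can be shown with a toy simulation. This is a generic stochastic-computing sketch, not the paper's circuit: OR-ing unipolar bitstreams undercounts whenever two streams fire in the same clock cycle, but if the streams are made mutually exclusive in time, an OR gate becomes an exact adder.

```python
# Toy unipolar stochastic-computing accumulation over N = 12 clock cycles.
N = 12

# Three values of 3/12 each, encoded as bitstreams firing in disjoint
# time slots (mutually exclusive activation).
exclusive = [
    [1 if t % 4 == 0 else 0 for t in range(N)],
    [1 if t % 4 == 1 else 0 for t in range(N)],
    [1 if t % 4 == 2 else 0 for t in range(N)],
]

# The same three values, but all firing in the same slots (colliding).
colliding = [[1 if t % 4 == 0 else 0 for t in range(N)] for _ in range(3)]

def or_accumulate(streams):
    # A hardware OR gate only sees whether *any* stream fired this cycle.
    return sum(1 for t in range(N) if any(s[t] for s in streams)) / N

print(or_accumulate(exclusive))  # 0.75 -- exact sum of 3 * (3/12)
print(or_accumulate(colliding))  # 0.25 -- collisions lose two thirds of the count
```

The exclusive encoding recovers the true sum exactly; the colliding one silently saturates, which is why collision-free activation scheduling matters for OR-based accumulation.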
arXiv Detail & Related papers (2026-01-10T23:56:33Z)
- Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning [73.10669391954801]
We present the Ring-linear model series, specifically Ring-mini-linear-2.0 and Ring-flash-linear-2.0. Both models adopt a hybrid architecture that effectively integrates linear attention and softmax attention. Compared to a 32-billion-parameter dense model, this series reduces inference cost to 1/10, and compared to the original Ring series, the cost is reduced by over 50%.
arXiv Detail & Related papers (2025-10-22T07:59:38Z)
- AMS-QUANT: Adaptive Mantissa Sharing for Floating-point Quantization [7.413057271242686]
Quantization, particularly floating-point quantization, is known to speed up large language model (LLM) inference. We propose AMS-Quant, which extends floating-point quantization from integer bit-widths to non-integer bit-widths. We show that AMS-Quant can quantize models to FP-5.33-e2m3 and FP4.25-e2m2 and significantly speed up decoding over FP16 inference.
arXiv Detail & Related papers (2025-10-16T15:37:23Z)
- EfficientLLM: Efficiency in Large Language Models [64.3537131208038]
Large language models (LLMs) have driven significant progress, yet their growing parameter counts and context windows incur prohibitive compute, energy, and monetary costs. We introduce EfficientLLM, a novel benchmark and the first comprehensive empirical study evaluating efficiency techniques for LLMs at scale.
arXiv Detail & Related papers (2025-05-20T02:27:08Z)
- RadMamba: Efficient Human Activity Recognition through Radar-based Micro-Doppler-Oriented Mamba State-Space Model [4.200994756075655]
This paper introduces RadMamba, a parameter-efficient, radar micro-Doppler-oriented Mamba SSM specifically tailored for radar-based HAR. Across three diverse datasets, RadMamba matches the top-performing previous model's 99.8% classification accuracy.
arXiv Detail & Related papers (2025-04-16T12:54:11Z)
- DA-Flow: Dual Attention Normalizing Flow for Skeleton-based Video Anomaly Detection [52.74152717667157]
We propose a lightweight module called the Dual Attention Module (DAM) for capturing cross-dimension interaction relationships in spatio-temporal skeletal data.
It employs a frame attention mechanism to identify the most significant frames and a skeleton attention mechanism to capture broader relationships across fixed partitions with minimal parameters and FLOPs.
arXiv Detail & Related papers (2024-06-05T06:18:03Z)
- HARMamba: Efficient and Lightweight Wearable Sensor Human Activity Recognition Based on Bidirectional Mamba [7.412537185607976]
Wearable sensor-based human activity recognition (HAR) is a critical research domain in activity perception.
This study introduces HARMamba, an innovative lightweight and versatile HAR architecture that combines a selective bidirectional state space model with hardware-aware design.
HARMamba outperforms contemporary state-of-the-art frameworks, delivering comparable or better accuracy while significantly reducing computational and memory demands.
arXiv Detail & Related papers (2024-03-29T13:57:46Z)
- DB-LLM: Accurate Dual-Binarization for Efficient LLMs [83.70686728471547]
Large language models (LLMs) have significantly advanced the field of natural language processing.
Existing ultra-low-bit quantization always causes severe accuracy drops.
We propose a novel Dual-Binarization method for LLMs, namely DB-LLM.
arXiv Detail & Related papers (2024-02-19T09:04:30Z)
- Quantized Neural Networks for Low-Precision Accumulation with Guaranteed Overflow Avoidance [68.8204255655161]
We introduce a quantization-aware training algorithm that guarantees numerical overflow is avoided when the precision of accumulators is reduced during inference.
We evaluate our algorithm across multiple quantized models that we train for different tasks, showing that our approach can reduce the precision of accumulators while maintaining model accuracy with respect to a floating-point baseline.
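A back-of-the-envelope version of the overflow constraint can be sketched as follows. This is a standard worst-case bound, not necessarily the paper's exact criterion: the accumulator must hold the largest dot product that quantized inputs and weights could produce.

```python
import math

def min_accumulator_bits(input_bits, weight_bits, dot_length):
    # Worst-case magnitude of one signed product is 2^(b_in-1) * 2^(b_w-1);
    # summing `dot_length` of them bounds the accumulator's magnitude.
    worst_sum = dot_length * 2 ** (input_bits - 1) * 2 ** (weight_bits - 1)
    # Bits to represent that magnitude, plus one sign bit.
    return math.ceil(math.log2(worst_sum)) + 1

# An int8 x int8 dot product of length 512 needs a 24-bit accumulator in
# the worst case: a 32-bit accumulator is safe, but a 16-bit one could
# overflow unless training constrains the weights, as this paper does.
print(min_accumulator_bits(8, 8, 512))  # 24
```

Shrinking the accumulator below this worst-case width is exactly what makes a training-time guarantee necessary: the weights must be constrained so that no input can reach the bound.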
arXiv Detail & Related papers (2023-01-31T02:46:57Z)
- Federated Learning for Energy-limited Wireless Networks: A Partial Model Aggregation Approach [79.59560136273917]
Limited communication resources (bandwidth and energy) and data heterogeneity across devices are the main bottlenecks for federated learning (FL).
We first devise a novel FL framework with partial model aggregation (PMA).
The proposed PMA-FL improves accuracy by 2.72% and 11.6% on two typical heterogeneous datasets.
arXiv Detail & Related papers (2022-04-20T19:09:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.