Enabling Vibration-Based Gesture Recognition on Everyday Furniture via Energy-Efficient FPGA Implementation of 1D Convolutional Networks
- URL: http://arxiv.org/abs/2510.23156v1
- Date: Mon, 27 Oct 2025 09:30:36 GMT
- Title: Enabling Vibration-Based Gesture Recognition on Everyday Furniture via Energy-Efficient FPGA Implementation of 1D Convolutional Networks
- Authors: Koki Shibata, Tianheng Ling, Chao Qian, Tomokazu Matsui, Hirohiko Suwa, Keiichi Yasumoto, Gregor Schiele,
- Abstract summary: This study proposes an energy-efficient solution deploying compact NNs on low-power Field-Programmable Gate Arrays (FPGAs)<n>We replace complex spectral preprocessing with raw waveform input, eliminating complex on-board preprocessing while reducing input size by 21x without sacrificing accuracy.<n>We design two lightweight architectures (1D-CNN and 1D-SepCNN) tailored for embedded FPGAs, reducing parameters from 369 million to as few as 216 while maintaining comparable accuracy.
- Score: 11.481972015296812
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The growing demand for smart home interfaces has increased interest in non-intrusive sensing methods like vibration-based gesture recognition. While prior studies demonstrated feasibility, they often rely on complex preprocessing and large Neural Networks (NNs) requiring costly high-performance hardware, resulting in high energy usage and limited real-world deployability. This study proposes an energy-efficient solution deploying compact NNs on low-power Field-Programmable Gate Arrays (FPGAs) to enable real-time gesture recognition with competitive accuracy. We adopt a series of optimizations: (1) We replace complex spectral preprocessing with raw waveform input, eliminating complex on-board preprocessing while reducing input size by 21x without sacrificing accuracy. (2) We design two lightweight architectures (1D-CNN and 1D-SepCNN) tailored for embedded FPGAs, reducing parameters from 369 million to as few as 216 while maintaining comparable accuracy. (3) With integer-only quantization and automated RTL generation, we achieve seamless FPGA deployment. A ping-pong buffering mechanism in 1D-SepCNN further improves deployability under tight memory constraints. (4) We extend a hardware-aware search framework to support constraint-driven model configuration selection, considering accuracy, deployability, latency, and energy consumption. Evaluated on two swipe-direction datasets with multiple users and ordinary tables, our approach achieves low-latency, energy-efficient inference on the AMD Spartan-7 XC7S25 FPGA. Under the PS data splitting setting, the selected 6-bit 1D-CNN reaches 0.970 average accuracy across users with 9.22 ms latency. The chosen 8-bit 1D-SepCNN further reduces latency to 6.83 ms (over 53x CPU speedup) with slightly lower accuracy (0.949). Both consume under 1.2 mJ per inference, demonstrating suitability for long-term edge operation.
Related papers
- Hardware-accelerated graph neural networks: an alternative approach for neuromorphic event-based audio classification and keyword spotting on SoC FPGA [12.261656409413753]
FPGA implementation of event-graph neural networks for audio processing.<n>System achieves up to 95% word-end detection accuracy, with only 10.53 microsecond latency and 1.18 W power consumption.
arXiv Detail & Related papers (2026-02-18T13:26:22Z) - Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models [97.55009021098554]
This work aims to identify the key determinants of SLMs' real-device latency and offer generalizable principles and methodologies for SLM design and training.<n>We introduce a new family of hybrid SLMs, called Nemotron-Flash, which significantly advances the accuracy-efficiency frontier of state-of-the-art SLMs.
arXiv Detail & Related papers (2025-11-24T08:46:36Z) - JEDI-linear: Fast and Efficient Graph Neural Networks for Jet Tagging on FPGAs [36.158374493924455]
Graph Neural Networks (GNNs) have shown exceptional performance for jet tagging at the CERN High Luminosity Large Hadron Collider (HLLHC)<n>We propose JEDI-linear, a novel GNN architecture with linear computational complexity that eliminates explicit pairwise interactions.<n>This is the first interaction-based GNN to achieve less than 60ns latency and currently meets the requirements for use in the HL-LHC CMS Level-1 trigger system.
arXiv Detail & Related papers (2025-08-21T11:40:49Z) - Efficient FPGA-accelerated Convolutional Neural Networks for Cloud Detection on CubeSats [0.5420492913071214]
We present the implementation of four FPGA-accelerated convolutional neural network (CNN) models for onboard cloud detection in resource-constrained CubeSat missions.<n>This study explores both pixel-wise (Pixel-Net and Patch-Net) and image-wise (U-Net and Scene-Net) models to benchmark trade-offs in accuracy, latency, and model complexity.<n>All models retained high accuracy post-FPGA integration, with a cumulative maximum accuracy drop of only 0.6% after quantization and pruning.
arXiv Detail & Related papers (2025-04-04T19:32:47Z) - Hardware-Software Co-optimised Fast and Accurate Deep Reconfigurable Spiking Inference Accelerator Architecture Design Methodology [2.968768532937366]
Spiking Neural Networks (SNNs) have emerged as a promising approach to improve the energy efficiency of machine learning models.
We develop a hardware-software co-optimisation strategy to port software-trained deep neural networks (DNN) to reduced-precision spiking models.
arXiv Detail & Related papers (2024-10-07T05:04:13Z) - Quasar-ViT: Hardware-Oriented Quantization-Aware Architecture Search for Vision Transformers [56.37495946212932]
Vision transformers (ViTs) have demonstrated their superior accuracy for computer vision tasks compared to convolutional neural networks (CNNs)
This work proposes Quasar-ViT, a hardware-oriented quantization-aware architecture search framework for ViTs.
arXiv Detail & Related papers (2024-07-25T16:35:46Z) - LL-GNN: Low Latency Graph Neural Networks on FPGAs for High Energy
Physics [45.666822327616046]
This work presents a novel reconfigurable architecture for Low Graph Neural Network (LL-GNN) designs for particle detectors.
The LL-GNN design advances the next generation of trigger systems by enabling sophisticated algorithms to process experimental data efficiently.
arXiv Detail & Related papers (2022-09-28T12:55:35Z) - FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using in total around the 40% of the available hardware resources.
It reduces the classification time by three orders of magnitude, with a small 4.5% impact on the accuracy, if compared to its software, full precision counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z) - FastFlowNet: A Lightweight Network for Fast Optical Flow Estimation [81.76975488010213]
Dense optical flow estimation plays a key role in many robotic vision tasks.
Current networks often occupy large number of parameters and require heavy computation costs.
Our proposed FastFlowNet works in the well-known coarse-to-fine manner with following innovations.
arXiv Detail & Related papers (2021-03-08T03:09:37Z) - EdgeBERT: Sentence-Level Energy Optimizations for Latency-Aware
Multi-Task NLP Inference [82.1584439276834]
Transformer-based language models such as BERT provide significant accuracy improvement for a multitude of natural language processing (NLP) tasks.
We present EdgeBERT, an in-depth algorithm- hardware co-design for latency-aware energy optimization for multi-task NLP.
arXiv Detail & Related papers (2020-11-28T19:21:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.