Related papers: Person Detection Using an Ultra Low-resolution Thermal Imager on a Low-cost MCU

URL: http://arxiv.org/abs/2212.08415v1
Date: Fri, 16 Dec 2022 11:27:50 GMT
Title: Person Detection Using an Ultra Low-resolution Thermal Imager on a Low-cost MCU
Authors: Maarten Vandersteegen, Wouter Reusen, Kristof Van Beeck, Toon Goedemé
Abstract summary: We propose a novel ultra-lightweight CNN-based person detector that processes thermal video from a low-cost 32x24 pixel static imager. Our model achieves up to 91.62% accuracy (F1-score), has less than 10k parameters, and runs as fast as 87ms and 46ms on low-cost microcontrollers.
Score: 5.684789481921423
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Detecting persons in images or video with neural networks is a well-studied subject in literature. However, such works usually assume the availability of a camera of decent resolution and a high-performance processor or GPU to run the detection algorithm, which significantly increases the cost of a complete detection system. However, many applications require low-cost solutions, composed of cheap sensors and simple microcontrollers. In this paper, we demonstrate that even on such hardware we are not condemned to simple classic image processing techniques. We propose a novel ultra-lightweight CNN-based person detector that processes thermal video from a low-cost 32x24 pixel static imager. Trained and compressed on our own recorded dataset, our model achieves up to 91.62% accuracy (F1-score), has less than 10k parameters, and runs as fast as 87ms and 46ms on low-cost microcontrollers STM32F407 and STM32F746, respectively.
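To put the abstract's figures in perspective, the following is a minimal sketch, not the architecture from the paper. It counts the parameters of a hypothetical three-layer convolutional stack with a 1x1 output head (the layer widths are assumptions chosen only for illustration) and converts the reported per-frame latencies of 87 ms and 46 ms into approximate frame rates.

```python
# Illustrative only: the layer shapes below are hypothetical and are NOT
# taken from the paper. They show how a CNN can stay under 10k parameters.

def conv2d_params(in_ch: int, out_ch: int, k: int) -> int:
    """Weights plus biases of a standard k x k 2D convolution."""
    return in_ch * out_ch * k * k + out_ch


def tiny_cnn_params() -> int:
    """Total parameters of an assumed 3-layer conv stack with a 1x1 head."""
    # (input channels, output channels, kernel size); 1 input channel
    # corresponds to a single-channel thermal frame.
    layers = [(1, 8, 3), (8, 16, 3), (16, 32, 3)]
    total = sum(conv2d_params(i, o, k) for i, o, k in layers)
    total += conv2d_params(32, 1, 1)  # assumed 1x1 convolution output head
    return total


def latency_to_fps(latency_ms: float) -> float:
    """Frames per second achievable at a given per-frame latency."""
    return 1000.0 / latency_ms


if __name__ == "__main__":
    print("hypothetical parameter count:", tiny_cnn_params())
    # Latencies reported in the abstract for STM32F407 and STM32F746.
    for mcu, ms in [("STM32F407", 87.0), ("STM32F746", 46.0)]:
        print(f"{mcu}: {latency_to_fps(ms):.1f} frames/s")
```

Under these assumed layer widths the count stays well below the 10k budget quoted in the abstract, and the reported latencies correspond to roughly 11 and 22 frames per second, respectively.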

Related papers

Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget [53.311109531586844]
We demonstrate very low-cost training of large-scale T2I diffusion transformer models. We train a 1.16 billion parameter sparse transformer at a cost of only $1,890 and achieve a 12.7 FID in zero-shot generation. We aim to release our end-to-end training pipeline to further democratize the training of large-scale diffusion models on micro-budgets.
arXiv Detail & Related papers (2024-07-22T17:23:28Z)
Resource-Efficient Gesture Recognition using Low-Resolution Thermal Camera via Spiking Neural Networks and Sparse Segmentation [1.7758299835471887]
This work proposes a novel approach for hand gesture recognition using an inexpensive, low-resolution (24 x 32) thermal sensor. Compared to the use of standard RGB cameras, the proposed system is insensitive to lighting variations. This paper shows that the innovative use of the recently proposed Monostable Multivibrator (MMV) neural networks as a new class of SNN achieves more than one order of magnitude smaller memory and compute complexity.
arXiv Detail & Related papers (2024-01-12T13:20:01Z)
Practical cross-sensor color constancy using a dual-mapping strategy [0.0]
The proposed method uses a dual-mapping strategy and only requires a simple white point from a test sensor under a D65 condition. In the second mapping phase, we transform the re-constructed image data into sparse features, which are then optimized with a lightweight multi-layer perceptron (MLP) model. This approach effectively reduces sensor discrepancies and delivers performance on par with leading cross-sensor methods.
arXiv Detail & Related papers (2023-11-20T13:58:59Z)
Privacy-Preserving Person Detection Using Low-Resolution Infrared Cameras [9.801893730708134]
In intelligent building management, knowing the number of people and their location in a room is important for better control of its illumination, ventilation, and heating with reduced costs and improved comfort. This is typically achieved by detecting people using embedded devices that are installed on the room's ceiling and that integrate a low-resolution infrared camera, which conceals each person's identity. For accurate detection, state-of-the-art deep learning models still require supervised training using a large annotated dataset of images. In this paper, we investigate cost-effective methods that are suitable for person detection based on low-resolution infrared images.
arXiv Detail & Related papers (2022-09-22T22:20:30Z)
ETAD: A Unified Framework for Efficient Temporal Action Detection [70.21104995731085]
Untrimmed video understanding such as temporal action detection (TAD) often suffers from a huge demand for computing resources. We build a unified framework for efficient end-to-end temporal action detection (ETAD). ETAD achieves state-of-the-art performance on both THUMOS-14 and ActivityNet-1.3.
arXiv Detail & Related papers (2022-05-14T21:16:21Z)
Privacy-preserving Social Distance Monitoring on Microcontrollers with Low-Resolution Infrared Sensors and CNNs [10.80166668204102]
Low-resolution infrared (IR) array sensors offer a low-cost, low-power, and privacy-preserving alternative to optical cameras and smartphones/wearables. We demonstrate that accurate detection of social distance violations can be achieved by processing the raw output of an 8x8 IR array sensor with a small-sized Convolutional Neural Network (CNN). We show that our best CNN achieves 86.3% balanced accuracy, significantly outperforming the 61% achieved by a state-of-the-art deterministic algorithm.
arXiv Detail & Related papers (2022-04-22T07:17:45Z)
FastFlowNet: A Lightweight Network for Fast Optical Flow Estimation [81.76975488010213]
Dense optical flow estimation plays a key role in many robotic vision tasks. Current networks often occupy a large number of parameters and require heavy computation costs. Our proposed FastFlowNet works in the well-known coarse-to-fine manner with the following innovations.
arXiv Detail & Related papers (2021-03-08T03:09:37Z)
CNNs for JPEGs: A Study in Computational Cost [49.97673761305336]
Convolutional neural networks (CNNs) have achieved astonishing advances over the past decade. CNNs are capable of learning robust representations of the data directly from the RGB pixels. Deep learning methods capable of learning directly from the compressed domain have been gaining attention in recent years.
arXiv Detail & Related papers (2020-12-26T15:00:10Z)
Real-Time Resource Allocation for Tracking Systems [54.802447204921634]
We propose a new algorithm called PartiMax that greatly reduces detection cost by applying the person detector only to the relevant parts of the image. PartiMax exploits information in the particle filter to select $k$ of the $n$ candidate pixel boxes in the image. We show that our system runs in real-time by processing only 10% of the pixel boxes in the image while still retaining 80% of the original tracking performance achieved when processing all pixel boxes.
arXiv Detail & Related papers (2020-09-21T08:29:05Z)
Low-latency hand gesture recognition with a low resolution thermal imager [4.063682271487617]
We propose an algorithm that predicts hand gestures using a cheap low-resolution thermal camera with only 32x24 pixels. Our best model achieves 95.9% classification accuracy and 83% mAP detection accuracy while its processing pipeline has a latency of only one frame.
arXiv Detail & Related papers (2020-04-24T09:43:48Z)
Near-chip Dynamic Vision Filtering for Low-Bandwidth Pedestrian Detection [99.94079901071163]
This paper presents a novel end-to-end system for pedestrian detection using Dynamic Vision Sensors (DVSs). We target applications where multiple sensors transmit data to a local processing unit, which executes a detection algorithm. Our detector is able to perform a detection every 450 ms, with an overall testing F1 score of 83%.
arXiv Detail & Related papers (2020-04-03T17:36:26Z)
