Deep Fusion of Ultra-Low-Resolution Thermal Camera and Gyroscope Data for Lighting-Robust and Compute-Efficient Rotational Odometry
- URL: http://arxiv.org/abs/2506.12536v1
- Date: Sat, 14 Jun 2025 15:23:40 GMT
- Title: Deep Fusion of Ultra-Low-Resolution Thermal Camera and Gyroscope Data for Lighting-Robust and Compute-Efficient Rotational Odometry
- Authors: Farida Mohsen, Ali Safa
- Abstract summary: This study introduces thermal-gyro fusion, a novel sensor fusion approach that integrates ultra-low-resolution thermal imaging with gyroscope readings for rotational odometry. Our analysis demonstrates that thermal-gyro fusion enables a significant reduction in thermal camera resolution without significantly compromising accuracy. These advantages make our approach well-suited for real-time deployment in resource-constrained robotic systems.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate rotational odometry is crucial for autonomous robotic systems, particularly for small, power-constrained platforms such as drones and mobile robots. This study introduces thermal-gyro fusion, a novel sensor fusion approach that integrates ultra-low-resolution thermal imaging with gyroscope readings for rotational odometry. Unlike RGB cameras, thermal imaging is invariant to lighting conditions and, when fused with gyroscopic data, mitigates drift which is a common limitation of inertial sensors. We first develop a multimodal data acquisition system to collect synchronized thermal and gyroscope data, along with rotational speed labels, across diverse environments. Subsequently, we design and train a lightweight Convolutional Neural Network (CNN) that fuses both modalities for rotational speed estimation. Our analysis demonstrates that thermal-gyro fusion enables a significant reduction in thermal camera resolution without significantly compromising accuracy, thereby improving computational efficiency and memory utilization. These advantages make our approach well-suited for real-time deployment in resource-constrained robotic systems. Finally, to facilitate further research, we publicly release our dataset as supplementary material.
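The paper does not disclose its network architecture beyond "a lightweight CNN" fusing the two modalities. As a purely illustrative sketch of how thermal features and a gyroscope reading might be fused for rotational speed regression, the following NumPy snippet concatenates flattened convolutional features with the gyro yaw rate before a linear head; the 24x32 frame size, layer shapes, and fusion point are assumptions, not the authors' design:

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Naive valid 2D cross-correlation (no padding, stride 1)."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def thermal_gyro_fusion(thermal_frame, gyro_z, params):
    """Estimate rotational speed from one thermal frame plus a gyro reading.

    thermal_frame: (H, W) float array, e.g. a 24x32 ultra-low-res image.
    gyro_z: scalar yaw-rate reading from the gyroscope.
    params: dict with 'kernel', 'w_fuse', 'b' (stand-ins for learned weights).
    """
    feat = np.maximum(conv2d_valid(thermal_frame, params["kernel"]), 0.0)  # conv + ReLU
    pooled = feat.reshape(-1)                     # flatten conv feature map
    fused = np.concatenate([pooled, [gyro_z]])    # fusion point: append gyro reading
    return float(fused @ params["w_fuse"] + params["b"])  # linear regression head

rng = np.random.default_rng(0)
frame = rng.standard_normal((24, 32))             # synthetic thermal frame
params = {
    "kernel": rng.standard_normal((3, 3)) * 0.1,
    "w_fuse": rng.standard_normal(22 * 30 + 1) * 0.01,  # 660 conv features + 1 gyro
    "b": 0.0,
}
speed = thermal_gyro_fusion(frame, gyro_z=0.15, params=params)
```

In a trained model the weights would come from regression against the rotational speed labels collected by the acquisition system; here they are random, so only the data flow, not the output value, is meaningful.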
Related papers
- SPACT18: Spiking Human Action Recognition Benchmark Dataset with Complementary RGB and Thermal Modalities [14.157338282165037]
Spike cameras, bio-inspired vision sensors, fire asynchronously by accumulating light intensities at each pixel, offering exceptional temporal resolution. This work contributes a dataset that will drive research in energy-efficient, ultra-low-power video understanding, specifically for action recognition using spike-based data.
arXiv Detail & Related papers (2025-07-22T01:59:14Z)
- Resource-Efficient Beam Prediction in mmWave Communications with Multimodal Realistic Simulation Framework [57.994965436344195]
Beamforming is a key technology in millimeter-wave (mmWave) communications that improves signal transmission by optimizing directionality and intensity. Multimodal sensing-aided beam prediction has gained significant attention, using various sensing data to predict user locations or network conditions. Despite its promising potential, the adoption of multimodal sensing-aided beam prediction is hindered by high computational complexity, high costs, and limited datasets.
arXiv Detail & Related papers (2025-04-07T15:38:25Z)
- Rotational Odometry using Ultra Low Resolution Thermal Cameras [1.3986052523534573]
This letter provides what is, to the best of our knowledge, a first study on the applicability of ultra-low-resolution thermal cameras for rotational odometry measurements.
Our use of an ultra-low-resolution thermal camera instead of other modalities such as an RGB camera is motivated by its robustness to lighting conditions.
Experiments and ablation studies are conducted for determining the impact of thermal camera resolution and the number of successive frames on the CNN estimation precision.
arXiv Detail & Related papers (2024-11-02T12:15:32Z)
- PowerYOLO: Mixed Precision Model for Hardware Efficient Object Detection with Event Data [0.5461938536945721]
PowerYOLO is a mixed precision solution to the problem of fitting algorithms of high memory and computational complexity into small low-power devices.
First, we propose a system based on a Dynamic Vision Sensor (DVS), a novel sensor with low power requirements.
Second, to ensure high accuracy and low memory and computational complexity, we propose to use 4-bit width Powers-of-Two (PoT) quantisation.
Third, we replace multiplication with bit-shifting to increase the efficiency of hardware acceleration of such a solution.
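The bit-shift trick above rests on the identity x * 2^k = x << k for integers. A minimal sketch of power-of-two (PoT) weight quantisation follows; the exponent range, the fixed-point treatment of negative exponents, and the 4-bit packing (1 sign bit + 3 exponent bits) are illustrative assumptions, not PowerYOLO's exact scheme:

```python
import numpy as np

def quantize_pot(w, n_exp_bits=3):
    """Quantize float weights to signed powers of two: sign(w) * 2**e.
    With one sign bit and n_exp_bits exponent bits this is a 4-bit code."""
    sign = np.sign(w).astype(np.int8)
    e = np.clip(np.round(np.log2(np.abs(w) + 1e-12)),
                -(2 ** n_exp_bits - 1), 0).astype(np.int8)
    return sign, e

def pot_multiply(x_int, sign, e):
    """Multiply an integer activation by a PoT weight using shifts only.
    Negative exponents become right shifts (a fixed-point assumption)."""
    shifted = x_int << e if e >= 0 else x_int >> -e
    return int(sign) * shifted

w = 0.22                                       # example float weight
s, e = quantize_pot(np.array([w]))
# 0.22 is nearest to 2**-2 = 0.25, so the code is sign=+1, exponent=-2
y = pot_multiply(64, int(s[0]), int(e[0]))     # 64 * 0.25 via a right shift -> 16
```

On hardware, replacing each multiply-accumulate with a shift-accumulate is what yields the power and area savings the abstract refers to.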
arXiv Detail & Related papers (2024-07-11T08:17:35Z)
- Toward Efficient Visual Gyroscopes: Spherical Moments, Harmonics Filtering, and Masking Techniques for Spherical Camera Applications [83.8743080143778]
A visual gyroscope estimates camera rotation through images.
The integration of omnidirectional cameras, offering a larger field of view compared to traditional RGB cameras, has proven to yield more accurate and robust results.
Here, we address these challenges by introducing a novel visual gyroscope, which combines an Efficient Multi-Mask-Filter Rotation Estimator and a learning-based optimization.
arXiv Detail & Related papers (2024-04-02T13:19:06Z)
- Resource-Efficient Gesture Recognition using Low-Resolution Thermal Camera via Spiking Neural Networks and Sparse Segmentation [1.7758299835471887]
This work proposes a novel approach for hand gesture recognition using an inexpensive, low-resolution (24 x 32) thermal sensor.
Compared to the use of standard RGB cameras, the proposed system is insensitive to lighting variations.
This paper shows that the innovative use of the recently proposed Monostable Multivibrator (MMV) neural networks as a new class of SNN achieves more than an order of magnitude smaller memory and compute complexity.
arXiv Detail & Related papers (2024-01-12T13:20:01Z)
- Random resistive memory-based deep extreme point learning machine for unified visual processing [67.51600474104171]
We propose a novel hardware-software co-design: a random resistive memory-based deep extreme point learning machine (DEPLM).
Our co-design system achieves huge energy efficiency improvements and training cost reduction when compared to conventional systems.
arXiv Detail & Related papers (2023-12-14T09:46:16Z) - Multi-Visual-Inertial System: Analysis, Calibration and Estimation [26.658649118048032]
We study state estimation of multi-visual-inertial systems (MVIS) and develop sensor fusion algorithms.
We are interested in the full calibration of the associated visual-inertial sensors.
arXiv Detail & Related papers (2023-08-10T02:47:36Z)
- Simultaneous temperature estimation and nonuniformity correction from multiple frames [0.0]
Low-cost microbolometer-based IR cameras are prone to spatial nonuniformity and to drift in temperature measurements.
We propose a novel approach for simultaneous temperature estimation and nonuniformity correction (NUC) from multiple frames captured by low-cost microbolometer cameras.
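For context, NUC methods like the one above build on a standard per-pixel affine gain/offset model. The classic two-point calibration baseline (which the multi-frame approach aims to improve on) can be sketched as follows; the blackbody temperatures, sensor shape, and simulated response are made up for illustration:

```python
import numpy as np

def two_point_nuc(raw_cold, raw_hot, t_cold, t_hot):
    """Per-pixel gain/offset from two uniform (blackbody) reference frames."""
    gain = (t_hot - t_cold) / (raw_hot - raw_cold)   # per-pixel gain
    offset = t_cold - gain * raw_cold                # per-pixel offset
    return gain, offset

def correct(raw, gain, offset):
    """Apply the affine correction, yielding temperature estimates."""
    return gain * raw + offset

rng = np.random.default_rng(1)
shape = (24, 32)
true_gain = 1.0 + 0.05 * rng.standard_normal(shape)  # simulated pixel nonuniformity
true_off = 2.0 * rng.standard_normal(shape)
raw_at = lambda t: (t - true_off) / true_gain        # toy sensor response model

g, o = two_point_nuc(raw_at(20.0), raw_at(40.0), 20.0, 40.0)
est = correct(raw_at(30.0), g, o)                    # recovers ~30 C at every pixel
```

The drift problem the abstract mentions is exactly what breaks this static calibration in practice, motivating correction from multiple frames instead of fixed reference measurements.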
arXiv Detail & Related papers (2023-07-23T11:28:25Z)
- LIF-Seg: LiDAR and Camera Image Fusion for 3D LiDAR Semantic Segmentation [78.74202673902303]
We propose a coarse-to-fine LiDAR and camera fusion-based network (termed LIF-Seg) for LiDAR segmentation.
The proposed method fully utilizes the contextual information of images and introduces a simple but effective early-fusion strategy.
The cooperation of these two components leads to the success of the effective camera-LiDAR fusion.
arXiv Detail & Related papers (2021-08-17T08:53:11Z)
- A parameter refinement method for Ptychography based on Deep Learning concepts [55.41644538483948]
Coarse parametrisation in propagation distance, position errors, and partial coherence frequently threatens experiment viability.
A modern Deep Learning framework is used to correct autonomously the setup incoherences, thus improving the quality of a ptychography reconstruction.
We tested our system on both synthetic datasets and also on real data acquired at the TwinMic beamline of the Elettra synchrotron facility.
arXiv Detail & Related papers (2021-05-18T10:15:17Z) - Learning Camera Miscalibration Detection [83.38916296044394]
This paper focuses on a data-driven approach to learn the detection of miscalibration in vision sensors, specifically RGB cameras.
Our contributions include a proposed miscalibration metric for RGB cameras and a novel semi-synthetic dataset generation pipeline based on this metric.
By training a deep convolutional neural network, we demonstrate the effectiveness of our pipeline to identify whether a recalibration of the camera's intrinsic parameters is required or not.
arXiv Detail & Related papers (2020-05-24T10:32:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.