Learning Cross-Spectral Point Features with Task-Oriented Training
- URL: http://arxiv.org/abs/2505.12593v2
- Date: Wed, 21 May 2025 02:48:29 GMT
- Title: Learning Cross-Spectral Point Features with Task-Oriented Training
- Authors: Mia Thomas, Trevor Ablett, Jonathan Kelly
- Abstract summary: This work explores learned cross-spectral (thermal-visible) point features as a means to integrate thermal imagery into established camera-based navigation systems. We run our feature network on thermal-visible image pairs, then feed the network response into a differentiable registration pipeline. Our selected model, trained on the task of matching, achieves a registration error below 10 pixels for more than 75% of estimates on the MultiPoint dataset.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unmanned aerial vehicles (UAVs) enable operations in remote and hazardous environments, yet the visible-spectrum, camera-based navigation systems often relied upon by UAVs struggle in low-visibility conditions. Thermal cameras, which capture long-wave infrared radiation, are able to function effectively in darkness and smoke, where visible-light cameras fail. This work explores learned cross-spectral (thermal-visible) point features as a means to integrate thermal imagery into established camera-based navigation systems. Existing methods typically train a feature network's detection and description outputs directly, which often focuses training on image regions where thermal and visible-spectrum images exhibit similar appearance. Aiming to more fully utilize the available data, we propose a method to train the feature network on the tasks of matching and registration. We run our feature network on thermal-visible image pairs, then feed the network response into a differentiable registration pipeline. Losses are applied to the matching and registration estimates of this pipeline. Our selected model, trained on the task of matching, achieves a registration error (corner error) below 10 pixels for more than 75% of estimates on the MultiPoint dataset. We further demonstrate that our model can also be used with a classical pipeline for matching and registration.
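The abstract's headline metric is corner error: the pixel distance between the image corners mapped by the estimated homography and by the ground-truth homography. A minimal sketch of this metric in plain Python (function names and the mean-over-corners convention are our own assumptions; some evaluations report the maximum corner distance instead):

```python
import math

def warp(H, pt):
    """Apply a 3x3 homography (row-major nested list) to a 2D point."""
    x, y = pt
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)

def corner_error(H_est, H_gt, width, height):
    """Mean distance in pixels between the four image corners warped by
    the estimated and ground-truth homographies."""
    corners = [(0, 0), (width, 0), (width, height), (0, height)]
    dists = []
    for c in corners:
        ex, ey = warp(H_est, c)
        gx, gy = warp(H_gt, c)
        dists.append(math.hypot(ex - gx, ey - gy))
    return sum(dists) / len(dists)
```

Under this definition, the paper's result means that for more than 75% of image pairs the estimated homography displaces the image corners by fewer than 10 pixels on average relative to ground truth.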
Related papers
- A Resource Efficient Fusion Network for Object Detection in Bird's-Eye View using Camera and Raw Radar Data [7.2508100569856975]
We use the raw range-Doppler spectrum of radar data to process camera images.
We extract the corresponding features with our camera encoder-decoder architecture.
The resultant feature maps are fused with Range-Azimuth features, recovered from the RD spectrum input to perform object detection.
arXiv Detail & Related papers (2024-11-20T13:26:13Z) - UAVs and Neural Networks for search and rescue missions [0.0]
We present a method for detecting objects of interest, including cars, humans, and fire, in aerial images captured by unmanned aerial vehicles (UAVs).
To achieve this, we use artificial neural networks and create a dataset for supervised learning.
arXiv Detail & Related papers (2023-10-09T08:27:35Z) - Unsupervised Wildfire Change Detection based on Contrastive Learning [1.53934570513443]
The accurate characterization of the severity of the wildfire event contributes to the characterization of the fuel conditions in fire-prone areas.
The aim of this study is to develop an autonomous system built on top of high-resolution multispectral satellite imagery, with an advanced deep learning method for detecting burned area change.
arXiv Detail & Related papers (2022-11-26T20:13:14Z) - Deep Learning Computer Vision Algorithms for Real-time UAVs On-board Camera Image Processing [77.34726150561087]
This paper describes how advanced deep learning based computer vision algorithms are applied to enable real-time on-board sensor processing for small UAVs.
All algorithms have been developed using state-of-the-art image processing methods based on deep neural networks.
arXiv Detail & Related papers (2022-11-02T11:10:42Z) - ReDFeat: Recoupling Detection and Description for Multimodal Feature Learning [51.07496081296863]
We recouple independent constraints of detection and description of multimodal feature learning with a mutual weighting strategy.
We propose a detector that possesses a large receptive field and is equipped with learnable non-maximum suppression layers.
We build a benchmark that contains cross visible, infrared, near-infrared and synthetic aperture radar image pairs for evaluating the performance of features in feature matching and image registration tasks.
arXiv Detail & Related papers (2022-05-16T04:24:22Z) - An Empirical Study of Remote Sensing Pretraining [117.90699699469639]
We conduct an empirical study of remote sensing pretraining (RSP) on aerial images.
RSP can help deliver distinctive performances in scene recognition tasks.
RSP mitigates the data discrepancies of traditional ImageNet pretraining on RS images, but it may still suffer from task discrepancies.
arXiv Detail & Related papers (2022-04-06T13:38:11Z) - Correlation-Aware Deep Tracking [83.51092789908677]
We propose a novel target-dependent feature network inspired by the self-/cross-attention scheme.
Our network deeply embeds cross-image feature correlation in multiple layers of the feature network.
Our model can be flexibly pre-trained on abundant unpaired images, leading to notably faster convergence than the existing methods.
arXiv Detail & Related papers (2022-03-03T11:53:54Z) - Comparison of Object Detection Algorithms Using Video and Thermal Images Collected from a UAS Platform: An Application of Drones in Traffic Management [2.9932638148627104]
This study explores real-time vehicle detection algorithms on both visual and infrared cameras.
Red Green Blue (RGB) videos and thermal images were collected from a UAS platform along highways in the Tampa, Florida, area.
arXiv Detail & Related papers (2021-09-27T16:57:09Z) - MFGNet: Dynamic Modality-Aware Filter Generation for RGB-T Tracking [72.65494220685525]
We propose a new dynamic modality-aware filter generation module (named MFGNet) to boost the message communication between visible and thermal data.
We generate dynamic modality-aware filters with two independent networks; the visible and thermal filters are then used to perform a dynamic convolutional operation on their corresponding input feature maps.
To address issues caused by heavy occlusion, fast motion, and out-of-view, we propose to conduct a joint local and global search by exploiting a new direction-aware target-driven attention mechanism.
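The dynamic-filter idea in the MFGNet summary can be illustrated with a toy sketch: a stand-in "generator" derives a per-modality kernel from the input features, and each modality's feature map is convolved with its own generated kernel. All names here are illustrative, and the trivial mean-based generator is a placeholder for the paper's learned generator networks:

```python
def conv2d_valid(img, k):
    """'Valid' 2D cross-correlation of a 2D list with a 3x3 kernel."""
    h, w = len(img), len(img[0])
    out = []
    for i in range(h - 2):
        row = []
        for j in range(w - 2):
            s = 0.0
            for di in range(3):
                for dj in range(3):
                    s += img[i + di][j + dj] * k[di][dj]
            row.append(s)
        out.append(row)
    return out

def generate_filter(feat):
    """Toy 'filter generator': scale a fixed cross-shaped template by the
    feature map's mean activation (stand-in for a learned generator)."""
    mean = sum(sum(r) for r in feat) / (len(feat) * len(feat[0]))
    template = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]
    return [[mean * v for v in row] for row in template]

def mfg_step(visible_feat, thermal_feat):
    """Generate one filter per modality, then convolve each modality's
    feature map with its own dynamically generated filter."""
    kv = generate_filter(visible_feat)
    kt = generate_filter(thermal_feat)
    return conv2d_valid(visible_feat, kv), conv2d_valid(thermal_feat, kt)
```

The key design point the summary describes is that the filters are input-dependent (generated per image pair) rather than fixed learned weights, which lets the convolution adapt to the content of each modality.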
arXiv Detail & Related papers (2021-07-22T03:10:51Z) - Learned Camera Gain and Exposure Control for Improved Visual Feature Detection and Matching [12.870196901446208]
We explore a data-driven approach to account for environmental lighting changes, improving the quality of images for use in visual odometry (VO) or visual simultaneous localization and mapping (SLAM).
We train a deep convolutional neural network model to predictively adjust camera gain and exposure time parameters.
We demonstrate through extensive real-world experiments that our network can anticipate and compensate for dramatic lighting changes.
arXiv Detail & Related papers (2021-02-08T16:46:09Z) - Exploring Thermal Images for Object Detection in Underexposure Regions for Autonomous Driving [67.69430435482127]
Underexposure regions are vital to construct a complete perception of the surroundings for safe autonomous driving.
The availability of thermal cameras has provided an essential alternate to explore regions where other optical sensors lack in capturing interpretable signals.
This work proposes a domain adaptation framework which employs a style transfer technique for transfer learning from visible spectrum images to thermal images.
arXiv Detail & Related papers (2020-06-01T09:59:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.