Aerial Image Stitching Using IMU Data from a UAV
- URL: http://arxiv.org/abs/2511.06841v1
- Date: Mon, 10 Nov 2025 08:33:00 GMT
- Title: Aerial Image Stitching Using IMU Data from a UAV
- Authors: Selim Ahmet Iz, Mustafa Unel,
- Abstract summary: Unmanned Aerial Vehicles (UAVs) are widely used for aerial photography and remote sensing applications. One of the main challenges is to stitch together multiple images into a single high-resolution image that covers a large area. We present a novel method that uses a combination of IMU data and computer vision techniques for stitching images captured by a UAV.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Unmanned Aerial Vehicles (UAVs) are widely used for aerial photography and remote sensing applications. One of the main challenges is to stitch together multiple images into a single high-resolution image that covers a large area. Feature-based image stitching algorithms are commonly used but can suffer from errors and ambiguities in feature detection and matching. To address this, several approaches have been proposed, including using bundle adjustment techniques or direct image alignment. In this paper, we present a novel method that uses a combination of IMU data and computer vision techniques for stitching images captured by a UAV. Our method involves several steps such as estimating the displacement and rotation of the UAV between consecutive images, correcting for perspective distortion, and computing a homography matrix. We then use a standard image stitching algorithm to align and blend the images together. Our proposed method leverages the additional information provided by the IMU data, corrects for various sources of distortion, and can be easily integrated into existing UAV workflows. Our experiments demonstrate the effectiveness and robustness of our method, outperforming some of the existing feature-based image stitching algorithms in terms of accuracy and reliability, particularly in challenging scenarios such as large displacements, rotations, and variations in camera pose.
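The core step the abstract describes, turning an IMU-derived rotation and displacement into a homography between consecutive frames, can be illustrated with a minimal NumPy sketch. This assumes a nadir-pointing camera over flat ground at known altitude, so that the planar-scene homography H = K (R + t nᵀ / d) K⁻¹ applies; the intrinsics `K`, the function names, and the parameter values below are illustrative, not taken from the paper.

```python
import numpy as np

def imu_homography(yaw_rad, t_xy, altitude, K):
    """Planar homography between two nadir views of flat ground,
    induced by a yaw rotation and an in-plane displacement (metres).
    H = K (R + t n^T / d) K^-1 with ground normal n = [0, 0, 1]^T."""
    c, s = np.cos(yaw_rad), np.sin(yaw_rad)
    R = np.array([[c, -s, 0.0],
                  [s,  c, 0.0],
                  [0.0, 0.0, 1.0]])
    t = np.array([t_xy[0], t_xy[1], 0.0]).reshape(3, 1)
    n = np.array([0.0, 0.0, 1.0]).reshape(1, 3)
    H = K @ (R + t @ n / altitude) @ np.linalg.inv(K)
    return H / H[2, 2]  # normalise so H[2, 2] = 1

def warp_point(H, xy):
    """Apply homography H to a pixel coordinate (x, y)."""
    p = H @ np.array([xy[0], xy[1], 1.0])
    return p[:2] / p[2]

# Illustrative intrinsics: focal length 800 px, principal point (640, 360).
K = np.array([[800.0,   0.0, 640.0],
              [  0.0, 800.0, 360.0],
              [  0.0,   0.0,   1.0]])

# 5 deg yaw change and a (2 m, -1 m) displacement at 50 m altitude.
H = imu_homography(np.deg2rad(5.0), (2.0, -1.0), altitude=50.0, K=K)
```

Note that a pure 1 m forward translation at 50 m altitude shifts the principal point by focal_length / altitude = 16 px, a quick sanity check on the scale of the homography. The resulting `H` would then be passed to a standard warping and blending routine, which is the "standard image stitching algorithm" step the abstract mentions.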
Related papers
- Foundation Visual Encoders Are Secretly Few-Shot Anomaly Detectors [58.75916798814376]
We develop a few-shot anomaly detector termed FoundAD. We observe that the anomaly amount in an image directly correlates with the difference in the learnt embeddings. The simple operator acts as an effective tool for anomaly detection to characterize and identify out-of-distribution regions in an image.
arXiv Detail & Related papers (2025-10-02T11:53:20Z) - EdgeRegNet: Edge Feature-based Multimodal Registration Network between Images and LiDAR Point Clouds [10.324549723042338]
Cross-modal data registration has long been a critical task in computer vision. We propose a method that uses edge information from the original point clouds and images for cross-modal registration. We validate our method on the KITTI and nuScenes datasets, demonstrating its state-of-the-art performance.
arXiv Detail & Related papers (2025-03-19T15:03:41Z) - A Global Depth-Range-Free Multi-View Stereo Transformer Network with Pose Embedding [76.44979557843367]
We propose a novel multi-view stereo (MVS) framework that gets rid of the depth range prior. We introduce a Multi-view Disparity Attention (MDA) module to aggregate long-range context information. We explicitly estimate the quality of the current pixel corresponding to sampled points on the epipolar line of the source image.
arXiv Detail & Related papers (2024-11-04T08:50:16Z) - Multi-Sensor Diffusion-Driven Optical Image Translation for Large-Scale Applications [3.4085512042262374]
We propose a method that super-resolves large-scale low spatial resolution images into high-resolution equivalents from disparate optical sensors. Our approach provides precise domain adaptation, preserving image content while improving radiometric accuracy and feature representation. We reach a mean Learned Perceptual Image Patch Similarity (mLPIPS) of 0.1884 and a Fréchet Inception Distance (FID) of 45.64, outperforming all compared methods.
arXiv Detail & Related papers (2024-04-17T10:49:00Z) - DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection [55.48770333927732]
We propose a Diffusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection.
It consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network with a connection to the stable diffusion's denoising network, and a feature-space pre-trained feature extractor.
Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-12-11T18:38:28Z) - Searching a Compact Architecture for Robust Multi-Exposure Image Fusion [55.37210629454589]
Two major stumbling blocks hinder development: pixel misalignment and inefficient inference.
This study introduces an architecture search-based paradigm incorporating self-alignment and detail repletion modules for robust multi-exposure image fusion.
The proposed method outperforms various competitive schemes, achieving a noteworthy 3.19% improvement in PSNR for general scenarios and an impressive 23.5% enhancement in misaligned scenarios.
arXiv Detail & Related papers (2023-05-20T17:01:52Z) - Deep Learning-Based UAV Aerial Triangulation without Image Control Points [6.335630432207172]
Realizing large-scale UAV mapping without image control points, supported only by POS data, faces many technical problems.
Deep learning has surpassed the performance of traditional handcrafted features in many aspects.
This paper proposes a new drone image registration method based on deep learning image features.
arXiv Detail & Related papers (2023-01-07T15:01:38Z) - LocalTrans: A Multiscale Local Transformer Network for Cross-Resolution Homography Estimation [52.63874513999119]
Cross-resolution image alignment is a key problem in multiscale giga photography.
Existing deep homography methods neglect the explicit formulation of correspondences between the inputs, which leads to degraded accuracy in cross-resolution challenges.
We propose a local transformer network embedded within a multiscale structure to explicitly learn correspondences between the multimodal inputs.
arXiv Detail & Related papers (2021-06-08T02:51:45Z) - A Hierarchical Transformation-Discriminating Generative Model for Few Shot Anomaly Detection [93.38607559281601]
We devise a hierarchical generative model that captures the multi-scale patch distribution of each training image.
The anomaly score is obtained by aggregating the patch-based votes of the correct transformation across scales and image regions.
arXiv Detail & Related papers (2021-04-29T17:49:48Z) - IMU-Assisted Learning of Single-View Rolling Shutter Correction [16.242924916178282]
Rolling shutter distortion is highly undesirable for photography and computer vision algorithms.
We propose a deep neural network to predict depth and row-wise pose from a single image for rolling shutter correction.
arXiv Detail & Related papers (2020-11-05T21:33:25Z) - Wide-angle Image Rectification: A Survey [86.36118799330802]
Wide-angle images contain distortions that violate the assumptions underlying pinhole camera models.
Image rectification, which aims to correct these distortions, can solve these problems.
We present a detailed description and discussion of the camera models used in different approaches.
Next, we review both traditional geometry-based image rectification methods and deep learning-based methods.
arXiv Detail & Related papers (2020-10-30T17:28:40Z) - Leveraging Photogrammetric Mesh Models for Aerial-Ground Feature Point Matching Toward Integrated 3D Reconstruction [19.551088857830944]
Integration of aerial and ground images has proven to be an efficient approach for enhancing surface reconstruction in urban environments.
Previous studies based on geometry-aware image rectification have alleviated this problem.
We propose a novel approach: leveraging photogrammetric mesh models for aerial-ground image matching.
arXiv Detail & Related papers (2020-02-21T01:47:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.