A Deep Single Image Rectification Approach for Pan-Tilt-Zoom Cameras
- URL: http://arxiv.org/abs/2504.06965v1
- Date: Wed, 09 Apr 2025 15:19:38 GMT
- Title: A Deep Single Image Rectification Approach for Pan-Tilt-Zoom Cameras
- Authors: Teng Xiao, Qi Hu, Qingsong Yan, Wei Liu, Zhiwei Ye, Fei Deng
- Abstract summary: This paper presents a Forward Distortion and Backward Warping Network (FDBW-Net), a novel framework for wide-angle image rectification. It begins by using a forward distortion model to synthesize barrel-distorted images, reducing pixel redundancy and preventing blur. The network employs a pyramid context encoder with attention mechanisms to generate backward warping flows containing geometric details.
- Score: 21.404790627439954
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pan-Tilt-Zoom (PTZ) cameras with wide-angle lenses are widely used in surveillance but often require image rectification due to their inherent nonlinear distortions. Current deep learning approaches typically struggle to maintain fine-grained geometric details, resulting in inaccurate rectification. This paper presents a Forward Distortion and Backward Warping Network (FDBW-Net), a novel framework for wide-angle image rectification. It begins by using a forward distortion model to synthesize barrel-distorted images, reducing pixel redundancy and preventing blur. The network employs a pyramid context encoder with attention mechanisms to generate backward warping flows containing geometric details. Then, a multi-scale decoder is used to restore distorted features and output rectified images. FDBW-Net's performance is validated on diverse datasets: public benchmarks, AirSim-rendered PTZ camera imagery, and real-scene PTZ camera datasets. It demonstrates that FDBW-Net achieves SOTA performance in distortion rectification, boosting the adaptability of PTZ cameras for practical visual applications.
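The two operations named in the abstract, forward distortion and backward warping, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the one-parameter polynomial radial model, the coefficient `k1`, and the function names `radial_forward_map`, `forward_distort`, and `backward_warp` are all assumptions chosen for illustration, and the analytic flow below merely stands in for the flow that the network would predict.

```python
import numpy as np

def radial_forward_map(h, w, k1):
    """Distorted (x, y) location of every undistorted pixel.

    Assumed model: r_d = r_u * (1 + k1 * r_u^2), with radii normalised so
    that the image corner lies at radius 1. k1 < 0 gives barrel distortion.
    """
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    norm = np.hypot(cx, cy)
    xn, yn = (xs - cx) / norm, (ys - cy) / norm
    scale = 1.0 + k1 * (xn**2 + yn**2)
    return xn * scale * norm + cx, yn * scale * norm + cy

def forward_distort(img, k1):
    """Forward mapping: push each source pixel to its distorted position
    (nearest-neighbour splat; when pixels collide, the last write wins)."""
    h, w = img.shape[:2]
    xd, yd = radial_forward_map(h, w, k1)
    xi, yi = np.rint(xd).astype(int), np.rint(yd).astype(int)
    valid = (xi >= 0) & (xi < w) & (yi >= 0) & (yi < h)
    out = np.zeros_like(img)
    out[yi[valid], xi[valid]] = img[valid]
    return out

def backward_warp(img, flow):
    """Backward mapping: for each output pixel (x, y), bilinearly sample
    img at (x + flow[..., 0], y + flow[..., 1]), clamped to the image."""
    h, w = img.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    x = np.clip(xs + flow[..., 0], 0, w - 1)
    y = np.clip(ys + flow[..., 1], 0, h - 1)
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    x1, y1 = np.minimum(x0 + 1, w - 1), np.minimum(y0 + 1, h - 1)
    wx, wy = x - x0, y - y0
    if img.ndim == 3:  # broadcast the weights over colour channels
        wx, wy = wx[..., None], wy[..., None]
    top = img[y0, x0] * (1 - wx) + img[y0, x1] * wx
    bottom = img[y1, x0] * (1 - wx) + img[y1, x1] * wx
    return top * (1 - wy) + bottom * wy

# Toy usage: distort a 9x9 ramp image, then undo it with the analytic flow.
img = np.arange(81, dtype=np.float64).reshape(9, 9)
distorted = forward_distort(img, k1=-0.3)
xd, yd = radial_forward_map(9, 9, k1=-0.3)
ys, xs = np.mgrid[0:9, 0:9]
flow = np.stack([xd - xs, yd - ys], axis=-1)  # where each rectified pixel samples from
rectified = backward_warp(distorted, flow)
```

In this toy setting the barrel-distorted content shrinks toward the image centre and leaves the borders empty, while the backward warp needs only a per-pixel flow to resample the distorted image back onto the rectified grid.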
Related papers
- An End-to-End Real-World Camera Imaging Pipeline [26.595914212462183]
We propose an end-to-end camera imaging pipeline (RealCamNet) to enhance real-world camera imaging performance.
RealCamNet is designed for high-quality conversion from RAW to RGB and compact image compression.
Experiment results show that RealCamNet achieves the best rate-distortion performance with lower inference latency.
arXiv Detail & Related papers (2024-11-16T11:19:03Z)
- DarSwin-Unet: Distortion Aware Encoder-Decoder Architecture [13.412728770638465]
We present an encoder-decoder model that adapts to distortions in wide-angle lenses by leveraging the physical characteristics defined by the radial distortion profile.
In contrast to the original model, which only performs classification tasks, we introduce a U-Net architecture, DarSwin-Unet, designed for pixel level tasks.
Our approach enhances the model capability to handle pixel-level tasks in wide-angle fisheye images, making it more effective for real-world applications.
arXiv Detail & Related papers (2024-07-24T14:52:18Z)
- RoFIR: Robust Fisheye Image Rectification Framework Impervious to Optical Center Deviation [88.54817424560056]
We propose a distortion vector map (DVM) that measures the degree and direction of local distortion.
By learning the DVM, the model can independently identify local distortions at each pixel without relying on global distortion patterns.
In the pre-training stage, it predicts the distortion vector map and perceives the local distortion features of each pixel.
In the fine-tuning stage, it predicts a pixel-wise flow map for deviated fisheye image rectification.
arXiv Detail & Related papers (2024-06-27T06:38:56Z)
- Möbius Transform for Mitigating Perspective Distortions in Representation Learning [43.86985901138407]
Perspective distortion (PD) causes unprecedented changes in shape, size, orientation, angles, and other spatial relationships in images.
We propose mitigating perspective distortion (MPD) by employing fine-grained parameter control on a specific family of Möbius transforms.
We present a dedicated perspectively distorted benchmark dataset, ImageNet-PD, to benchmark the robustness of deep learning models against this new dataset.
arXiv Detail & Related papers (2024-03-07T15:39:00Z)
- Parallax-Tolerant Unsupervised Deep Image Stitching [57.76737888499145]
We propose UDIS++, a parallax-tolerant unsupervised deep image stitching technique.
First, we propose a robust and flexible warp to model the image registration from global homography to local thin-plate spline motion.
To further eliminate the parallax artifacts, we propose to composite the stitched image seamlessly by unsupervised learning for seam-driven composition masks.
arXiv Detail & Related papers (2023-02-16T10:40:55Z)
- RecRecNet: Rectangling Rectified Wide-Angle Images by Thin-Plate Spline Model and DoF-based Curriculum Learning [62.86400614141706]
We propose a new learning model, the Rectangling Rectification Network (RecRecNet).
Our model can flexibly warp the source structure to the target domain and achieves an end-to-end unsupervised deformation.
Experiments show the superiority of our solution over the compared methods on both quantitative and qualitative evaluations.
arXiv Detail & Related papers (2023-01-04T15:12:57Z)
- Deep Rotation Correction without Angle Prior [57.76737888499145]
We propose a new and practical task, named Rotation Correction, to automatically correct the tilt with high content fidelity.
This task can be easily integrated into image editing applications, allowing users to correct the rotated images without any manual operations.
We leverage a neural network to predict the optical flows that can warp the tilted images to be perceptually horizontal.
arXiv Detail & Related papers (2022-07-07T02:46:27Z)
- Single Image Automatic Radial Distortion Compensation Using Deep Convolutional Network [0.12891210250935145]
We present a novel method for single-image automatic lens distortion compensation based on deep convolutional neural networks.
The method is capable of real-time performance and accuracy using two highest-order coefficients of the radial distortion model operating in the application domain of sports broadcast.
arXiv Detail & Related papers (2021-12-14T13:04:03Z)
- TransCamP: Graph Transformer for 6-DoF Camera Pose Estimation [77.09542018140823]
We propose a neural network approach with a graph transformer backbone, namely TransCamP, to address the camera relocalization problem.
TransCamP effectively fuses the image features, camera pose information and inter-frame relative camera motions into encoded graph attributes.
arXiv Detail & Related papers (2021-05-28T19:08:43Z)
- Wide-angle Image Rectification: A Survey [86.36118799330802]
Wide-angle images contain distortions that violate the assumptions underlying pinhole camera models.
Image rectification, which aims to correct these distortions, can solve these problems.
We present a detailed description and discussion of the camera models used in different approaches.
Next, we review both traditional geometry-based image rectification methods and deep learning-based methods.
arXiv Detail & Related papers (2020-10-30T17:28:40Z)
- UnRectDepthNet: Self-Supervised Monocular Depth Estimation using a Generic Framework for Handling Common Camera Distortion Models [8.484676769284578]
We propose a generic scale-aware self-supervised pipeline for estimating depth, euclidean distance, and visual odometry from unrectified monocular videos.
The proposed algorithm is evaluated further on the KITTI rectified dataset, and we achieve state-of-the-art results.
arXiv Detail & Related papers (2020-07-13T20:35:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.