AlignDiff: Learning Physically-Grounded Camera Alignment via Diffusion
- URL: http://arxiv.org/abs/2503.21581v1
- Date: Thu, 27 Mar 2025 14:59:59 GMT
- Title: AlignDiff: Learning Physically-Grounded Camera Alignment via Diffusion
- Authors: Liuyue Xie, Jiancong Guo, Ozan Cakmakci, Andre Araujo, Laszlo A. Jeni, Zhiheng Jia
- Abstract summary: We introduce AlignDiff, a novel framework that jointly models camera intrinsic and extrinsic parameters using a generic ray camera model. Unlike previous approaches, AlignDiff shifts focus from semantic to geometric features, enabling more accurate modeling of local distortions. Our experiments demonstrate that the proposed method significantly reduces the angular error of estimated ray bundles by ~8.2 degrees and improves overall calibration accuracy, outperforming existing approaches on challenging, real-world datasets.
- Score: 0.5277756703318045
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Accurate camera calibration is a fundamental task for 3D perception, especially when dealing with real-world, in-the-wild environments where complex optical distortions are common. Existing methods often rely on pre-rectified images or calibration patterns, which limits their applicability and flexibility. In this work, we introduce a novel framework that addresses these challenges by jointly modeling camera intrinsic and extrinsic parameters using a generic ray camera model. We propose AlignDiff, a diffusion model conditioned on geometric priors, enabling the simultaneous estimation of camera distortions and scene geometry. Unlike previous approaches, AlignDiff shifts focus from semantic to geometric features, enabling more accurate modeling of local distortions. To enhance distortion prediction, we incorporate edge-aware attention, focusing the model on geometric features around image edges rather than semantic content. Furthermore, to enhance generalizability to real-world captures, we incorporate a large database of ray-traced lenses containing over three thousand samples. This database characterizes the distortion inherent in a diverse variety of lens forms. Our experiments demonstrate that the proposed method significantly reduces the angular error of estimated ray bundles by ~8.2 degrees and improves overall calibration accuracy, outperforming existing approaches on challenging, real-world datasets.
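The reported metric, the angular error of estimated ray bundles, is easy to state concretely once a camera is expressed (as in a generic ray camera model) as a per-pixel map of ray directions. The sketch below is illustrative only and not the authors' implementation; the array shapes and the pinhole example are assumptions made for the demonstration.

```python
# Minimal sketch (not the authors' code): angular error between two ray bundles,
# where each camera is a per-pixel map of 3D ray directions of shape (H, W, 3).
import numpy as np

def mean_angular_error_deg(pred_rays: np.ndarray, gt_rays: np.ndarray) -> float:
    """Mean per-pixel angle (degrees) between two (H, W, 3) ray-direction maps."""
    # Normalize both bundles so the dot product equals the cosine of the angle.
    pred = pred_rays / np.linalg.norm(pred_rays, axis=-1, keepdims=True)
    gt = gt_rays / np.linalg.norm(gt_rays, axis=-1, keepdims=True)
    cos = np.clip(np.sum(pred * gt, axis=-1), -1.0, 1.0)
    return float(np.degrees(np.arccos(cos)).mean())

# Toy example: a pinhole ray bundle (focal length f) vs. a perturbed estimate.
H, W, f = 480, 640, 500.0
u, v = np.meshgrid(np.arange(W) - W / 2, np.arange(H) - H / 2)
gt = np.stack([u, v, np.full_like(u, f)], axis=-1)
pred = gt + np.random.normal(scale=5.0, size=gt.shape)  # hypothetical noisy estimate
print(f"mean angular error: {mean_angular_error_deg(pred, gt):.2f} deg")
```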
Related papers
- PRaDA: Projective Radial Distortion Averaging [40.77624901787694]
We tackle the problem of automatic calibration of radially distorted cameras in challenging conditions.
Our proposed method, Projective Radial Distortion Averaging, averages multiple distortion estimates in a fully projective framework.
arXiv Detail & Related papers (2025-04-23T08:22:59Z) - SphereDiffusion: Spherical Geometry-Aware Distortion Resilient Diffusion Model [63.685132323224124]
Controllable spherical panoramic image generation holds substantial applicative potential across a variety of domains.
In this paper, we introduce a novel framework of SphereDiffusion to address these unique challenges.
Experiments on the Structured3D dataset show that SphereDiffusion significantly improves the quality of controllable spherical image generation, reducing FID by around 35% on average in relative terms.
arXiv Detail & Related papers (2024-03-15T06:26:46Z) - Single-image camera calibration with model-free distortion correction [0.0]
This paper proposes a method for estimating the complete set of calibration parameters from a single image of a planar speckle pattern covering the entire sensor.
The correspondence between image points and physical points on the calibration target is obtained using Digital Image Correlation.
At the end of the procedure, a dense and uniform model-free distortion map is obtained over the entire image.
arXiv Detail & Related papers (2024-03-02T16:51:35Z) - Cameras as Rays: Pose Estimation via Ray Diffusion [54.098613859015856]
Estimating camera poses is a fundamental task for 3D reconstruction and remains challenging given sparsely sampled views.
We propose a distributed representation of camera pose that treats a camera as a bundle of rays.
Our proposed methods, both regression- and diffusion-based, demonstrate state-of-the-art performance on camera pose estimation on CO3D.
arXiv Detail & Related papers (2024-02-22T18:59:56Z) - Curved Diffusion: A Generative Model With Optical Geometry Control [56.24220665691974]
The influence of different optical systems on the final scene appearance is frequently overlooked.
This study introduces a framework that intimately integrates a text-to-image diffusion model with the particular lens used in image rendering.
arXiv Detail & Related papers (2023-11-29T13:06:48Z) - How to turn your camera into a perfect pinhole model [0.38233569758620056]
We propose a novel approach that involves a pre-processing step to remove distortions from images.
Our method does not need to assume any distortion model and can be applied to severely warped images.
This model allows for a serious upgrade of many algorithms and applications.
arXiv Detail & Related papers (2023-09-20T13:54:29Z) - Neural Lens Modeling [50.57409162437732]
NeuroLens is a neural lens model for distortion and vignetting that can be used for point projection and ray casting.
It can be used to perform pre-capture calibration using classical calibration targets, and can later be used to perform calibration or refinement during 3D reconstruction.
The model generalizes across many lens types and is trivial to integrate into existing 3D reconstruction and rendering systems.
arXiv Detail & Related papers (2023-04-10T20:09:17Z) - A Model for Multi-View Residual Covariances based on Perspective Deformation [88.21738020902411]
We derive a model for the covariance of the visual residuals in multi-view SfM, odometry and SLAM setups.
We validate our model with synthetic and real data and integrate it into photometric and feature-based Bundle Adjustment.
arXiv Detail & Related papers (2022-02-01T21:21:56Z) - Wide-angle Image Rectification: A Survey [86.36118799330802]
Wide-angle images contain distortions that violate the assumptions underlying pinhole camera models.
Image rectification, which aims to correct these distortions, can solve these problems; a minimal geometry-based correction sketch follows this list.
We present a detailed description and discussion of the camera models used in different approaches.
Next, we review both traditional geometry-based image rectification methods and deep learning-based methods.
arXiv Detail & Related papers (2020-10-30T17:28:40Z)
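To make "correcting these distortions" concrete, here is a minimal, hedged sketch of geometry-based rectification under the common one-parameter polynomial radial model. The intrinsics `K`, the coefficient `k1`, and the function name are illustrative assumptions rather than the model of any specific paper above; richer models (division, fisheye, generic ray) follow the same backward-warping pattern.

```python
# Minimal sketch, assuming the one-parameter polynomial radial model
# x_d = x_u * (1 + k1 * r^2); values and names are illustrative only.
import numpy as np

def rectify_radial(img: np.ndarray, K: np.ndarray, k1: float) -> np.ndarray:
    """Backward-warp: for each undistorted output pixel, sample the distorted source."""
    H, W = img.shape[:2]
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    # Normalized, undistorted coordinates of the output grid.
    x = (u - cx) / fx
    y = (v - cy) / fy
    r2 = x * x + y * y
    scale = 1.0 + k1 * r2                  # forward-distort the ideal coordinates
    src_u = (x * scale) * fx + cx          # where each output pixel lives in the input
    src_v = (y * scale) * fy + cy
    # Nearest-neighbour sampling keeps the sketch dependency-free.
    src_u = np.clip(np.round(src_u).astype(int), 0, W - 1)
    src_v = np.clip(np.round(src_v).astype(int), 0, H - 1)
    return img[src_v, src_u]

# Usage with hypothetical intrinsics and distortion coefficient:
# rectified = rectify_radial(img, K=np.array([[500., 0., 320.],
#                                             [0., 500., 240.],
#                                             [0., 0., 1.]]), k1=-0.2)
```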
This list is automatically generated from the titles and abstracts of the papers on this site.