BADGR: Bundle Adjustment Diffusion Conditioned by GRadients for Wide-Baseline Floor Plan Reconstruction
- URL: http://arxiv.org/abs/2503.19340v1
- Date: Tue, 25 Mar 2025 04:15:20 GMT
- Title: BADGR: Bundle Adjustment Diffusion Conditioned by GRadients for Wide-Baseline Floor Plan Reconstruction
- Authors: Yuguang Li, Ivaylo Boyadzhiev, Zixuan Liu, Linda Shapiro, Alex Colburn
- Abstract summary: BADGR is a novel diffusion model that jointly performs reconstruction and bundle adjustment (BA). BADGR is conditioned on dense per-entity outputs from a single-step Levenberg-Marquardt (LM) optimizer. Our experiments and analysis validate our method, which significantly outperforms the state of the art in pose and floor plan layout reconstruction across different input densities.
- Score: 9.490537585601395
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Reconstructing precise camera poses and floor plan layouts from wide-baseline RGB panoramas is a difficult and unsolved problem. We introduce BADGR, a novel diffusion model that jointly performs reconstruction and bundle adjustment (BA) to refine poses and layouts from a coarse state, using 1D floor boundary predictions from dozens of images of varying input densities. Unlike a guided diffusion model, BADGR is conditioned on dense per-entity outputs from a single-step Levenberg-Marquardt (LM) optimizer and is trained to predict camera and wall positions while minimizing reprojection errors for view consistency. The layout generation objective of the denoising diffusion process complements BA optimization by providing additional learned layout-structural constraints on top of the co-visible features across images. These constraints help BADGR make plausible guesses about spatial relations that help constrain the pose graph, such as wall adjacency and collinearity, and learn to mitigate errors from dense boundary observations using global context. BADGR trains exclusively on 2D floor plans, simplifying data acquisition, enabling robust augmentation, and supporting a variety of input densities. Our experiments and analysis validate our method, which significantly outperforms the state of the art in pose and floor plan layout reconstruction across different input densities.
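The abstract refers to a single-step Levenberg-Marquardt (LM) optimizer without giving details. As a rough illustration only, and not the authors' implementation, the sketch below shows what one generic LM update looks like in NumPy: it solves the damped normal equations (J^T J + damping * I) dx = -J^T r once. The function name `lm_step`, the damping value, and the toy problem (refining a 2D camera position from distances to four known corners) are all hypothetical and chosen for illustration.

```python
import numpy as np

def lm_step(residual_fn, jacobian_fn, x, damping=1e-3):
    """One generic Levenberg-Marquardt update (illustrative, not BADGR's code).

    Solves (J^T J + damping * I) dx = -J^T r and returns the updated state and the step.
    """
    r = residual_fn(x)                       # residual vector at the current state
    J = jacobian_fn(x)                       # Jacobian of the residuals w.r.t. x
    H = J.T @ J + damping * np.eye(x.size)   # damped Gauss-Newton approximation of the Hessian
    g = J.T @ r                              # gradient of 0.5 * ||r||^2
    dx = np.linalg.solve(H, -g)
    return x + dx, dx

# Hypothetical toy problem: a room with four known corners and a camera whose
# 2D position is refined from observed camera-to-corner distances.
corners = np.array([[0.0, 0.0], [4.0, 0.0], [4.0, 3.0], [0.0, 3.0]])
true_pos = np.array([1.0, 2.0])
observed = np.linalg.norm(corners - true_pos, axis=1)

def residual(p):
    # Predicted minus observed distance for each corner.
    return np.linalg.norm(corners - p, axis=1) - observed

def jacobian(p):
    # d/dp ||p - c_i|| = (p - c_i) / ||p - c_i||, one row per corner.
    diff = p - corners
    return diff / np.linalg.norm(diff, axis=1, keepdims=True)

coarse = np.array([1.5, 1.5])                # coarse initial camera position
refined, step = lm_step(residual, jacobian, coarse)
print("coarse error: ", np.linalg.norm(coarse - true_pos))
print("refined error:", np.linalg.norm(refined - true_pos))
```

In a classical BA pipeline this step would be iterated, with the damping adjusted between iterations, until convergence. According to the abstract, BADGR instead conditions its diffusion model on the per-entity outputs of a single such step; how those outputs are encoded is not described on this page.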
Related papers
- PIDLoc: Cross-View Pose Optimization Network Inspired by PID Controllers [7.582581416640314]
PIDLoc is a novel cross-view pose optimization approach inspired by the proportional-integral-derivative (PID) controller.
PIDLoc achieves state-of-the-art performance in cross-view pose estimation on the KITTI dataset, reducing position error by 37.8% compared with the previous state of the art.
arXiv Detail & Related papers (2025-03-04T08:24:08Z)
- Boundary Attention Constrained Zero-Shot Layout-To-Image Generation [47.435234391588494]
Recent text-to-image diffusion models excel at generating high-resolution images from text but struggle with precise control over spatial composition and object counting.
We propose a novel zero-shot L2I approach, BACON, which eliminates the need for additional modules or fine-tuning.
We leverage pixel-to-pixel correlations in the self-attention feature maps to align cross-attention maps and combine three loss functions constrained by boundary attention to update latent features.
arXiv Detail & Related papers (2024-11-15T05:44:45Z)
- Mitigating Data Consistency Induced Discrepancy in Cascaded Diffusion Models for Sparse-view CT Reconstruction [4.227116189483428]
This study introduces a novel Cascaded Diffusion with Discrepancy Mitigation framework.
It includes the low-quality image generation in latent space and the high-quality image generation in pixel space.
It minimizes computational costs by moving some inference steps from pixel space to latent space.
arXiv Detail & Related papers (2024-03-14T12:58:28Z)
- SA-MixNet: Structure-aware Mixup and Invariance Learning for Scribble-supervised Road Extraction in Remote Sensing Images [85.52629779976137]
We propose a structure-aware Mixup scheme to paste road regions from one image onto another for creating an image scene with increased complexity.
A discriminator-based regularization is designed to enhance connectivity while preserving the structure of roads.
Our framework demonstrates superior performance on the DeepGlobe, Wuhan, and Massachusetts datasets.
arXiv Detail & Related papers (2024-03-03T02:56:43Z)
- 360 Layout Estimation via Orthogonal Planes Disentanglement and Multi-view Geometric Consistency Perception [56.84921040837699]
Existing panoramic layout estimation solutions tend to recover room boundaries from a vertically compressed sequence, yielding imprecise results.
We propose an orthogonal plane disentanglement network (termed DOPNet) to distinguish ambiguous semantics.
We also present an unsupervised adaptation technique tailored for horizon-depth and ratio representations.
Our solution outperforms other SoTA models on both monocular layout estimation and multi-view layout estimation tasks.
arXiv Detail & Related papers (2023-12-26T12:16:03Z)
- GIFD: A Generative Gradient Inversion Method with Feature Domain Optimization [52.55628139825667]
Federated Learning (FL) has emerged as a promising distributed machine learning framework to preserve clients' privacy.
Recent studies find that an attacker can invert the shared gradients and recover sensitive data against an FL system by leveraging pre-trained generative adversarial networks (GAN) as prior knowledge.
We propose Gradient Inversion over Feature Domains (GIFD), which disassembles the GAN model and searches the feature domains of the intermediate layers.
arXiv Detail & Related papers (2023-08-09T04:34:21Z)
- A Unified Conditional Framework for Diffusion-based Image Restoration [39.418415473235235]
We present a unified conditional framework based on diffusion models for image restoration.
We leverage a lightweight UNet to predict initial guidance and the diffusion model to learn the residual of the guidance.
To handle high-resolution images, we propose a simple yet effective inter-step patch-splitting strategy.
arXiv Detail & Related papers (2023-05-31T17:22:24Z)
- Hierarchical Integration Diffusion Model for Realistic Image Deblurring [71.76410266003917]
Diffusion models (DMs) have been introduced in image deblurring and exhibited promising performance.
We propose the Hierarchical Integration Diffusion Model (HI-Diff), for realistic image deblurring.
Experiments on synthetic and real-world blur datasets demonstrate that our HI-Diff outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-05-22T12:18:20Z)
- Generative Diffusion Prior for Unified Image Restoration and Enhancement [62.76390152617949]
Existing image restoration methods mostly leverage the posterior distribution of natural images.
We propose the Generative Diffusion Prior (GDP) to effectively model the posterior distributions in an unsupervised sampling manner.
GDP utilizes a pre-trained denoising diffusion generative model (DDPM) for solving linear inverse, non-linear, or blind problems.
arXiv Detail & Related papers (2023-04-03T16:52:43Z)
- Visual SLAM with Graph-Cut Optimized Multi-Plane Reconstruction [11.215334675788952]
This paper presents a semantic planar SLAM system that improves pose estimation and mapping using cues from an instance planar segmentation network.
While mainstream approaches use RGB-D sensors, employing a monocular camera with such a system still faces challenges such as robust data association and precise geometric model fitting.
arXiv Detail & Related papers (2021-08-09T18:16:08Z)
- Light Field Spatial Super-resolution via Deep Combinatorial Geometry Embedding and Structural Consistency Regularization [99.96632216070718]
Light field (LF) images acquired by hand-held devices usually suffer from low spatial resolution.
The high-dimensional spatiality characteristic and complex geometrical structure of LF images make the problem more challenging than traditional single-image SR.
We propose a novel learning-based LF framework, in which each view of an LF image is first individually super-resolved.
arXiv Detail & Related papers (2020-04-05T14:39:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.