Semi-Supervised 360 Layout Estimation with Panoramic Collaborative Perturbations
- URL: http://arxiv.org/abs/2503.01114v1
- Date: Mon, 03 Mar 2025 02:49:20 GMT
- Title: Semi-Supervised 360 Layout Estimation with Panoramic Collaborative Perturbations
- Authors: Junsong Zhang, Chunyu Lin, Zhijie Shen, Lang Nie, Kang Liao, Yao Zhao
- Abstract summary: We propose a novel semi-supervised method named SemiLayout360, which incorporates the priors of the panoramic layout and distortion through collaborative perturbations. Our experimental results on three mainstream benchmarks demonstrate that the proposed method offers significant advantages over existing state-of-the-art (SoTA) solutions.
- Score: 56.84921040837699
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The performance of existing supervised layout estimation methods heavily relies on the quality of data annotations. However, obtaining large-scale and high-quality datasets remains a laborious and time-consuming challenge. To solve this problem, semi-supervised approaches are introduced to relieve the demand for expensive data annotations by encouraging the consistent results of unlabeled data with different perturbations. However, existing solutions merely employ vanilla perturbations, ignoring the characteristics of panoramic layout estimation. In contrast, we propose a novel semi-supervised method named SemiLayout360, which incorporates the priors of the panoramic layout and distortion through collaborative perturbations. Specifically, we leverage the panoramic layout prior to enhance the model's focus on potential layout boundaries. Meanwhile, we introduce the panoramic distortion prior to strengthen distortion awareness. Furthermore, to prevent intense perturbations from hindering model convergence and ensure the effectiveness of prior-based perturbations, we divide and reorganize them as panoramic collaborative perturbations. Our experimental results on three mainstream benchmarks demonstrate that the proposed method offers significant advantages over existing state-of-the-art (SoTA) solutions.
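The consistency idea the abstract builds on (unlabeled data should yield similar predictions under different perturbations) can be illustrated with a minimal sketch. This is not the authors' SemiLayout360 implementation: `toy_model`, `weak_perturb`, `strong_perturb`, the L1 losses, and the weight `lam` are all hypothetical stand-ins, and the prior-based panoramic perturbations described in the abstract are not reproduced here.

```python
import numpy as np

def weak_perturb(x, rng):
    """Hypothetical weak perturbation: small additive Gaussian noise."""
    return x + 0.01 * rng.standard_normal(x.shape)

def strong_perturb(x, rng, drop_prob=0.3):
    """Hypothetical strong perturbation: zero out random image columns."""
    keep = rng.random(x.shape[-1]) >= drop_prob  # one flag per column
    return x * keep  # broadcasts over the width axis

def toy_model(x, w):
    """Stand-in for a layout network: one boundary value per column.

    x has shape (H, W), w has shape (H,), and the output has shape (W,).
    """
    return w @ x

def consistency_loss(x_unlabeled, w, rng):
    """Mean absolute difference between predictions on two perturbed views.

    In a real training loop the weak-view prediction would usually be
    treated as a fixed target (no gradient); NumPy has no autograd, so
    that detail is only noted here.
    """
    pred_weak = toy_model(weak_perturb(x_unlabeled, rng), w)
    pred_strong = toy_model(strong_perturb(x_unlabeled, rng), w)
    return float(np.mean(np.abs(pred_weak - pred_strong)))

def semi_supervised_loss(x_labeled, y_labeled, x_unlabeled, w, rng, lam=1.0):
    """Supervised L1 loss plus a weighted consistency term (assumed form)."""
    supervised = float(np.mean(np.abs(toy_model(x_labeled, w) - y_labeled)))
    return supervised + lam * consistency_loss(x_unlabeled, w, rng)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    height, width = 8, 16
    w = rng.standard_normal(height)
    x_l = rng.random((height, width))
    y_l = rng.random(width)
    x_u = rng.random((height, width))
    print(semi_supervised_loss(x_l, y_l, x_u, w, rng))
```

Setting `lam=0.0` reduces the objective to the purely supervised loss, which makes it easy to compare training with and without the unlabeled term.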
Related papers
- Semi-Supervised Fine-Tuning of Vision Foundation Models with Content-Style Decomposition [4.192370959537781]
We present a semi-supervised fine-tuning approach designed to improve the performance of pre-trained foundation models on downstream tasks with limited labeled data.
We evaluate our approach on multiple datasets, including MNIST, its augmented variations, CIFAR-10, SVHN, and GalaxyMNIST.
arXiv Detail & Related papers (2024-10-02T22:36:12Z)
- Explanatory Model Monitoring to Understand the Effects of Feature Shifts on Performance [61.06245197347139]
We propose a novel approach to explain the behavior of a black-box model under feature shifts.
We refer to our method that combines concepts from Optimal Transport and Shapley Values as Explanatory Performance Estimation.
arXiv Detail & Related papers (2024-08-24T18:28:19Z)
- MSP-MVS: Multi-Granularity Segmentation Prior Guided Multi-View Stereo [8.303396507129266]
MSP-MVS is a method that introduces a multi-granularity segmentation prior for edge-confined patch deformation.
We implement equidistribution and disassemble-clustering of correlative reliable pixels.
We also introduce disparity-sampling synergistic 3D optimization to help identify global-minimum matching costs.
arXiv Detail & Related papers (2024-07-27T19:00:44Z)
- Prototype Clustered Diffusion Models for Versatile Inverse Problems [11.55838697574475]
We show that the measurement-based likelihood can be renovated with restoration-based likelihood via the opposite probabilistic graphic direction.
We can resolve inverse problems with a bunch of choices for assorted sample quality and realize proficient deterioration control with assured realism.
arXiv Detail & Related papers (2024-07-13T04:24:53Z)
- Latent Embedding Clustering for Occlusion Robust Head Pose Estimation [7.620379605206596]
Head pose estimation has become a crucial area of research in computer vision given its usefulness in a wide range of applications.
One of the most difficult challenges in this field is managing head occlusions that frequently take place in real-world scenarios.
We propose a novel and efficient framework that is robust in real world head occlusion scenarios.
arXiv Detail & Related papers (2024-03-29T15:57:38Z)
- 360 Layout Estimation via Orthogonal Planes Disentanglement and Multi-view Geometric Consistency Perception [56.84921040837699]
Existing panoramic layout estimation solutions tend to recover room boundaries from a vertically compressed sequence, yielding imprecise results.
We propose an orthogonal plane disentanglement network (termed DOPNet) to distinguish ambiguous semantics.
We also present an unsupervised adaptation technique tailored for horizon-depth and ratio representations.
Our solution outperforms other SoTA models on both monocular layout estimation and multi-view layout estimation tasks.
arXiv Detail & Related papers (2023-12-26T12:16:03Z)
- LoLep: Single-View View Synthesis with Locally-Learned Planes and Self-Attention Occlusion Inference [66.45326873274908]
We propose a novel method, LoLep, which regresses Locally-Learned planes from a single RGB image to represent scenes accurately.
Compared to MINE, our approach has an LPIPS reduction of 4.8%-9.0% and an RV reduction of 73.9%-83.5%.
arXiv Detail & Related papers (2023-07-23T03:38:55Z)
- Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose Estimation [70.32536356351706]
We introduce MRP-Net that constitutes a common deep network backbone with two output heads subscribing to two diverse configurations.
We derive suitable measures to quantify prediction uncertainty at both pose and joint level.
We present a comprehensive evaluation of the proposed approach and demonstrate state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2022-03-29T07:14:58Z)
- Supercharging Imbalanced Data Learning With Energy-based Contrastive Representation Transfer [72.5190560787569]
In computer vision, learning from long tailed datasets is a recurring theme, especially for natural image datasets.
Our proposal posits a meta-distributional scenario, where the data generating mechanism is invariant across the label-conditional feature distributions.
This allows us to leverage a causal data inflation procedure to enlarge the representation of minority classes.
arXiv Detail & Related papers (2020-11-25T00:13:11Z)