Semantics, Distortion, and Style Matter: Towards Source-free UDA for Panoramic Segmentation
- URL: http://arxiv.org/abs/2403.12505v2
- Date: Fri, 22 Mar 2024 15:41:20 GMT
- Title: Semantics, Distortion, and Style Matter: Towards Source-free UDA for Panoramic Segmentation
- Authors: Xu Zheng, Pengyuan Zhou, Athanasios V. Vasilakos, Lin Wang
- Abstract summary: This paper addresses a problem -- source-free unsupervised domain adaptation (SFUDA) for pinhole-to-panoramic semantic segmentation.
Tackling this problem is nontrivial due to the semantic mismatches, style discrepancies, and inevitable distortion of panoramic images.
We propose a novel method that utilizes Tangent Projection (TP) as it has less distortion and slits the equirectangular projection (ERP) with a fixed FoV to mimic the pinhole images.
- Score: 15.367186190755003
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper addresses an interesting yet challenging problem -- source-free unsupervised domain adaptation (SFUDA) for pinhole-to-panoramic semantic segmentation -- given only a pinhole image-trained model (i.e., source) and unlabeled panoramic images (i.e., target). Tackling this problem is nontrivial due to the semantic mismatches, style discrepancies, and inevitable distortion of panoramic images. To this end, we propose a novel method that utilizes Tangent Projection (TP) as it has less distortion and meanwhile slits the equirectangular projection (ERP) with a fixed FoV to mimic the pinhole images. Both projections are shown effective in extracting knowledge from the source model. However, the distinct projection discrepancies between source and target domains impede the direct knowledge transfer; thus, we propose a panoramic prototype adaptation module (PPAM) to integrate panoramic prototypes from the extracted knowledge for adaptation. We then impose the loss constraints on both predictions and prototypes and propose a cross-dual attention module (CDAM) at the feature level to better align the spatial and channel characteristics across the domains and projections. Both knowledge extraction and transfer processes are synchronously updated to reach the best performance. Extensive experiments on the synthetic and real-world benchmarks, including outdoor and indoor scenarios, demonstrate that our method achieves significantly better performance than prior SFUDA methods for pinhole-to-panoramic adaptation.
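Two ideas in the abstract can be illustrated with a minimal sketch: cutting an ERP image into fixed-FoV strips that loosely mimic pinhole views, and averaging features per class to form prototypes. This is not the authors' implementation. The function names `slice_erp` and `class_prototypes`, the 90-degree FoV, and the plain per-class mean are all assumptions chosen for illustration.

```python
import numpy as np

def slice_erp(erp, fov_deg=90.0):
    """Cut an ERP image (H, W, C) into vertical strips.

    Each strip spans fov_deg of longitude (the ERP width covers 360 degrees).
    Hypothetical helper: the paper's exact slicing scheme is not given here.
    """
    width = erp.shape[1]
    strip_w = int(round(width * fov_deg / 360.0))
    n_strips = width // strip_w
    return [erp[:, i * strip_w:(i + 1) * strip_w] for i in range(n_strips)]

def class_prototypes(features, labels, num_classes):
    """Per-class mean feature vector.

    features: (N, D) float array of pixel embeddings.
    labels:   (N,) int array of (pseudo-)labels.
    Classes with no pixels keep an all-zero prototype (an assumption).
    """
    protos = np.zeros((num_classes, features.shape[1]), dtype=np.float64)
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            protos[c] = features[mask].mean(axis=0)
    return protos

# Usage: a toy 4x16 ERP image and three labeled embeddings.
strips = slice_erp(np.zeros((4, 16, 3)), fov_deg=90.0)
protos = class_prototypes(
    np.array([[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]]),
    np.array([0, 0, 1]),
    num_classes=3,
)
```

With a 90-degree FoV the toy image yields four strips of width 4. A real pipeline would operate on network feature maps and pseudo-labels rather than hand-written arrays.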
Related papers
- Mind the Gap Between Prototypes and Images in Cross-domain Finetuning [64.97317635355124]
We propose a contrastive prototype-image adaptation (CoPA) to adapt different transformations respectively for prototypes and images.
Experiments on Meta-Dataset demonstrate that CoPA achieves the state-of-the-art performance more efficiently.
arXiv Detail & Related papers (2024-10-16T11:42:11Z)
- Reducing Semantic Ambiguity In Domain Adaptive Semantic Segmentation Via Probabilistic Prototypical Pixel Contrast [7.092718945468069]
Domain adaptation aims to reduce the model degradation on the target domain caused by the domain shift between the source and target domains.
Probabilistic prototypical pixel contrast (PPPC) is a universal adaptation framework that models each pixel embedding as a probability.
PPPC not only helps address ambiguity at the pixel level, yielding discriminative representations, but also achieves significant improvements in both synthetic-to-real and day-to-night adaptation tasks.
arXiv Detail & Related papers (2024-09-27T08:25:03Z)
- 360SFUDA++: Towards Source-free UDA for Panoramic Segmentation by Learning Reliable Category Prototypes [15.367186190755003]
We address the challenging source-free unsupervised domain adaptation (SFUDA) for pinhole-to-panoramic semantic segmentation.
360SFUDA++ effectively extracts knowledge from the source pinhole model with only unlabeled panoramic images.
arXiv Detail & Related papers (2024-04-25T10:52:08Z)
- Forgery-aware Adaptive Transformer for Generalizable Synthetic Image Detection [106.39544368711427]
We study the problem of generalizable synthetic image detection, aiming to detect forgery images from diverse generative methods.
We present a novel forgery-aware adaptive transformer approach, namely FatFormer.
Our approach, tuned on 4-class ProGAN data, attains an average of 98% accuracy on unseen GANs, and surprisingly generalizes to unseen diffusion models with 95% accuracy.
arXiv Detail & Related papers (2023-12-27T17:36:32Z)
- Look at the Neighbor: Distortion-aware Unsupervised Domain Adaptation for Panoramic Semantic Segmentation [5.352137021024213]
The aim is to tackle the domain gaps caused by the style disparities and the distortion problem arising from the non-uniformly distributed pixels of equirectangular projection (ERP).
We propose a novel UDA framework that can effectively address the distortion problems for panoramic semantic segmentation.
arXiv Detail & Related papers (2023-08-10T10:47:12Z)
- Both Style and Distortion Matter: Dual-Path Unsupervised Domain Adaptation for Panoramic Semantic Segmentation [4.566642023113164]
The ability of scene understanding has sparked active research for panoramic image semantic segmentation.
Some works treat the equirectangular projection (ERP) and pinhole images equally and transfer knowledge from the pinhole to ERP images via unsupervised domain adaptation (UDA).
We propose a novel yet flexible dual-path UDA framework, DPPASS, taking ERP and tangent projection (TP) images as inputs.
arXiv Detail & Related papers (2023-03-25T04:57:45Z)
- Parallax-Tolerant Unsupervised Deep Image Stitching [57.76737888499145]
We propose UDIS++, a parallax-tolerant unsupervised deep image stitching technique.
First, we propose a robust and flexible warp to model the image registration from global homography to local thin-plate spline motion.
To further eliminate the parallax artifacts, we propose to composite the stitched image seamlessly by unsupervised learning for seam-driven composition masks.
arXiv Detail & Related papers (2023-02-16T10:40:55Z)
- Towards Effective Image Manipulation Detection with Proposal Contrastive Learning [61.5469708038966]
We propose Proposal Contrastive Learning (PCL) for effective image manipulation detection.
Our PCL uses a two-stream architecture that extracts two types of global features, one from the RGB view and one from the noise view.
Our PCL can be easily adapted to unlabeled data in practice, which can reduce manual labeling costs and promote more generalizable features.
arXiv Detail & Related papers (2022-10-16T13:30:13Z)
- Feature Alignment by Uncertainty and Self-Training for Source-Free Unsupervised Domain Adaptation [1.6498361958317636]
Most unsupervised domain adaptation (UDA) methods assume that labeled source images are available during model adaptation.
We propose a source-free UDA method that uses only a pre-trained source model and unlabeled target images.
Our method captures the aleatoric uncertainty by incorporating data augmentation and trains the feature generator with two consistency objectives.
arXiv Detail & Related papers (2022-08-31T14:28:36Z)
- BDA-SketRet: Bi-Level Domain Adaptation for Zero-Shot SBIR [52.78253400327191]
BDA-SketRet is a novel framework performing a bi-level domain adaptation for aligning the spatial and semantic features of the visual data pairs.
Experimental results on the extended Sketchy, TU-Berlin, and QuickDraw exhibit sharp improvements over the literature.
arXiv Detail & Related papers (2022-01-17T18:45:55Z)
- Semi-Supervised Domain Adaptation with Prototypical Alignment and Consistency Learning [86.6929930921905]
This paper studies how much it can help address domain shifts if we further have a few target samples labeled.
To explore the full potential of landmarks, we incorporate a prototypical alignment (PA) module which calculates a target prototype for each class from the landmarks.
Specifically, we severely perturb the labeled images, making PA non-trivial to achieve and thus promoting model generalizability.
arXiv Detail & Related papers (2021-04-19T08:46:08Z)