Both Style and Distortion Matter: Dual-Path Unsupervised Domain
Adaptation for Panoramic Semantic Segmentation
- URL: http://arxiv.org/abs/2303.14360v1
- Date: Sat, 25 Mar 2023 04:57:45 GMT
- Title: Both Style and Distortion Matter: Dual-Path Unsupervised Domain
Adaptation for Panoramic Semantic Segmentation
- Authors: Xu Zheng, Jinjing Zhu, Yexin Liu, Zidong Cao, Chong Fu, Lin Wang
- Abstract summary: The ability of scene understanding has sparked active research for panoramic image semantic segmentation.
Some works treat the equirectangular projection (ERP) and pinhole images equally and transfer knowledge from the pinhole to ERP images via unsupervised domain adaptation (UDA).
We propose a novel yet flexible dual-path UDA framework, DPPASS, taking ERP and tangent projection (TP) images as inputs.
- Score: 4.566642023113164
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The ability of scene understanding has sparked active research for panoramic
image semantic segmentation. However, the performance is hampered by distortion
of the equirectangular projection (ERP) and a lack of pixel-wise annotations.
For this reason, some works treat the ERP and pinhole images equally and
transfer knowledge from the pinhole to ERP images via unsupervised domain
adaptation (UDA). However, they fail to handle the domain gaps caused by: 1)
the inherent differences between camera sensors and captured scenes; 2) the
distinct image formats (e.g., ERP and pinhole images). In this paper, we
propose a novel yet flexible dual-path UDA framework, DPPASS, taking ERP and
tangent projection (TP) images as inputs. To reduce the domain gaps, we propose
cross-projection and intra-projection training. The cross-projection training
includes tangent-wise feature contrastive training and prediction consistency
training. The former treats features at the same projection locations as
positive examples and features at different locations as negative examples, so
that the models become aware of distortion, while the latter enforces
consistency of cross-model predictions between the ERP and TP. Moreover,
adversarial intra-projection training is proposed to reduce the inherent gap
between the features of the pinhole images and those of the ERP and TP images,
respectively. Importantly, the TP path can
be freely removed after training, leading to no additional inference cost.
Extensive experiments on two benchmarks show that DPPASS improves mIoU by
+1.06% over state-of-the-art approaches.
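The tangent-wise feature contrastive training described in the abstract can be illustrated with a minimal sketch. It assumes an InfoNCE-style loss over cosine similarities, which the abstract does not specify; the function names, the `temperature` parameter, and the one-vector-per-tangent-location layout are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def tangent_contrastive_loss(erp_feats, tp_feats, temperature=0.1):
    """InfoNCE-style loss (an assumption, not the paper's exact formulation).

    erp_feats[i] and tp_feats[i] are features taken at the same tangent
    projection location i and form the positive pair; every tp_feats[j]
    with j != i serves as a negative for erp_feats[i].
    """
    n = len(erp_feats)
    total = 0.0
    for i in range(n):
        logits = [cosine(erp_feats[i], tp_feats[j]) / temperature for j in range(n)]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denom - logits[i]
    return total / n


# Toy features: two tangent locations, two-dimensional features.
erp = [[1.0, 0.0], [0.0, 1.0]]
tp_aligned = [[1.0, 0.0], [0.0, 1.0]]   # same-location features agree
tp_swapped = [[0.0, 1.0], [1.0, 0.0]]   # same-location features disagree

loss_aligned = tangent_contrastive_loss(erp, tp_aligned)
loss_swapped = tangent_contrastive_loss(erp, tp_swapped)
print(loss_aligned < loss_swapped)
```

In this toy setup the loss is small when ERP and TP features at the same location agree and large when they are swapped, which is the behaviour a location-based positive/negative assignment is meant to encourage.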
Related papers
- 360SFUDA++: Towards Source-free UDA for Panoramic Segmentation by Learning Reliable Category Prototypes [15.367186190755003]
We address the challenging source-free unsupervised domain adaptation (SFUDA) for pinhole-to-panoramic semantic segmentation.
360SFUDA++ effectively extracts knowledge from the source pinhole model with only unlabeled panoramic images.
arXiv Detail & Related papers (2024-04-25T10:52:08Z)
- Learning to Rank Patches for Unbiased Image Redundancy Reduction [80.93989115541966]
Images suffer from heavy spatial redundancy because pixels in neighboring regions are spatially correlated.
Existing approaches strive to overcome this limitation by reducing less meaningful image regions.
We propose a self-supervised framework for image redundancy reduction called Learning to Rank Patches.
arXiv Detail & Related papers (2024-03-31T13:12:41Z)
- Semantics, Distortion, and Style Matter: Towards Source-free UDA for Panoramic Segmentation [15.367186190755003]
This paper addresses a problem -- source-free unsupervised domain adaptation (SFUDA) for pinhole-to-panoramic semantic segmentation.
Tackling this problem is nontrivial due to the semantic mismatches, style discrepancies, and inevitable distortion of panoramic images.
We propose a novel method that utilizes Tangent Projection (TP) as it has less distortion and slits the equirectangular projection (ERP) with a fixed FoV to mimic the pinhole images.
arXiv Detail & Related papers (2024-03-19T07:11:53Z)
- Adversarial Supervision Makes Layout-to-Image Diffusion Models Thrive [21.49096276631859]
Current L2I models either suffer from poor editability via text or weak alignment between the generated image and the input layout.
We propose to integrate adversarial supervision into the conventional training pipeline of L2I diffusion models (ALDM).
Specifically, we employ a segmentation-based discriminator which provides explicit feedback to the diffusion generator on the pixel-level alignment between the denoised image and the input layout.
arXiv Detail & Related papers (2024-01-16T20:31:46Z)
- Look at the Neighbor: Distortion-aware Unsupervised Domain Adaptation for Panoramic Semantic Segmentation [5.352137021024213]
The aim is to tackle the domain gaps caused by the style disparities and the distortion problem arising from the non-uniformly distributed pixels of the equirectangular projection (ERP).
We propose a novel UDA framework that can effectively address the distortion problems for panoramic semantic segmentation.
arXiv Detail & Related papers (2023-08-10T10:47:12Z)
- PiPa: Pixel- and Patch-wise Self-supervised Learning for Domain Adaptative Semantic Segmentation [100.6343963798169]
Unsupervised Domain Adaptation (UDA) aims to enhance the generalization of the learned model to other domains.
We propose a unified pixel- and patch-wise self-supervised learning framework, called PiPa, for domain adaptive semantic segmentation.
arXiv Detail & Related papers (2022-11-14T18:31:24Z)
- BDA-SketRet: Bi-Level Domain Adaptation for Zero-Shot SBIR [52.78253400327191]
BDA-SketRet is a novel framework performing a bi-level domain adaptation for aligning the spatial and semantic features of the visual data pairs.
Experimental results on the extended Sketchy, TU-Berlin, and QuickDraw exhibit sharp improvements over the literature.
arXiv Detail & Related papers (2022-01-17T18:45:55Z)
- PIT: Position-Invariant Transform for Cross-FoV Domain Adaptation [53.428312630479816]
We observe that the Field of View (FoV) gap induces noticeable instance appearance differences between the source and target domains.
Motivated by these observations, we propose the Position-Invariant Transform (PIT) to better align images in different domains.
arXiv Detail & Related papers (2021-08-16T15:16:47Z)
- DSP: Dual Soft-Paste for Unsupervised Domain Adaptive Semantic Segmentation [97.74059510314554]
Unsupervised domain adaptation (UDA) for semantic segmentation aims to adapt a segmentation model trained on the labeled source domain to the unlabeled target domain.
Existing methods try to learn domain invariant features while suffering from large domain gaps.
We propose a novel Dual Soft-Paste (DSP) method in this paper.
arXiv Detail & Related papers (2021-07-20T16:22:40Z)
- Unsupervised Bidirectional Cross-Modality Adaptation via Deeply Synergistic Image and Feature Alignment for Medical Image Segmentation [73.84166499988443]
We present a novel unsupervised domain adaptation framework named Synergistic Image and Feature Alignment (SIFA).
Our proposed SIFA conducts synergistic alignment of domains from both image and feature perspectives.
Experimental results on two different tasks demonstrate that our SIFA method is effective in improving segmentation performance on unlabeled target images.
arXiv Detail & Related papers (2020-02-06T13:49:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.