Multi-source Domain Adaptation for Panoramic Semantic Segmentation
- URL: http://arxiv.org/abs/2408.16469v2
- Date: Tue, 07 Jan 2025 12:02:22 GMT
- Title: Multi-source Domain Adaptation for Panoramic Semantic Segmentation
- Authors: Jing Jiang, Sicheng Zhao, Jiankun Zhu, Wenbo Tang, Zhaopan Xu, Jidong Yang, Guoping Liu, Tengfei Xing, Pengfei Xu, Hongxun Yao
- Abstract summary: Methods for panoramic semantic segmentation utilize real pinhole images or low-cost synthetic panoramic images to transfer segmentation models to real panoramic images.
MSDA4PASS uses both real pinhole and synthetic panoramic images to improve segmentation on unlabeled real panoramic images.
The proposed framework, DTA4PASS, consists of two main components: Unpaired Semantic Morphing (USM) and Distortion Gating Alignment (DGA).
- Score: 21.6293634368587
- Abstract: Unsupervised domain adaptation methods for panoramic semantic segmentation utilize real pinhole images or low-cost synthetic panoramic images to transfer segmentation models to real panoramic images. However, these methods struggle to understand the panoramic structure using only real pinhole images and lack real-world scene perception with only synthetic panoramic images. Therefore, in this paper, we propose a new task, Multi-source Domain Adaptation for Panoramic Semantic Segmentation (MSDA4PASS), which leverages both real pinhole and synthetic panoramic images to improve segmentation on unlabeled real panoramic images. There are two key issues in the MSDA4PASS task: (1) distortion gaps between the pinhole and panoramic domains -- panoramic images exhibit global and local distortions absent in pinhole images; (2) texture gaps between the source and target domains -- scenes and styles differ across domains. To address these two issues, we propose a novel framework, Deformation Transform Aligner for Panoramic Semantic Segmentation (DTA4PASS), which converts all pinhole images in the source domains into distorted images and aligns the source distorted and panoramic images with the target panoramic images. Specifically, DTA4PASS consists of two main components: Unpaired Semantic Morphing (USM) and Distortion Gating Alignment (DGA). First, in USM, the Dual-view Discriminator (DvD) assists in training the diffeomorphic deformation network at the image and pixel level, enabling the effective deformation transformation of pinhole images without paired panoramic views, alleviating distortion gaps. Second, DGA assigns pinhole-like (pin-like) and panoramic-like (pan-like) features to each image by gating, and aligns these two features through uncertainty estimation, reducing texture gaps.
Related papers
- DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion [60.45000652592418]
We propose a novel text-driven panoramic generation framework, DiffPano, to achieve scalable, consistent, and diverse panoramic scene generation.
We show that DiffPano can generate consistent, diverse panoramic images with given unseen text descriptions and camera poses.
arXiv Detail & Related papers (2024-10-31T17:57:02Z)
- Progressive Retinal Image Registration via Global and Local Deformable Transformations [49.032894312826244]
We propose a hybrid registration framework called HybridRetina.
We use a keypoint detector and a deformation network called GAMorph to estimate the global transformation and local deformable transformation.
Experiments on two widely-used datasets, FIRE and FLoRI21, show that our proposed HybridRetina significantly outperforms some state-of-the-art methods.
arXiv Detail & Related papers (2024-09-02T08:43:50Z)
- PanoSwin: a Pano-style Swin Transformer for Panorama Understanding [15.115868803355081]
Equirectangular projection (ERP) entails boundary discontinuity and spatial distortion.
We propose PanoSwin to learn panorama representations with ERP.
We conduct experiments against the state-of-the-art on various panoramic tasks.
arXiv Detail & Related papers (2023-08-28T17:30:14Z)
- PanoGRF: Generalizable Spherical Radiance Fields for Wide-baseline Panoramas [54.4948540627471]
We propose PanoGRF, Generalizable Spherical Radiance Fields for Wide-baseline Panoramas.
Unlike generalizable radiance fields trained on perspective images, PanoGRF avoids the information loss from panorama-to-perspective conversion.
Results on multiple panoramic datasets demonstrate that PanoGRF significantly outperforms state-of-the-art generalizable view synthesis methods.
arXiv Detail & Related papers (2023-06-02T13:35:07Z)
- Panoramic Image-to-Image Translation [37.9486466936501]
We tackle the challenging task of Panoramic Image-to-Image translation (Pano-I2I) for the first time.
This task is difficult due to the geometric distortion of panoramic images and the lack of a panoramic image dataset with diverse conditions, like weather or time.
We propose a panoramic distortion-aware I2I model that preserves the structure of the panoramic images while consistently translating their global style referenced from a pinhole image.
arXiv Detail & Related papers (2023-04-11T04:08:58Z)
- Behind Every Domain There is a Shift: Adapting Distortion-aware Vision Transformers for Panoramic Semantic Segmentation [73.48323921632506]
We address panoramic semantic segmentation which is under-explored due to two critical challenges.
First, we propose an upgraded Transformer for Panoramic Semantic Segmentation, i.e., Trans4PASS+, equipped with Deformable Patch Embedding (DPE) and Deformable MLP (DMLPv2) modules.
Second, we enhance the Mutual Prototypical Adaptation (MPA) strategy via pseudo-label rectification for unsupervised domain adaptive panoramic segmentation.
Third, aside from Pinhole-to-Panoramic (Pin2Pan) adaptation, we create a new dataset (SynPASS) with 9,080 panoramic images.
arXiv Detail & Related papers (2022-07-25T00:42:38Z)
- Cross-View Panorama Image Synthesis [68.35351563852335]
PanoGAN is a novel adversarial feedback GAN framework.
PanoGAN enables high-quality panorama image generation with more convincing details than state-of-the-art approaches.
arXiv Detail & Related papers (2022-03-22T15:59:44Z)
- Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation [26.09267582056609]
Large quantities of expensive, pixel-wise annotations are crucial for the success of robust panoramic segmentation models.
Distortions and the distinct image-feature distribution in 360-degree panoramas impede the transfer from the annotation-rich pinhole domain.
We learn object deformations and panoramic image distortions in Deformable Patch Embedding (DPE) and Deformable MLP (DMLP) components.
Finally, we tie together shared semantics in pinhole- and panoramic feature embeddings by generating multi-scale prototype features.
arXiv Detail & Related papers (2022-03-02T23:00:32Z)
- DensePASS: Dense Panoramic Semantic Segmentation via Unsupervised Domain Adaptation with Attention-Augmented Context Exchange [32.29797061415896]
We formalize the task of unsupervised domain adaptation for panoramic semantic segmentation.
A network trained on labelled examples from the source domain of pinhole camera data is deployed in a different target domain of panoramic images.
We build a generic framework for cross-domain panoramic semantic segmentation based on different variants of attention-augmented domain adaptation modules.
arXiv Detail & Related papers (2021-08-13T20:15:46Z)
- Panoramic Panoptic Segmentation: Towards Complete Surrounding Understanding via Unsupervised Contrastive Learning [97.37544023666833]
We introduce panoramic panoptic segmentation as the most holistic scene understanding.
A complete surrounding understanding provides a maximum of information to the agent.
We propose a framework which allows model training on standard pinhole images and transfers the learned features to a different domain.
arXiv Detail & Related papers (2021-03-01T09:37:27Z)