Saliency Guided Image Warping for Unsupervised Domain Adaptation
- URL: http://arxiv.org/abs/2403.12712v2
- Date: Wed, 31 Jul 2024 02:33:31 GMT
- Title: Saliency Guided Image Warping for Unsupervised Domain Adaptation
- Authors: Shen Zheng, Anurag Ghosh, Srinivasa G. Narasimhan
- Abstract summary: We improve UDA training by using in-place image warping to focus on salient object regions.
We design instance-level saliency guidance to adaptively oversample object regions.
Our approach improves adaptation across geographies, lighting, and weather conditions.
- Score: 19.144094571994756
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Driving is challenging in conditions like night, rain, and snow. The lack of good labeled datasets has hampered progress in scene understanding under such conditions. Unsupervised domain adaptation (UDA) using large labeled clear-day datasets is a promising research direction in such cases. Current UDA methods, however, treat all image pixels uniformly, leading to over-reliance on the dominant scene backgrounds (e.g., roads, sky, sidewalks) that appear dramatically different across domains. As a result, they struggle to learn effective features of smaller and often sparse foreground objects (e.g., people, vehicles, signs). In this work, we improve UDA training by using in-place image warping to focus on salient object regions. Our insight is that while backgrounds vary significantly across domains (e.g., snowy night vs. clear day), object appearances vary to a lesser extent. Therefore, we design instance-level saliency guidance to adaptively oversample object regions, which reduces adverse effects from background context and enhances backbone feature learning. We then unwarp the better learned features while adapting from source to target. Our approach improves adaptation across geographies, lighting, and weather conditions, and is agnostic to the task (segmentation, detection), domain adaptation algorithm, saliency guidance, and underlying model architecture. Result highlights include +6.1 mAP50 for BDD100K Clear $\rightarrow$ DENSE Foggy, +3.7 mAP50 for BDD100K Day $\rightarrow$ Night, +3.0 mAP50 for BDD100K Clear $\rightarrow$ Rainy, and +6.3 mIoU for Cityscapes $\rightarrow$ ACDC. Our method adds minimal training memory and incurs no additional inference latency. Please see Appendix for more results and analysis.
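To make the warping idea concrete, here is a minimal sketch (not the authors' implementation) of a separable, saliency-guided warp: object regions are rasterized into a saliency map, per-axis sampling densities are integrated into CDFs, and the inverse CDFs define a `grid_sample` grid that allocates more output pixels to salient regions. The flat-prior mixing weight and the box rasterization are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def inverse_cdf(density, n_out):
    """Map uniform output coords to source coords via the density's inverse CDF.
    Dense source regions receive more output samples (i.e., are oversampled)."""
    cdf = torch.cumsum(density, 0)
    cdf = cdf / cdf[-1]
    u = torch.linspace(0.0, 1.0, n_out)
    idx = torch.searchsorted(cdf, u).clamp(max=len(density) - 1)
    return idx.float() / (len(density) - 1) * 2 - 1  # grid_sample coords in [-1, 1]

def saliency_warp(img, saliency, uniform_mix=1.0):
    """In-place warp that oversamples salient rows/columns.
    img: (1, C, H, W); saliency: (H, W) non-negative (e.g., rasterized boxes)."""
    H, W = saliency.shape
    # Mixing in a uniform density keeps the warp smooth and invertible.
    row_d = saliency.sum(1) + uniform_mix * saliency.sum() / H + 1e-6
    col_d = saliency.sum(0) + uniform_mix * saliency.sum() / W + 1e-6
    ys = inverse_cdf(row_d, H)
    xs = inverse_cdf(col_d, W)
    grid = torch.stack(torch.meshgrid(ys, xs, indexing="ij"), dim=-1)
    grid = grid.flip(-1).unsqueeze(0)        # grid_sample expects (x, y) order
    return F.grid_sample(img, grid, align_corners=True)

# usage: oversample one salient object box in a dummy image
img = torch.rand(1, 3, 256, 512)
sal = torch.zeros(256, 512)
sal[160:240, 100:300] = 1.0
warped = saliency_warp(img, sal)
```

Features learned on the warped image would then be unwarped (the inverse mapping of the same grid) before computing the adaptation losses, as the abstract describes.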
Related papers
- Enhancing Autonomous Vehicle Perception in Adverse Weather through Image Augmentation during Semantic Segmentation Training [0.0]
We trained encoder-decoder UNet models to perform semantic segmentation with image augmentations.
Models trained on weather data have significantly lower losses than those trained on augmented data in all conditions except for clear days.
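As a rough illustration of weather-style augmentation during segmentation training (the paper's exact augmentation set is not given here; these transforms are illustrative assumptions):

```python
import torch

def simulate_night(img, gamma=2.5):
    """Crude night effect: gamma darkening. img in [0, 1], shape (C, H, W)."""
    return img.clamp(0, 1) ** gamma

def simulate_fog(img, haze=0.6, airlight=0.8):
    """Crude fog effect: blend toward a flat airlight (single-scattering intuition)."""
    return (1 - haze) * img + haze * torch.full_like(img, airlight)

def augment_for_weather(img):
    """Randomly apply one synthetic condition per training sample."""
    r = torch.rand(1).item()
    if r < 0.33:
        return simulate_night(img)
    if r < 0.66:
        return simulate_fog(img)
    return img  # leave some samples clear
```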
arXiv Detail & Related papers (2024-08-14T00:08:28Z) - RoScenes: A Large-scale Multi-view 3D Dataset for Roadside Perception [98.76525636842177]
RoScenes is the largest multi-view roadside perception dataset.
Our dataset contains a remarkable 21.13M 3D annotations within 64,000 $m^2$.
arXiv Detail & Related papers (2024-05-16T08:06:52Z) - DTBS: Dual-Teacher Bi-directional Self-training for Domain Adaptation in Nighttime Semantic Segmentation [1.7205106391379026]
Nighttime conditions pose a significant challenge for autonomous vehicle perception systems.
Unsupervised domain adaptation (UDA) has been widely applied to semantic segmentation on such images.
We introduce a one-stage Dual-Teacher Bi-directional Self-training (DTBS) framework for smooth knowledge transfer and feedback.
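The self-training machinery behind such frameworks can be sketched as below; the EMA teacher update is standard, while the simple averaging of two teachers' predictions is an illustrative assumption rather than DTBS's exact fusion rule.

```python
import torch

@torch.no_grad()
def ema_update(teacher, student, momentum=0.999):
    """Standard exponential-moving-average teacher update."""
    for pt, ps in zip(teacher.parameters(), student.parameters()):
        pt.mul_(momentum).add_(ps, alpha=1 - momentum)

@torch.no_grad()
def dual_teacher_pseudo_labels(teacher_a, teacher_b, target_img, thresh=0.9):
    """Fuse two teachers' soft predictions into pseudo-labels (illustrative fusion)."""
    prob = (teacher_a(target_img).softmax(1) + teacher_b(target_img).softmax(1)) / 2
    conf, labels = prob.max(1)
    labels[conf < thresh] = 255  # ignore index for low-confidence pixels
    return labels
```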
arXiv Detail & Related papers (2024-01-02T06:56:57Z) - The Change You Want to See (Now in 3D) [65.61789642291636]
The goal of this paper is to detect what has changed, if anything, between two "in the wild" images of the same 3D scene.
We contribute a change detection model that is trained entirely on synthetic data and is class-agnostic.
We release a new evaluation dataset consisting of real-world image pairs with human-annotated differences.
arXiv Detail & Related papers (2023-08-21T01:59:45Z) - MIC: Masked Image Consistency for Context-Enhanced Domain Adaptation [104.40114562948428]
In unsupervised domain adaptation (UDA), a model trained on source data (e.g. synthetic) is adapted to target data (e.g. real-world) without access to target annotation.
We propose a Masked Image Consistency (MIC) module to enhance UDA by learning spatial context relations of the target domain.
MIC significantly improves the state-of-the-art performance across the different recognition tasks for synthetic-to-real, day-to-nighttime, and clear-to-adverse-weather UDA.
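The masking operation at the heart of MIC can be sketched as follows: random patch-aligned blocks of the target image are zeroed out (patch size and mask ratio below are illustrative; assumes H and W are divisible by the patch size).

```python
import torch

def mask_patches(img, patch=32, ratio=0.5):
    """Zero out a random subset of patch-aligned blocks of the target image."""
    B, _, H, W = img.shape
    keep = (torch.rand(B, 1, H // patch, W // patch, device=img.device) > ratio)
    mask = keep.float().repeat_interleave(patch, 2).repeat_interleave(patch, 3)
    return img * mask
```

The consistency loss then enforces agreement between the student's prediction on the masked image and an EMA teacher's pseudo-label on the unmasked image, which forces the model to exploit spatial context.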
arXiv Detail & Related papers (2022-12-02T17:29:32Z) - Improving Pixel-Level Contrastive Learning by Leveraging Exogenous Depth Information [7.561849435043042]
Self-supervised representation learning based on Contrastive Learning (CL) has been the subject of much attention in recent years.
This paper focuses on depth information, which can be estimated by a depth network or measured from available sensor data.
We show that using this depth information in the contrastive loss improves results and that the learned representations better follow the shapes of objects.
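One hypothetical way to inject depth into a pixel-level contrastive loss is to treat depth-similar pixels as soft positives; the Gaussian weighting, temperature, and loss form below are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def depth_guided_contrastive(feat, depth, tau=0.1, sigma=0.5):
    """feat: (N, D) L2-normalized pixel embeddings; depth: (N,) per-pixel depth.
    Pixels at similar depth act as soft positives for each anchor."""
    n = feat.size(0)
    logits = feat @ feat.t() / tau
    weights = torch.exp(-(depth[:, None] - depth[None, :]) ** 2 / (2 * sigma ** 2))
    eye = torch.eye(n, dtype=torch.bool, device=feat.device)
    logits = logits.masked_fill(eye, -1e9)   # exclude self-pairs
    weights = weights.masked_fill(eye, 0.0)
    targets = weights / weights.sum(1, keepdim=True).clamp_min(1e-8)
    return F.cross_entropy(logits, targets)  # soft-target cross-entropy
```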
arXiv Detail & Related papers (2022-11-18T11:45:39Z) - HRDA: Context-Aware High-Resolution Domain-Adaptive Semantic Segmentation [104.47737619026246]
Unsupervised domain adaptation (UDA) aims to adapt a model trained on the source domain to the target domain.
We propose HRDA, a multi-resolution training approach for UDA that combines small high-resolution crops, to preserve fine segmentation details, with large low-resolution crops, to capture long-range context.
It significantly improves the state-of-the-art performance by 5.5 mIoU for GTA-to-Cityscapes and 4.9 mIoU for Synthia-to-Cityscapes.
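A simplified sketch of the multi-resolution idea, with a fixed 50/50 blend standing in for HRDA's learned scale attention (the same-resolution-logits assumption and the fixed crop are simplifications):

```python
import torch.nn.functional as F

def multi_res_forward(model, img, crop_box, scale=0.5):
    """Fuse a low-resolution full view (context) with a high-resolution crop (detail).
    Assumes `model` returns logits at input resolution."""
    # context branch: the whole image, downscaled
    lr = F.interpolate(img, scale_factor=scale, mode="bilinear", align_corners=False)
    ctx = F.interpolate(model(lr), size=img.shape[-2:], mode="bilinear", align_corners=False)
    # detail branch: a full-resolution crop, blended back into the context logits
    y0, x0, y1, x1 = crop_box
    det = model(img[..., y0:y1, x0:x1])
    out = ctx.clone()
    out[..., y0:y1, x0:x1] = 0.5 * ctx[..., y0:y1, x0:x1] + 0.5 * det
    return out
```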
arXiv Detail & Related papers (2022-04-27T18:00:26Z) - Danish Airs and Grounds: A Dataset for Aerial-to-Street-Level Place Recognition and Localization [9.834635805575584]
We contribute the Danish Airs and Grounds dataset, a large collection of street-level and aerial images targeting such cases.
The dataset is larger and more diverse than current publicly available data, including more than 50 km of road in urban, suburban and rural areas.
We propose a map-to-image re-localization pipeline that first estimates a dense 3D reconstruction from the aerial images and then matches query street-level images to street-level renderings of the 3D model.
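The matching step of such a pipeline reduces to nearest-neighbour retrieval between query descriptors and descriptors of the renderings; the global-descriptor formulation below is an assumption about the pipeline's first stage.

```python
import torch

def retrieve_renderings(query_desc, rendering_descs, k=5):
    """query_desc: (D,), rendering_descs: (N, D); both L2-normalized.
    Returns indices of the k most similar street-level renderings."""
    sims = rendering_descs @ query_desc
    return sims.topk(k).indices
```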
arXiv Detail & Related papers (2022-02-03T19:58:09Z) - SF-UDA$^{3D}$: Source-Free Unsupervised Domain Adaptation for LiDAR-Based 3D Object Detection [66.63707940938012]
3D object detectors based only on LiDAR point clouds hold the state-of-the-art on modern street-view benchmarks.
This paper proposes SF-UDA$^{3D}$ to domain-adapt the state-of-the-art PointRCNN 3D detector to target domains for which we have no annotations.
arXiv Detail & Related papers (2020-10-16T08:44:49Z) - BoMuDANet: Unsupervised Adaptation for Visual Scene Understanding in Unstructured Driving Environments [54.22535063244038]
We present an unsupervised adaptation approach for visual scene understanding in unstructured traffic environments.
Our method is designed for unstructured real-world scenarios with dense and heterogeneous traffic consisting of cars, trucks, two- and three-wheelers, and pedestrians.
arXiv Detail & Related papers (2020-09-22T08:25:44Z) - Keep it Simple: Image Statistics Matching for Domain Adaptation [0.0]
Domain Adaptation (DA) is a technique to maintain detection accuracy when only unlabeled images of the target domain are available.
Recent state-of-the-art methods try to reduce the domain gap using an adversarial training strategy.
We propose to align either color histograms or mean and covariance of the source images towards the target domain.
In comparison to recent methods, we achieve state-of-the-art performance using a much simpler training procedure.
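Mean/covariance alignment of this kind can be sketched with a whitening-coloring transform in RGB space; the eigendecomposition route below is one standard way to take the matrix square roots, and the target statistics are assumed precomputed.

```python
import torch

def match_mean_cov(src, tgt_mean, tgt_cov, eps=1e-5):
    """Align per-channel mean and covariance of source images to target statistics.
    src: (B, C, H, W) in [0, 1]; tgt_mean: (C,); tgt_cov: (C, C)."""
    B, C, H, W = src.shape
    x = src.reshape(B, C, -1)
    mu = x.mean(-1, keepdim=True)
    xc = x - mu
    cov = xc @ xc.transpose(1, 2) / (H * W - 1) + eps * torch.eye(C)

    def sqrtm(m):  # symmetric PSD matrix square root via eigendecomposition
        vals, vecs = torch.linalg.eigh(m)
        return vecs @ torch.diag_embed(vals.clamp_min(0).sqrt()) @ vecs.transpose(-1, -2)

    whiten = torch.linalg.inv(sqrtm(cov))  # remove source color statistics
    color = sqrtm(tgt_cov)                 # impose target color statistics
    out = color @ (whiten @ xc) + tgt_mean.view(1, C, 1)
    return out.reshape(B, C, H, W).clamp(0, 1)
```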
arXiv Detail & Related papers (2020-05-26T07:32:09Z)