Transferring and Regularizing Prediction for Semantic Segmentation
- URL: http://arxiv.org/abs/2006.06570v1
- Date: Thu, 11 Jun 2020 16:19:41 GMT
- Title: Transferring and Regularizing Prediction for Semantic Segmentation
- Authors: Yiheng Zhang and Zhaofan Qiu and Ting Yao and Chong-Wah Ngo and Dong
Liu and Tao Mei
- Abstract summary: In this paper, we exploit the intrinsic properties of semantic segmentation to alleviate this problem for model transfer.
We present a Regularizer of Prediction Transfer (RPT) that imposes the intrinsic properties as constraints to regularize model transfer in an unsupervised fashion.
Extensive experiments are conducted to verify the proposal of RPT on the transfer of models trained on GTA5 and SYNTHIA (synthetic data) to the Cityscapes dataset (urban street scenes).
- Score: 115.88957139226966
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic segmentation often requires a large set of images with pixel-level
annotations. In the view of extremely expensive expert labeling, recent
research has shown that the models trained on photo-realistic synthetic data
(e.g., computer games) with computer-generated annotations can be adapted to
real images. Despite this progress, without constraining the predictions on real
images, the models easily overfit to the synthetic data due to severe domain
mismatch. In this paper, we exploit the intrinsic properties of semantic
segmentation to alleviate this problem for model transfer.
Specifically, we present a Regularizer of Prediction Transfer (RPT) that
imposes the intrinsic properties as constraints to regularize model transfer in
an unsupervised fashion. These constraints include patch-level, cluster-level
and context-level semantic prediction consistencies at different levels of
image formation. As the transfer is label-free and data-driven, the robustness
of prediction is addressed by selectively involving a subset of image regions
for model regularization. Extensive experiments are conducted to verify the
proposal of RPT on the transfer of models trained on GTA5 and SYNTHIA
(synthetic data) to the Cityscapes dataset (urban street scenes). RPT shows
consistent improvements when injecting the constraints into several neural
networks for semantic segmentation. More remarkably, when integrating RPT into
the adversarial-based segmentation framework, we report the best results to
date: mIoU of 53.2%/51.7% when transferring from GTA5/SYNTHIA to Cityscapes,
respectively.
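To make the idea of prediction-consistency regularization concrete, the following is a minimal illustrative sketch of a patch-level consistency term: it penalizes pixels whose predicted class distribution diverges from the average prediction of the patch they belong to. This is an assumption-laden simplification for intuition only, not the paper's actual RPT formulation; the function name `patch_consistency_loss` and the KL-based penalty are hypothetical choices.

```python
import numpy as np

def patch_consistency_loss(probs, patch_size=2):
    """Illustrative patch-level consistency regularizer (not the paper's exact loss).

    probs: (H, W, C) array of per-pixel softmax class probabilities.
    Returns the mean KL divergence of each pixel's distribution from the
    mean distribution of its non-overlapping patch.
    """
    h, w, _ = probs.shape
    eps = 1e-8  # numerical stability for the logarithms
    total, count = 0.0, 0
    for i in range(0, h - patch_size + 1, patch_size):
        for j in range(0, w - patch_size + 1, patch_size):
            patch = probs[i:i + patch_size, j:j + patch_size]  # (p, p, C)
            mean_pred = patch.mean(axis=(0, 1))                # (C,) patch average
            # KL(pixel || patch mean), summed over pixels in the patch
            total += np.sum(patch * (np.log(patch + eps) - np.log(mean_pred + eps)))
            count += patch_size * patch_size
    return total / count
```

When every pixel in a patch predicts the same distribution, the loss is zero; disagreement within a patch increases it, which is the behavior an unsupervised consistency constraint is meant to encourage during transfer.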
Related papers
- Physically Feasible Semantic Segmentation [58.17907376475596]
State-of-the-art semantic segmentation models are typically optimized in a data-driven fashion.
Our method, Physically Feasible Semantic (PhyFea), extracts explicit physical constraints that govern spatial class relations.
PhyFea yields significant performance improvements in mIoU over each state-of-the-art network we use.
arXiv Detail & Related papers (2024-08-26T22:39:08Z) - Is Synthetic Image Useful for Transfer Learning? An Investigation into Data Generation, Volume, and Utilization [62.157627519792946]
We introduce a novel framework called bridged transfer, which initially employs synthetic images for fine-tuning a pre-trained model to improve its transferability.
We propose a dataset style inversion strategy to improve the stylistic alignment between synthetic and real images.
Our proposed methods are evaluated across 10 different datasets and 5 distinct models, demonstrating consistent improvements.
arXiv Detail & Related papers (2024-03-28T22:25:05Z) - Counterfactual contrastive learning: robust representations via causal image synthesis [17.273155534515393]
CF-SimCLR is a counterfactual contrastive learning approach which leverages approximate counterfactual inference for positive pair creation.
We show that CF-SimCLR substantially improves robustness to acquisition shift, with higher downstream performance on both in- and out-of-distribution data.
arXiv Detail & Related papers (2024-03-14T17:47:01Z) - Scaling Laws of Synthetic Images for Model Training ... for Now [54.43596959598466]
We study the scaling laws of synthetic images generated by state-of-the-art text-to-image models.
We observe that synthetic images demonstrate a scaling trend similar to, but slightly less effective than, real images in CLIP training.
arXiv Detail & Related papers (2023-12-07T18:59:59Z) - Self-Supervised and Semi-Supervised Polyp Segmentation using Synthetic
Data [16.356954231068077]
Early detection of colorectal polyps is of utmost importance for their treatment and for colorectal cancer prevention.
Computer vision techniques have the potential to aid professionals in the diagnosis stage, where colonoscopies are manually carried out to examine the entirety of the patient's colon.
The main challenge in medical imaging is the lack of data, and a further challenge specific to polyp segmentation approaches is the difficulty of manually labeling the available data.
We propose an end-to-end model for polyp segmentation that integrates real and synthetic data to artificially increase the size of the datasets and aid the training when unlabeled samples are available.
arXiv Detail & Related papers (2023-07-22T09:57:58Z) - Towards Automated Polyp Segmentation Using Weakly- and Semi-Supervised
Learning and Deformable Transformers [8.01814397869811]
Polyp segmentation is a crucial step towards computer-aided diagnosis of colorectal cancer.
Most of the polyp segmentation methods require pixel-wise annotated datasets.
We propose a novel framework that can be trained using only weakly annotated images along with exploiting unlabeled images.
arXiv Detail & Related papers (2022-11-21T20:44:12Z) - High-resolution semantically-consistent image-to-image translation [0.0]
This paper proposes an unsupervised domain adaptation model that preserves semantic consistency and per-pixel quality for the images during the style-transferring phase.
The proposed model shows substantial performance gain compared to the SemI2I model and reaches similar results as the state-of-the-art CyCADA model.
arXiv Detail & Related papers (2022-09-13T19:08:30Z) - Semi-weakly Supervised Contrastive Representation Learning for Retinal
Fundus Images [0.2538209532048867]
We propose a semi-weakly supervised contrastive learning framework for representation learning using semi-weakly annotated images.
We empirically validate the transfer learning performance of SWCL on seven public retinal fundus datasets.
arXiv Detail & Related papers (2021-08-04T15:50:09Z) - Group-Wise Semantic Mining for Weakly Supervised Semantic Segmentation [49.90178055521207]
This work addresses weakly supervised semantic segmentation (WSSS), with the goal of bridging the gap between image-level annotations and pixel-level segmentation.
We formulate WSSS as a novel group-wise learning task that explicitly models semantic dependencies in a group of images to estimate more reliable pseudo ground-truths.
In particular, we devise a graph neural network (GNN) for group-wise semantic mining, wherein input images are represented as graph nodes.
arXiv Detail & Related papers (2020-12-09T12:40:13Z) - Adversarial Semantic Data Augmentation for Human Pose Estimation [96.75411357541438]
We propose Semantic Data Augmentation (SDA), a method that augments images by pasting segmented body parts with various semantic granularity.
We also propose Adversarial Semantic Data Augmentation (ASDA), which exploits a generative network to dynamically predict a tailored pasting configuration.
State-of-the-art results are achieved on challenging benchmarks.
arXiv Detail & Related papers (2020-08-03T07:56:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.