Physically Feasible Semantic Segmentation
- URL: http://arxiv.org/abs/2408.14672v3
- Date: Sun, 19 Jan 2025 19:03:04 GMT
- Title: Physically Feasible Semantic Segmentation
- Authors: Shamik Basu, Luc Van Gool, Christos Sakaridis
- Abstract summary: State-of-the-art semantic segmentation models are typically optimized in a data-driven fashion, minimizing solely per-pixel or per-segment classification objectives on their training data.
This purely data-driven paradigm often leads to absurd segmentations, especially when the domain of input images is shifted from the one encountered during training.
Our method, Physically Feasible Semantic Segmentation (PhyFea), first extracts explicit constraints that govern spatial class relations from the semantic segmentation training set at hand in an offline, data-driven fashion, and then enforces a morphological yet differentiable loss that penalizes violations of these constraints during training to promote prediction feasibility.
- Score: 58.17907376475596
- Abstract: State-of-the-art semantic segmentation models are typically optimized in a data-driven fashion, minimizing solely per-pixel or per-segment classification objectives on their training data. This purely data-driven paradigm often leads to absurd segmentations, especially when the domain of input images is shifted from the one encountered during training. For instance, state-of-the-art models may assign the label "road" to a segment that is located above a segment that is respectively labeled as "sky", although our knowledge of the physical world dictates that such a configuration is not feasible for images captured by forward-facing upright cameras. Our method, Physically Feasible Semantic Segmentation (PhyFea), first extracts explicit constraints that govern spatial class relations from the semantic segmentation training set at hand in an offline, data-driven fashion, and then enforces a morphological yet differentiable loss that penalizes violations of these constraints during training to promote prediction feasibility. PhyFea is a plug-and-play method and yields consistent and significant performance improvements over diverse state-of-the-art networks on which we implement it across the ADE20K, Cityscapes, and ACDC datasets. Code and models will be made publicly available.
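As a rough illustration of the offline constraint-extraction idea described in the abstract, the Python sketch below records which ordered class pairs ever occur as vertically adjacent pixels in a set of label maps, then counts the pixel pairs in a prediction whose ordering was never observed. It is a simplification under stated assumptions, not the PhyFea method: the function names (observed_above_pairs, count_violations), the vertical-adjacency criterion, and the toy class ids are all hypothetical, and the hard count shown here is not differentiable, unlike the morphological loss the abstract describes.

```python
import numpy as np


def observed_above_pairs(label_maps, num_classes):
    """Hypothetical helper: seen[a, b] is True if class a was ever observed
    directly above class b (vertically adjacent pixels) in any label map."""
    seen = np.zeros((num_classes, num_classes), dtype=bool)
    for lab in label_maps:
        upper = lab[:-1, :].ravel()  # every pixel except the last row
        lower = lab[1:, :].ravel()   # the pixel directly below each of them
        seen[upper, lower] = True
    return seen


def count_violations(pred, seen):
    """Hypothetical helper: count vertically adjacent pixel pairs in a
    predicted label map whose (upper, lower) class ordering was never
    observed in the training label maps."""
    upper = pred[:-1, :]
    lower = pred[1:, :]
    return int((~seen[upper, lower]).sum())


# Toy example (assumed class ids): 0 = sky, 1 = road.
train = [np.array([[0, 0],
                   [0, 0],
                   [1, 1],
                   [1, 1]])]
seen = observed_above_pairs(train, num_classes=2)

# A prediction that places road above sky in its first two rows.
pred = np.array([[1, 1],
                 [0, 0],
                 [1, 1]])
print(count_violations(pred, seen))  # 2
```

In this toy setup, "road above sky" never occurs in the training label map, so both pixel pairs in the first two rows of the prediction are counted as violations. A training-time penalty would need a differentiable surrogate computed on the network's soft outputs, which this sketch does not attempt.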
Related papers
- Placing Objects in Context via Inpainting for Out-of-distribution Segmentation [59.00092709848619]
Placing Objects in Context (POC) is a pipeline to realistically add objects to an image.
POC can be used to extend any dataset with an arbitrary number of objects.
We present different anomaly segmentation datasets based on POC-generated data and show that POC can improve the performance of recent state-of-the-art anomaly fine-tuning methods.
arXiv Detail & Related papers (2024-02-26T08:32:41Z)
- Learning from SAM: Harnessing a Foundation Model for Sim2Real Adaptation by Regularization [17.531847357428454]
Domain adaptation is especially important for robotics applications, where target domain training data is usually scarce and annotations are costly to obtain.
We present a method for self-supervised domain adaptation for the scenario where annotated source domain data is available.
Our method targets the semantic segmentation task and leverages a segmentation foundation model (Segment Anything Model) to obtain segment information on unannotated data.
arXiv Detail & Related papers (2023-09-27T10:37:36Z)
- Stochastic Segmentation with Conditional Categorical Diffusion Models [3.8168879948759953]
We propose a conditional categorical diffusion model (CCDM) for semantic segmentation based on Denoising Diffusion Probabilistic Models.
Our results show that CCDM achieves state-of-the-art performance on LIDC, and outperforms established baselines on the classical segmentation dataset Cityscapes.
arXiv Detail & Related papers (2023-03-15T19:16:47Z)
- Unsupervised Continual Semantic Adaptation through Neural Rendering [32.099350613956716]
We study continual multi-scene adaptation for the task of semantic segmentation.
We propose training a Semantic-NeRF network for each scene by fusing the predictions of a segmentation model.
We evaluate our approach on ScanNet, where we outperform both a voxel-based baseline and a state-of-the-art unsupervised domain adaptation method.
arXiv Detail & Related papers (2022-11-25T09:31:41Z)
- Multi-dataset Pretraining: A Unified Model for Semantic Segmentation [97.61605021985062]
We propose a unified framework, termed Multi-Dataset Pretraining, to take full advantage of the fragmented annotations of different datasets.
This is achieved by first pretraining the network via the proposed pixel-to-prototype contrastive loss over multiple datasets.
In order to better model the relationship among images and classes from different datasets, we extend the pixel level embeddings via cross dataset mixing.
arXiv Detail & Related papers (2021-06-08T06:13:11Z)
- Points2Polygons: Context-Based Segmentation from Weak Labels Using Adversarial Networks [0.0]
In applied image segmentation tasks, the ability to provide numerous and precise labels for training is paramount to the accuracy of the model at inference time.
This overhead is often neglected, and recently proposed segmentation architectures rely heavily on the availability and fidelity of ground truth labels to achieve state-of-the-art accuracies.
We introduce Points2Polygons (P2P), a model that uses contextual metric learning techniques to directly address this problem.
arXiv Detail & Related papers (2021-06-05T05:17:45Z)
- Towards Adaptive Semantic Segmentation by Progressive Feature Refinement [16.40758125170239]
We propose an innovative progressive feature refinement framework, along with domain adversarial learning to boost the transferability of segmentation networks.
As a result, the segmentation models trained with source domain images can be transferred to a target domain without significant performance degradation.
arXiv Detail & Related papers (2020-09-30T04:17:48Z)
- Transferring and Regularizing Prediction for Semantic Segmentation [115.88957139226966]
In this paper, we exploit the intrinsic properties of semantic segmentation to alleviate this problem for model transfer.
We present a Regularizer of Prediction Transfer (RPT) that imposes the intrinsic properties as constraints to regularize model transfer in an unsupervised fashion.
Extensive experiments are conducted to verify the proposal of RPT on the transfer of models trained on GTA5 and SYNTHIA (synthetic data) to the Cityscapes dataset (urban street scenes).
arXiv Detail & Related papers (2020-06-11T16:19:41Z)
- Improving Semantic Segmentation via Self-Training [75.07114899941095]
We show that we can obtain state-of-the-art results using a semi-supervised approach, specifically a self-training paradigm.
We first train a teacher model on labeled data, and then generate pseudo labels on a large set of unlabeled data.
Our robust training framework can digest human-annotated and pseudo labels jointly and achieve top performances on Cityscapes, CamVid and KITTI datasets.
arXiv Detail & Related papers (2020-04-30T17:09:17Z)
- Phase Consistent Ecological Domain Adaptation [76.75730500201536]
We focus on the task of semantic segmentation, where annotated synthetic data are aplenty, but annotating real data is laborious.
The first criterion, inspired by visual psychophysics, is that the map between the two image domains be phase-preserving.
The second criterion aims to leverage ecological statistics, or regularities in the scene which are manifest in any image of it, regardless of the characteristics of the illuminant or the imaging sensor.
arXiv Detail & Related papers (2020-04-10T06:58:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.