Placing Objects in Context via Inpainting for Out-of-distribution Segmentation
- URL: http://arxiv.org/abs/2402.16392v2
- Date: Fri, 12 Jul 2024 18:19:19 GMT
- Title: Placing Objects in Context via Inpainting for Out-of-distribution Segmentation
- Authors: Pau de Jorge, Riccardo Volpi, Puneet K. Dokania, Philip H. S. Torr, Gregory Rogez,
- Abstract summary: Placing Objects in Context (POC) is a pipeline to realistically add objects to an image.
POC can be used to extend any dataset with an arbitrary number of objects.
We present different anomaly segmentation datasets based on POC-generated data and show that POC can improve the performance of recent state-of-the-art anomaly fine-tuning methods.
- Score: 59.00092709848619
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When deploying a semantic segmentation model into the real world, it will inevitably encounter semantic classes that were not seen during training. To ensure a safe deployment of such systems, it is crucial to accurately evaluate and improve their anomaly segmentation capabilities. However, acquiring and labelling semantic segmentation data is expensive and unanticipated conditions are long-tail and potentially hazardous. Indeed, existing anomaly segmentation datasets capture a limited number of anomalies, lack realism or have strong domain shifts. In this paper, we propose the Placing Objects in Context (POC) pipeline to realistically add any object into any image via diffusion models. POC can be used to easily extend any dataset with an arbitrary number of objects. In our experiments, we present different anomaly segmentation datasets based on POC-generated data and show that POC can improve the performance of recent state-of-the-art anomaly fine-tuning methods across several standardized benchmarks. POC is also effective for learning new classes. For example, we utilize it to augment Cityscapes samples by incorporating a subset of Pascal classes and demonstrate that models trained on such data achieve comparable performance to the Pascal-trained baseline. This corroborates the low synth2real gap of models trained on POC-generated images. Code: https://github.com/naver/poc
Related papers
- Unsupervised Class Generation to Expand Semantic Segmentation Datasets [9.144948836224078]
We introduce novel samples into the training data without modifications to the underlying algorithms.
We show how models can not only effectively learn how to segment novel classes, with an average performance of 51% IoU, but also reduce errors for other, already existing classes.
arXiv Detail & Related papers (2025-01-04T11:53:13Z) - Physically Feasible Semantic Segmentation [58.17907376475596]
State-of-the-art semantic segmentation models are typically optimized in a data-driven fashion, minimizing solely per-pixel or per-segment classification objectives on their training data.
This purely data-driven paradigm often leads to absurd segmentations, especially when the domain of input images is shifted from the one encountered during training.
Our method, Physically Feasible Semantic (PhyFea), first extracts explicit constraints that govern spatial class relations from the semantic segmentation training set at hand in an offline data-driven fashion, and then enforces a morphological yet differentiable loss that penalizes violations of these constraints during
arXiv Detail & Related papers (2024-08-26T22:39:08Z) - Rethinking Few-shot 3D Point Cloud Semantic Segmentation [62.80639841429669]
This paper revisits few-shot 3D point cloud semantic segmentation (FS-PCS)
We focus on two significant issues in the state-of-the-art: foreground leakage and sparse point distribution.
To address these issues, we introduce a standardized FS-PCS setting, upon which a new benchmark is built.
arXiv Detail & Related papers (2024-03-01T15:14:47Z) - Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based
Action Recognition [88.34182299496074]
Action labels are only available on a source dataset, but unavailable on a target dataset in the training stage.
We utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets.
By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks.
arXiv Detail & Related papers (2022-07-17T07:05:39Z) - Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets.
This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.
We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z) - Context-self contrastive pretraining for crop type semantic segmentation [39.81074867563505]
The proposed Context-Self Contrastive Loss (CSCL) learns an embedding space that makes semantic boundaries pop-up.
For crop type semantic segmentation from Satellite Image Time Series (SITS) we find performance at parcel boundaries to be a critical bottleneck.
We present a process for semantic segmentation at super-resolution for obtaining crop classes at a more granular level.
arXiv Detail & Related papers (2021-04-09T11:29:44Z) - Scaling Semantic Segmentation Beyond 1K Classes on a Single GPU [87.48110331544885]
We propose a novel training methodology to train and scale the existing semantic segmentation models.
We demonstrate a clear benefit of our approach on a dataset with 1284 classes, bootstrapped from LVIS and COCO annotations, with three times better mIoU than the DeeplabV3+ model.
arXiv Detail & Related papers (2020-12-14T13:12:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.