Placing Objects in Context via Inpainting for Out-of-distribution Segmentation
- URL: http://arxiv.org/abs/2402.16392v2
- Date: Fri, 12 Jul 2024 18:19:19 GMT
- Title: Placing Objects in Context via Inpainting for Out-of-distribution Segmentation
- Authors: Pau de Jorge, Riccardo Volpi, Puneet K. Dokania, Philip H. S. Torr, Gregory Rogez
- Abstract summary: Placing Objects in Context (POC) is a pipeline to realistically add objects to an image.
POC can be used to extend any dataset with an arbitrary number of objects.
We present different anomaly segmentation datasets based on POC-generated data and show that POC can improve the performance of recent state-of-the-art anomaly fine-tuning methods.
- Score: 59.00092709848619
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When deploying a semantic segmentation model into the real world, it will inevitably encounter semantic classes that were not seen during training. To ensure a safe deployment of such systems, it is crucial to accurately evaluate and improve their anomaly segmentation capabilities. However, acquiring and labelling semantic segmentation data is expensive and unanticipated conditions are long-tail and potentially hazardous. Indeed, existing anomaly segmentation datasets capture a limited number of anomalies, lack realism or have strong domain shifts. In this paper, we propose the Placing Objects in Context (POC) pipeline to realistically add any object into any image via diffusion models. POC can be used to easily extend any dataset with an arbitrary number of objects. In our experiments, we present different anomaly segmentation datasets based on POC-generated data and show that POC can improve the performance of recent state-of-the-art anomaly fine-tuning methods across several standardized benchmarks. POC is also effective for learning new classes. For example, we utilize it to augment Cityscapes samples by incorporating a subset of Pascal classes and demonstrate that models trained on such data achieve comparable performance to the Pascal-trained baseline. This corroborates the low synth2real gap of models trained on POC-generated images. Code: https://github.com/naver/poc
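The abstract describes the pipeline only at a high level: a diffusion inpainting model places an object, described by a text prompt, into a masked region of an existing scene, and the mask doubles as an approximate label for the new object. The snippet below is a minimal illustrative sketch of that idea, not the authors' POC implementation (their code is in the linked repository); it uses the Hugging Face diffusers inpainting API, with an assumed example checkpoint, hypothetical file names, and a hand-drawn rectangular mask.
```python
# Minimal sketch of diffusion-based object placement via inpainting.
# Not the official POC code; see https://github.com/naver/poc for the authors' pipeline.
import torch
from PIL import Image, ImageDraw
from diffusers import StableDiffusionInpaintPipeline

# Assumed example checkpoint; any Stable Diffusion inpainting model would work here.
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

# Hypothetical scene image to augment, e.g. a driving frame resized to the model resolution.
scene = Image.open("street_scene.png").convert("RGB").resize((512, 512))

# Binary mask marking where the new object should appear (white = region to inpaint).
mask = Image.new("L", scene.size, 0)
ImageDraw.Draw(mask).rectangle([200, 300, 330, 470], fill=255)

# Ask the model to place the described object inside the masked region.
augmented = pipe(
    prompt="a brown bear standing on the road, photorealistic",
    image=scene,
    mask_image=mask,
).images[0]
augmented.save("street_scene_with_bear.png")

# The mask itself can then serve as an approximate segmentation label for the added object,
# which is how inpainted images can extend a dataset with new or anomalous classes.
```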
Related papers
- Physically Feasible Semantic Segmentation [58.17907376475596]
State-of-the-art semantic segmentation models are typically optimized in a data-driven fashion.
Our method, Physically Feasible Semantic Segmentation (PhyFea), extracts explicit physical constraints that govern spatial class relations.
PhyFea yields significant performance improvements in mIoU over each state-of-the-art network we use.
arXiv Detail & Related papers (2024-08-26T22:39:08Z) - Rethinking Few-shot 3D Point Cloud Semantic Segmentation [62.80639841429669]
This paper revisits few-shot 3D point cloud semantic segmentation (FS-PCS)
We focus on two significant issues in the state-of-the-art: foreground leakage and sparse point distribution.
To address these issues, we introduce a standardized FS-PCS setting, upon which a new benchmark is built.
arXiv Detail & Related papers (2024-03-01T15:14:47Z) - Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based Action Recognition [88.34182299496074]
Action labels are available only for the source dataset and are unavailable for the target dataset during training.
We utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets.
By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks.
arXiv Detail & Related papers (2022-07-17T07:05:39Z) - Detecting Anomalies in Semantic Segmentation with Prototypes [23.999211737485812]
We propose to address anomaly segmentation through prototype learning.
Our approach achieves the new state of the art, with a significant margin over previous works.
arXiv Detail & Related papers (2021-06-01T13:22:33Z) - Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets.
This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.
We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z) - Context-self contrastive pretraining for crop type semantic segmentation [39.81074867563505]
For crop type semantic segmentation from Satellite Image Time Series (SITS), we find performance at parcel boundaries to be a critical bottleneck.
The proposed Context-Self Contrastive Loss (CSCL) learns an embedding space that makes semantic boundaries pop up.
We present a process for semantic segmentation at super-resolution for obtaining crop classes at a more granular level.
arXiv Detail & Related papers (2021-04-09T11:29:44Z) - Scaling Semantic Segmentation Beyond 1K Classes on a Single GPU [87.48110331544885]
We propose a novel training methodology for training and scaling existing semantic segmentation models.
We demonstrate a clear benefit of our approach on a dataset with 1284 classes, bootstrapped from LVIS and COCO annotations, with three times better mIoU than the DeeplabV3+ model.
arXiv Detail & Related papers (2020-12-14T13:12:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.