Open-World Panoptic Segmentation
- URL: http://arxiv.org/abs/2412.12740v1
- Date: Tue, 17 Dec 2024 10:03:39 GMT
- Title: Open-World Panoptic Segmentation
- Authors: Matteo Sodano, Federico Magistri, Jens Behley, Cyrill Stachniss
- Abstract summary: We propose Con2MAV, an approach for open-world panoptic segmentation. We show that our model achieves state-of-the-art results on open-world segmentation tasks. We also propose PANIC, a benchmark for evaluating open-world panoptic segmentation in autonomous driving scenarios.
- Score: 31.799000996671975
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Perception is a key building block of autonomously acting vision systems such as autonomous vehicles. It is crucial that these systems are able to understand their surroundings in order to operate safely and robustly. Additionally, autonomous systems deployed in unconstrained real-world scenarios must be capable of dealing with novel situations and objects that have never been seen before. In this article, we tackle the problem of open-world panoptic segmentation, i.e., the task of discovering new semantic categories and new object instances at test time, while enforcing consistency among the categories that we incrementally discover. We propose Con2MAV, an approach for open-world panoptic segmentation that extends our previous work, ContMAV, which was developed for open-world semantic segmentation. Through extensive experiments across multiple datasets, we show that our model achieves state-of-the-art results on open-world segmentation tasks, while still performing competitively on the known categories. We will open-source our implementation upon acceptance. Additionally, we propose PANIC (Panoptic ANomalies In Context), a benchmark for evaluating open-world panoptic segmentation in autonomous driving scenarios. This dataset, recorded with a multi-modal sensor suite mounted on a car, provides high-quality, pixel-wise annotations of anomalous objects at both the semantic and the instance level. Our dataset contains 800 images, with more than 50 unknown classes, i.e., classes that do not appear in the training set, and 4000 object instances, making it an extremely challenging dataset for open-world segmentation tasks in the autonomous driving scenario. We provide competitions for multiple open-world tasks on a hidden test set. Our dataset and competitions are available at https://www.ipb.uni-bonn.de/data/panic.
 
      
Related papers
        - RemoteSAM: Towards Segment Anything for Earth Observation [29.707796048411705]
 We aim to develop a robust yet flexible visual foundation model for Earth observation. It should possess strong capabilities in recognizing and localizing diverse visual targets. We present RemoteSAM, a foundation model that establishes new SoTA on several Earth observation perception benchmarks.
 arXiv  Detail & Related papers  (2025-05-23T15:27:57Z)
- Prior2Former -- Evidential Modeling of Mask Transformers for Assumption-Free Open-World Panoptic Segmentation [74.55677741919035]
 We propose Prior2Former (P2F), the first approach for segmentation vision transformers rooted in evidential learning. P2F extends the mask vision transformer architecture by incorporating a Beta prior for computing model uncertainty in pixel-wise binary mask assignments. Unlike most segmentation models addressing unknown classes, P2F operates without access to OOD data samples or contrastive training on void (i.e., unlabeled) classes.
 arXiv  Detail & Related papers  (2025-04-07T08:53:14Z)
- Test-Time Optimization for Domain Adaptive Open Vocabulary Segmentation [15.941958367737408]
 Seg-TTO is a framework for zero-shot, open-vocabulary semantic segmentation.
We focus on segmentation-specific test-time optimization to address this gap.
Seg-TTO demonstrates clear performance improvements (up to 27% mIoU increase on some datasets) establishing new state-of-the-art.
 arXiv  Detail & Related papers  (2025-01-08T18:58:24Z)
- SOHES: Self-supervised Open-world Hierarchical Entity Segmentation [82.45303116125021]
 This work presents Self-supervised Open-world Hierarchical Entities (SOHES), a novel approach that eliminates the need for human annotations.
We produce abundant high-quality pseudo-labels through visual feature clustering, and rectify the noise in the pseudo-labels via a teacher-student mutual-learning procedure.
Using raw images as the sole training data, our method achieves unprecedented performance in self-supervised open-world segmentation.
 arXiv  Detail & Related papers  (2024-04-18T17:59:46Z)
- Open-World Semantic Segmentation Including Class Similarity [31.799000996671975]
 This paper tackles open-world semantic segmentation, i.e., the variant of interpreting image data in which objects occur that have not been seen during training.
We propose a novel approach that performs accurate closed-world semantic segmentation and can identify new categories without requiring any additional training data.
 arXiv  Detail & Related papers  (2024-03-12T11:11:19Z)
- Towards Universal Vision-language Omni-supervised Segmentation [72.31277932442988]
 We present Vision-Language Omni-Supervised Segmentation (VLOSS) to treat open-world segmentation tasks as proposal classification.
We incorporate omni-supervised data (i.e., panoptic segmentation data, object detection data, and image-text pair data) into training, thus enriching the open-world segmentation ability.
With fewer parameters, our VLOSS with Swin-Tiny surpasses MaskCLIP by 2% in terms of mask AP on the LVIS v1 dataset.
 arXiv  Detail & Related papers  (2023-03-12T02:57:53Z)
- Open-world Instance Segmentation: Top-down Learning with Bottom-up Supervision [83.57156368908836]
 We propose a novel approach for open-world instance segmentation called bottom-Up and top-Down Open-world Segmentation (UDOS).
UDOS first predicts parts of objects using a top-down network trained with weak supervision from bottom-up segmentations.
UDOS enjoys both the speed and efficiency of top-down architectures and the ability to generalize to unseen categories from bottom-up supervision.
 arXiv  Detail & Related papers  (2023-03-09T18:55:03Z)
- Open-World Instance Segmentation: Exploiting Pseudo Ground Truth From Learned Pairwise Affinity [59.1823948436411]
 We propose a novel approach for mask proposals, Generic Grouping Networks (GGNs).
Our approach combines a local measure of pixel affinity with instance-level mask supervision, producing a training regimen designed to make the model as generic as the data diversity allows.
 arXiv  Detail & Related papers  (2022-04-12T22:37:49Z)
- Exemplar-Based Open-Set Panoptic Segmentation Network [79.99748041746592]
 We extend panoptic segmentation to the open world and introduce an open-set panoptic segmentation (OPS) task.
We investigate the practical challenges of the task and construct a benchmark on top of an existing dataset, COCO.
We propose a novel exemplar-based open-set panoptic segmentation network (EOPSN) inspired by exemplar theory.
 arXiv  Detail & Related papers  (2021-05-18T07:59:21Z)
- SegmentMeIfYouCan: A Benchmark for Anomaly Segmentation [111.61261419566908]
 Deep neural networks (DNNs) are usually trained on a closed set of semantic classes.
They are ill-equipped to handle previously-unseen objects.
Detecting and localizing such objects is crucial for safety-critical applications such as perception for automated driving.
 arXiv  Detail & Related papers  (2021-04-30T07:58:19Z)
- Video Class Agnostic Segmentation Benchmark for Autonomous Driving [13.312978643938202]
 In certain safety-critical robotics applications, it is important to segment all objects, including those unknown at training time.
We formalize the task of video class agnostic segmentation from monocular video sequences in autonomous driving to account for unknown objects.
 arXiv  Detail & Related papers  (2021-03-19T20:41:40Z)
- SVIRO: Synthetic Vehicle Interior Rear Seat Occupancy Dataset and Benchmark [11.101588888002045]
 We release SVIRO, a synthetic dataset for sceneries in the passenger compartment of ten different vehicles.
We analyze machine learning-based approaches for their generalization capacities and reliability when trained on a limited number of variations.
 arXiv  Detail & Related papers  (2020-01-10T14:44:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
       
     