Scaling Semantic Segmentation Beyond 1K Classes on a Single GPU
- URL: http://arxiv.org/abs/2012.07489v2
- Date: Thu, 8 Apr 2021 16:38:34 GMT
- Title: Scaling Semantic Segmentation Beyond 1K Classes on a Single GPU
- Authors: Shipra Jain, Danda Pani Paudel, Martin Danelljan, Luc Van Gool
- Abstract summary: We propose a novel training methodology to train and scale the existing semantic segmentation models.
We demonstrate a clear benefit of our approach on a dataset with 1284 classes, bootstrapped from LVIS and COCO annotations, with three times better mIoU than the DeeplabV3+ model.
- Score: 87.48110331544885
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The state-of-the-art object detection and image classification methods can
perform impressively on more than 9k and 10k classes, respectively. In
contrast, the number of classes in semantic segmentation datasets is relatively
limited. This is not surprising when the restrictions caused by the lack of
labeled data and high computation demand for segmentation are considered. In
this paper, we propose a novel training methodology to train and scale the
existing semantic segmentation models for a large number of semantic classes
without increasing the memory overhead. In our embedding-based scalable
segmentation approach, we reduce the space complexity of the segmentation
model's output from O(C) to O(1), propose an approximation method for
ground-truth class probability, and use it to compute cross-entropy loss. The
proposed approach is general and can be adopted by any state-of-the-art
segmentation model to gracefully scale it for any number of semantic classes
with only one GPU. Our approach achieves similar, and in some cases, even
better mIoU for the Cityscapes, Pascal VOC, ADE20k, and COCO-Stuff10k datasets when
adopted to the DeeplabV3+ model with different backbones. We demonstrate a clear
benefit of our approach on a dataset with 1284 classes, bootstrapped from LVIS
and COCO annotations, with three times better mIoU than the DeeplabV3+ model.
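Since the abstract only sketches the mechanism, below is a minimal PyTorch sketch of what an embedding-based head with O(1) output and an approximated cross-entropy could look like. The embedding dimension, the sampled-negative approximation of the softmax normalizer, and all names are illustrative assumptions, not the authors' implementation.
```python
# Minimal sketch of an embedding-based segmentation head with O(1) output channels.
# Not the paper's exact formulation: the embedding dimension and the sampled-negative
# approximation of the cross-entropy normalizer are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingSegHead(nn.Module):
    def __init__(self, in_channels: int, embed_dim: int = 64, num_classes: int = 1284):
        super().__init__()
        # Output has embed_dim channels regardless of num_classes (O(1) in C),
        # instead of a num_classes-channel logit map (O(C)).
        self.project = nn.Conv2d(in_channels, embed_dim, kernel_size=1)
        # Small table of per-class embeddings; only a subset is scored per step.
        self.class_embed = nn.Embedding(num_classes, embed_dim)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.project(features)  # (B, embed_dim, H, W)

def sampled_cross_entropy(pixel_embed, labels, class_embed, num_negatives=256):
    """Approximate per-pixel cross-entropy over the classes present in the batch
    plus a random sample of negatives, instead of all C classes."""
    b, d, h, w = pixel_embed.shape
    x = pixel_embed.permute(0, 2, 3, 1).reshape(-1, d)      # (N, d) with N = B*H*W
    y = labels.reshape(-1)                                  # (N,) ground-truth class ids
    neg = torch.randint(0, class_embed.num_embeddings, (num_negatives,), device=x.device)
    cols = torch.cat([y.unique(), neg]).unique()            # classes scored this step
    logits = x @ class_embed(cols).t()                      # (N, |cols|)
    # Remap ground-truth ids to their positions inside the sampled column set.
    remap = torch.full((class_embed.num_embeddings,), -1, dtype=torch.long, device=x.device)
    remap[cols] = torch.arange(cols.numel(), device=x.device)
    return F.cross_entropy(logits, remap[y])

# Usage on dummy data:
head = EmbeddingSegHead(in_channels=256)
feats = torch.randn(2, 256, 64, 64)
labels = torch.randint(0, 1284, (2, 64, 64))
loss = sampled_cross_entropy(head(feats), labels, head.class_embed)
```
Because the head emits embed_dim channels rather than num_classes channels, the per-pixel output memory stays constant as the label set grows toward 1284 classes; only the small class-embedding table scales with C.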
Related papers
- Lightweight Uncertainty Quantification with Simplex Semantic Segmentation for Terrain Traversability [12.765558639563649]
We propose a simple, light-weight module that can be connected to any pretrained image segmentation model.
Our module is based on maximum separation of the segmentation classes by respective prototype vectors.
We demonstrate the effectiveness of our module for terrain segmentation.
arXiv Detail & Related papers (2024-07-18T11:00:49Z)
- LiteNeXt: A Novel Lightweight ConvMixer-based Model with Self-embedding Representation Parallel for Medical Image Segmentation [2.0901574458380403]
We propose a new lightweight but efficient model, namely LiteNeXt, for medical image segmentation.
LiteNeXt is trained from scratch with a small number of parameters (0.71M) and a low computational cost (0.42 GFLOPs).
arXiv Detail & Related papers (2024-04-04T01:59:19Z)
- Rethinking Few-shot 3D Point Cloud Semantic Segmentation [62.80639841429669]
This paper revisits few-shot 3D point cloud semantic segmentation (FS-PCS).
We focus on two significant issues in the state-of-the-art: foreground leakage and sparse point distribution.
To address these issues, we introduce a standardized FS-PCS setting, upon which a new benchmark is built.
arXiv Detail & Related papers (2024-03-01T15:14:47Z)
- Placing Objects in Context via Inpainting for Out-of-distribution Segmentation [59.00092709848619]
Placing Objects in Context (POC) is a pipeline to realistically add objects to an image.
POC can be used to extend any dataset with an arbitrary number of objects.
We present different anomaly segmentation datasets based on POC-generated data and show that POC can improve the performance of recent state-of-the-art anomaly fine-tuning methods.
arXiv Detail & Related papers (2024-02-26T08:32:41Z)
- Interclass Prototype Relation for Few-Shot Segmentation [0.0]
In few-shot segmentation, the target-class distribution in the feature space is sparse and has low coverage because of the limited variation in the few available samples.
This study proposes the Interclass Prototype Relation Network (IPRNet) which improves the separation performance by reducing the similarity between other classes.
arXiv Detail & Related papers (2022-11-16T05:27:52Z)
- Large-Margin Representation Learning for Texture Classification [67.94823375350433]
This paper presents a novel approach combining convolutional layers (CLs) and large-margin metric learning for training supervised models on small datasets for texture classification.
The experimental results on texture and histopathologic image datasets have shown that the proposed approach achieves competitive accuracy with lower computational cost and faster convergence when compared to equivalent CNNs.
arXiv Detail & Related papers (2022-06-17T04:07:45Z)
- Rethinking Semantic Segmentation: A Prototype View [126.59244185849838]
We present a nonparametric semantic segmentation model based on non-learnable prototypes.
Our framework yields compelling results over several datasets.
We expect this work will provoke a rethink of the current de facto semantic segmentation model design.
arXiv Detail & Related papers (2022-03-28T21:15:32Z)
- Scaling up Multi-domain Semantic Segmentation with Sentence Embeddings [81.09026586111811]
We propose an approach to semantic segmentation that achieves state-of-the-art supervised performance when applied in a zero-shot setting.
This is achieved by replacing each class label with a vector-valued embedding of a short paragraph that describes the class.
The resulting merged semantic segmentation dataset of over 2 million images enables training a model that achieves performance equal to that of state-of-the-art supervised methods on 7 benchmark datasets (a minimal sketch of this label-embedding idea appears after this list).
arXiv Detail & Related papers (2022-02-04T07:19:09Z)
- EOLO: Embedded Object Segmentation only Look Once [0.0]
We introduce an anchor-free, single-shot instance segmentation method that is conceptually simple (three independent branches), fully convolutional, and easy to embed into mobile and embedded devices.
Our method, referred to as EOLO, reformulates instance segmentation as jointly predicting semantic segmentation and distinguishing overlapping objects, via instance center classification and 4D distance regression at each pixel.
Without any bells and whistles, EOLO achieves 27.7% mask mAP at IoU50 and reaches 30 FPS on a 1080Ti GPU, with single-model and single-scale training/testing.
arXiv Detail & Related papers (2020-03-31T21:22:05Z)
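The sentence-embedding entry above is the closest in spirit to the main paper: class labels become fixed vectors, so the output layer no longer grows with the label set. Below is a minimal sketch of that substitution under my own assumptions; the sentence-transformers encoder, the class descriptions, and the cosine-similarity readout are illustrative stand-ins rather than that paper's implementation.
```python
# Sketch of segmentation with class labels replaced by text embeddings.
# The encoder, class descriptions, and cosine-similarity readout are assumptions
# for illustration; the referenced paper's own encoder and training are not shown here.
import torch
import torch.nn.functional as F
from sentence_transformers import SentenceTransformer

descriptions = {
    "road": "a paved surface on which cars drive",
    "sky": "the open sky above the horizon",
    "person": "a human being walking or standing",
}

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in text encoder
class_vecs = torch.tensor(encoder.encode(list(descriptions.values())))  # (C, D)
class_vecs = F.normalize(class_vecs, dim=1)

def classify_pixels(pixel_embeddings: torch.Tensor) -> torch.Tensor:
    """pixel_embeddings: (B, D, H, W) features projected into the text-embedding space.
    Returns per-pixel class indices by cosine similarity against the class vectors."""
    feats = F.normalize(pixel_embeddings, dim=1)
    sims = torch.einsum("bdhw,cd->bchw", feats, class_vecs)  # (B, C, H, W)
    return sims.argmax(dim=1)
```
Because new classes only require new description vectors, the same readout can in principle be reused across merged datasets or in a zero-shot setting.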
This list is automatically generated from the titles and abstracts of the papers in this site.