Intra-Batch Supervision for Panoptic Segmentation on High-Resolution
Images
- URL: http://arxiv.org/abs/2304.08222v1
- Date: Mon, 17 Apr 2023 12:48:36 GMT
- Title: Intra-Batch Supervision for Panoptic Segmentation on High-Resolution
Images
- Authors: Daan de Geus, Gijs Dubbelman
- Abstract summary: Unified panoptic segmentation methods are achieving state-of-the-art results on several datasets.
To achieve these results on high-resolution datasets, these methods apply crop-based training.
We find that, although crop-based training is advantageous in general, it also has a harmful side-effect.
We propose Intra-Batch Supervision (IBS), which improves a network's ability to discriminate between instances.
- Score: 4.314956204483074
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unified panoptic segmentation methods are achieving state-of-the-art results
on several datasets. To achieve these results on high-resolution datasets,
these methods apply crop-based training. In this work, we find that, although
crop-based training is advantageous in general, it also has a harmful
side-effect. Specifically, it limits the ability of unified networks to
discriminate between large object instances, causing them to make predictions
that are confused between multiple instances. To solve this, we propose
Intra-Batch Supervision (IBS), which improves a network's ability to
discriminate between instances by introducing additional supervision using
multiple images from the same batch. We show that, with our IBS, we
successfully address the confusion problem and consistently improve the
performance of unified networks. For the high-resolution Cityscapes and
Mapillary Vistas datasets, we achieve improvements of up to +2.5 on the
Panoptic Quality for thing classes, and even more considerable gains of up to
+5.8 on both the pixel accuracy and pixel precision, which we identify as
better metrics to capture the confusion problem.
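The abstract highlights pixel accuracy and pixel precision as better metrics for capturing instance confusion. As a rough illustrative sketch (not the authors' exact evaluation code, whose precise definitions may differ), these can be computed per matched instance mask as the fraction of ground-truth pixels recovered and the fraction of predicted pixels that are correct:

```python
import numpy as np

def pixel_accuracy_precision(pred, gt):
    """Pixel accuracy and pixel precision for a predicted instance mask
    against its matched ground-truth mask.

    pred, gt: boolean arrays of the same shape.
    """
    tp = np.logical_and(pred, gt).sum()
    accuracy = tp / max(gt.sum(), 1)      # fraction of GT pixels recovered
    precision = tp / max(pred.sum(), 1)   # fraction of predicted pixels correct
    return accuracy, precision

# Toy example: a prediction that bleeds into a neighbouring instance.
gt = np.zeros((4, 4), dtype=bool)
gt[:, :2] = True            # ground-truth instance occupies the left half
pred = np.zeros((4, 4), dtype=bool)
pred[:, :3] = True          # prediction spills one column to the right

acc, prec = pixel_accuracy_precision(pred, gt)
print(acc, prec)  # accuracy 1.0, precision 8/12
```

A prediction confused between two large instances covers all its own ground-truth pixels (high accuracy) but also many pixels of a neighbouring instance, which shows up as low precision; this is why these metrics expose the confusion problem more directly than Panoptic Quality.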
Related papers
- Fine-grained Recognition with Learnable Semantic Data Augmentation [68.48892326854494]
Fine-grained image recognition is a longstanding computer vision challenge.
We propose diversifying the training data at the feature-level to alleviate the discriminative region loss problem.
Our method significantly improves the generalization performance on several popular classification networks.
arXiv Detail & Related papers (2023-09-01T11:15:50Z)
- Scale-aware Super-resolution Network with Dual Affinity Learning for Lesion Segmentation from Medical Images [50.76668288066681]
We present a scale-aware super-resolution network to adaptively segment lesions of various sizes from low-resolution medical images.
Our proposed network achieved consistent improvements compared to other state-of-the-art methods.
arXiv Detail & Related papers (2023-05-30T14:25:55Z)
- Improving Pixel-Level Contrastive Learning by Leveraging Exogenous Depth Information [7.561849435043042]
Self-supervised representation learning based on Contrastive Learning (CL) has been the subject of much attention in recent years.
In this paper, we focus on depth information, which can be obtained using a depth estimation network or measured from available data.
We show that using this depth information in the contrastive loss leads to improved results, and that the learned representations better follow the shapes of objects.
arXiv Detail & Related papers (2022-11-18T11:45:39Z)
- Deep Semantic Statistics Matching (D2SM) Denoising Network [70.01091467628068]
We introduce the Deep Semantic Statistics Matching (D2SM) Denoising Network.
It exploits semantic features of pretrained classification networks and implicitly matches the probabilistic distribution of clear images in the semantic feature space.
By learning to preserve the semantic distribution of denoised images, we empirically find our method significantly improves the denoising capabilities of networks.
arXiv Detail & Related papers (2022-07-19T14:35:42Z)
- Semantically Accurate Super-Resolution Generative Adversarial Networks [2.0454959820861727]
We propose a novel architecture and domain-specific feature loss to increase the performance of semantic segmentation.
We show the proposed approach improves perceived image quality as well as quantitative segmentation accuracy across all prediction classes.
This work demonstrates that jointly considering image-based and task-specific losses can improve the performance of both, and advances the state-of-the-art in semantic-aware super-resolution of aerial imagery.
arXiv Detail & Related papers (2022-05-17T23:05:27Z)
- Solar Potential Assessment using Multi-Class Buildings Segmentation from Aerial Images [3.180674374101366]
We exploit the power of fully convolutional neural networks for an instance segmentation task by adding extra classes to the output.
We also show that CutMix data augmentation and the One-Cycle learning rate policy are effective regularization methods for achieving a better fit on the training data.
arXiv Detail & Related papers (2021-11-22T18:16:07Z)
- Railroad is not a Train: Saliency as Pseudo-pixel Supervision for Weakly Supervised Semantic Segmentation [16.560870740946275]
Explicit Pseudo-pixel Supervision (EPS) learns from pixel-level feedback by combining two weak supervisions.
We devise a joint training strategy to fully utilize the complementary relationship between both types of information.
Our method can obtain accurate object boundaries and discard co-occurring pixels, thereby significantly improving the quality of pseudo-masks.
arXiv Detail & Related papers (2021-05-19T07:31:11Z)
- Adversarial Semantic Data Augmentation for Human Pose Estimation [96.75411357541438]
We propose Semantic Data Augmentation (SDA), a method that augments images by pasting segmented body parts with various semantic granularity.
We also propose Adversarial Semantic Data Augmentation (ASDA), which exploits a generative network to dynamically predict tailored pasting configurations.
State-of-the-art results are achieved on challenging benchmarks.
arXiv Detail & Related papers (2020-08-03T07:56:04Z)
- Deep Semantic Matching with Foreground Detection and Cycle-Consistency [103.22976097225457]
We address weakly supervised semantic matching based on a deep network.
We explicitly estimate the foreground regions to suppress the effect of background clutter.
We develop cycle-consistent losses to enforce the predicted transformations across multiple images to be geometrically plausible and consistent.
arXiv Detail & Related papers (2020-03-31T22:38:09Z)
- Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently-used VGG feature matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global contents consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.