Panoptic Segmentation Meets Remote Sensing
- URL: http://arxiv.org/abs/2111.12126v1
- Date: Tue, 23 Nov 2021 19:48:55 GMT
- Title: Panoptic Segmentation Meets Remote Sensing
- Authors: Osmar Luiz Ferreira de Carvalho, Osmar Ab\'ilio de Carvalho J\'unior,
Cristiano Rosa e Silva, Anesmar Olino de Albuquerque, Nickolas Castro
Santana, Dibio Leandro Borges, Roberto Arnaldo Trancoso Gomes, Renato Fontes
Guimar\~aes
- Abstract summary: Panoptic segmentation combines instance and semantic predictions, allowing the detection of "things" and "stuff" simultaneously.
This study aims to solve and increase the operability of panoptic segmentation in remote sensing.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Panoptic segmentation combines instance and semantic predictions, allowing
the detection of "things" and "stuff" simultaneously. Effectively approaching
panoptic segmentation in remotely sensed data can be auspicious in many
challenging problems since it allows continuous mapping and specific target
counting. Several difficulties have prevented the growth of this task in remote
sensing: (a) most algorithms are designed for traditional images, (b) image
labelling must encompass "things" and "stuff" classes, and (c) the annotation
format is complex. Thus, aiming to solve and increase the operability of
panoptic segmentation in remote sensing, this study has five objectives: (1)
create a novel data preparation pipeline for panoptic segmentation, (2) propose
an annotation conversion software to generate panoptic annotations; (3) propose
a novel dataset on urban areas, (4) modify the Detectron2 for the task, and (5)
evaluate difficulties of this task in the urban setting. We used an aerial
image with a 0,24-meter spatial resolution considering 14 classes. Our pipeline
considers three image inputs, and the proposed software uses point shapefiles
for creating samples in the COCO format. Our study generated 3,400 samples with
512x512 pixel dimensions. We used the Panoptic-FPN with two backbones
(ResNet-50 and ResNet-101), and the model evaluation considered semantic
instance and panoptic metrics. We obtained 93.9, 47.7, and 64.9 for the mean
IoU, box AP, and PQ. Our study presents the first effective pipeline for
panoptic segmentation and an extensive database for other researchers to use
and deal with other data or related problems requiring a thorough scene
understanding.
Related papers
- Depth-aware Panoptic Segmentation [1.4170154234094008]
We present a novel CNN-based method for panoptic segmentation.
We propose a new depth-aware dice loss term which penalises the assignment of pixels to the same thing instance.
Experiments carried out on the Cityscapes dataset show that the proposed method reduces the number of objects that are erroneously merged into one thing instance.
arXiv Detail & Related papers (2024-03-21T08:06:49Z) - Geometric-aware Pretraining for Vision-centric 3D Object Detection [77.7979088689944]
We propose a novel geometric-aware pretraining framework called GAPretrain.
GAPretrain serves as a plug-and-play solution that can be flexibly applied to multiple state-of-the-art detectors.
We achieve 46.2 mAP and 55.5 NDS on the nuScenes val set using the BEVFormer method, with a gain of 2.7 and 2.1 points, respectively.
arXiv Detail & Related papers (2023-04-06T14:33:05Z) - Two-Stream Graph Convolutional Network for Intra-oral Scanner Image
Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes.
Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z) - Highly Accurate Dichotomous Image Segmentation [139.79513044546]
A new task called dichotomous image segmentation (DIS) aims to segment highly accurate objects from natural images.
We collect the first large-scale dataset, DIS5K, which contains 5,470 high-resolution (e.g., 2K, 4K or larger) images.
We also introduce a simple intermediate supervision baseline (IS-Net) using both feature-level and mask-level guidance for DIS model training.
arXiv Detail & Related papers (2022-03-06T20:09:19Z) - Bounding Box-Free Instance Segmentation Using Semi-Supervised Learning
for Generating a City-Scale Vehicle Dataset [0.0]
Vehicle classification is a hot computer vision topic, with studies ranging from ground-view up to top-view imagery.
In this paper, we propose a novel semi-supervised iterative learning approach using GIS software.
The results show better pixel-wise metrics when compared to the Mask-RCNN (82% against 67% in IoU)
arXiv Detail & Related papers (2021-11-23T19:42:12Z) - Learning Co-segmentation by Segment Swapping for Retrieval and Discovery [67.6609943904996]
The goal of this work is to efficiently identify visually similar patterns from a pair of images.
We generate synthetic training pairs by selecting object segments in an image and copy-pasting them into another image.
We show our approach provides clear improvements for artwork details retrieval on the Brueghel dataset.
arXiv Detail & Related papers (2021-10-29T16:51:16Z) - PAENet: A Progressive Attention-Enhanced Network for 3D to 2D Retinal
Vessel Segmentation [0.0]
3D to 2D retinal vessel segmentation is a challenging problem in Optical Coherence Tomography Angiography ( OCTA) images.
We propose a Progressive Attention-Enhanced Network (PAENet) based on attention mechanisms to extract rich feature representation.
Our proposed algorithm achieves state-of-the-art performance compared with previous methods.
arXiv Detail & Related papers (2021-08-26T10:27:25Z) - Pairwise Relation Learning for Semi-supervised Gland Segmentation [90.45303394358493]
We propose a pairwise relation-based semi-supervised (PRS2) model for gland segmentation on histology images.
This model consists of a segmentation network (S-Net) and a pairwise relation network (PR-Net)
We evaluate our model against five recent methods on the GlaS dataset and three recent methods on the CRAG dataset.
arXiv Detail & Related papers (2020-08-06T15:02:38Z) - A Nearest Neighbor Network to Extract Digital Terrain Models from 3D
Point Clouds [1.6249267147413524]
We present an algorithm that operates on 3D-point clouds and estimates the underlying DTM for the scene using an end-to-end approach.
Our model learns neighborhood information and seamlessly integrates this with point-wise and block-wise global features.
arXiv Detail & Related papers (2020-05-21T15:54:55Z) - Unifying Training and Inference for Panoptic Segmentation [111.44758195510838]
We present an end-to-end network to bridge the gap between training and inference for panoptic segmentation.
Our system sets new records on the popular street scene dataset, Cityscapes, achieving 61.4 PQ with a ResNet-50 backbone.
Our network flexibly works with and without object mask cues, performing competitively under both settings.
arXiv Detail & Related papers (2020-01-14T18:58:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.