Aerial Imagery Pixel-level Segmentation
- URL: http://arxiv.org/abs/2012.02024v1
- Date: Thu, 3 Dec 2020 16:09:09 GMT
- Title: Aerial Imagery Pixel-level Segmentation
- Authors: Michael R. Heffels and Joaquin Vanschoren
- Abstract summary: We bridge the performance gap between popular datasets and aerial imagery data.
Our work, using the state-of-the-art DeepLabv3+ Xception65 architecture, achieves a mean IOU of 70% on the DroneDeploy validation set.
- Score: 0.4079265319364249
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Aerial imagery can be used for important work on a global scale.
Nevertheless, the analysis of this data using neural network architectures lags
behind the current state-of-the-art on popular datasets such as PASCAL VOC,
Cityscapes and CamVid. In this paper we bridge the performance gap between
these popular datasets and aerial imagery data. Little work has been done on aerial
imagery with state-of-the-art neural network architectures in a multi-class
setting. Our experiments concerning data augmentation, normalisation, image
size and loss functions give insight into a high performance setup for aerial
imagery segmentation datasets. Our work, using the state-of-the-art DeepLabv3+
Xception65 architecture, achieves a mean IOU of 70% on the DroneDeploy
validation set. With this result, we clearly outperform the current publicly
available state-of-the-art validation set mIOU (65%) by 5 percentage points.
Furthermore, to our knowledge, there is no mIOU benchmark for the test set.
Hence, we also propose a new benchmark on the DroneDeploy test set using the
best performing DeepLabv3+ Xception65 architecture, with an mIOU score of 52.5%.
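The abstract reports its results as mean IOU (mIOU), the per-class intersection-over-union averaged over classes. The following is a minimal sketch of that metric on flattened label arrays, not the authors' evaluation code: the function names, the choice to skip classes absent from both ground truth and prediction, and the toy labels are all assumptions made for illustration.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou(y_true: Sequence[int], y_pred: Sequence[int],
             num_classes: int) -> float:
    """Average IoU = TP / (TP + FP + FN) over the classes that occur."""
    cm = confusion_matrix(y_true, y_pred, num_classes)
    ious = []
    for c in range(num_classes):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                                  # true c, predicted other
        fp = sum(cm[r][c] for r in range(num_classes)) - tp   # predicted c, true other
        denom = tp + fp + fn
        if denom == 0:
            continue  # assumption: ignore classes absent from both arrays
        ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with three hypothetical classes and six pixels.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(mean_iou(y_true, y_pred, num_classes=3))  # per-class IoU: 1/3, 2/3, 1/2
```

For a real segmentation mask, the 2D label maps would be flattened first. Benchmarks often also exclude an "ignore" label before counting, which this sketch does not handle.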
Related papers
- From Blurry to Brilliant Detection: YOLOv5-Based Aerial Object Detection with Super Resolution [4.107182710549721]
We present an innovative approach that combines super-resolution and an adapted lightweight YOLOv5 architecture.
Our experimental results demonstrate the model's superior performance in detecting small and densely clustered objects.
arXiv Detail & Related papers (2024-01-26T05:50:58Z)
- Raising the Bar of AI-generated Image Detection with CLIP [50.345365081177555]
The aim of this work is to explore the potential of pre-trained vision-language models (VLMs) for universal detection of AI-generated images.
We develop a lightweight detection strategy based on CLIP features and study its performance in a wide variety of challenging scenarios.
arXiv Detail & Related papers (2023-11-30T21:11:20Z)
- Memory-Efficient Graph Convolutional Networks for Object Classification and Detection with Event Cameras [2.3311605203774395]
Graph convolutional networks (GCNs) are a promising approach for analyzing event data.
In this paper, we consider both factors together in order to achieve satisfying results and relatively low model complexity.
Our results show a 450-fold reduction in the number of parameters for the feature extraction module and a 4.5-fold reduction in the size of the data representation.
arXiv Detail & Related papers (2023-07-26T11:44:44Z)
- Efficient Deduplication and Leakage Detection in Large Scale Image Datasets with a focus on the CrowdAI Mapping Challenge Dataset [5.149242555705579]
We propose a drop-in pipeline that employs perceptual hashing techniques for efficient de-duplication of the dataset.
In our experiments, we demonstrate that nearly 250k (~90%) images in the training split were identical.
Our analysis on the validation split demonstrates that roughly 56k of the 60k images also appear in the training split, resulting in a data leakage of 93%.
arXiv Detail & Related papers (2023-04-05T08:36:17Z)
- BuildSeg: A General Framework for the Segmentation of Buildings [19.296282254565885]
Building segmentation from aerial images and 3D laser scanning (LiDAR) is a challenging task due to the diversity of backgrounds, building textures, and image quality.
We propose a general framework termed BuildSeg employing a generic approach that can be quickly applied to segment buildings.
arXiv Detail & Related papers (2023-01-15T21:09:00Z)
- μSplit: efficient image decomposition for microscopy data [50.794670705085835]
μSplit is a dedicated approach for trained image decomposition in the context of fluorescence microscopy images.
We introduce lateral contextualization (LC), a novel meta-architecture that enables the memory efficient incorporation of large image-context.
We apply μSplit to five decomposition tasks, one on a synthetic dataset, four others derived from real microscopy data.
arXiv Detail & Related papers (2022-11-23T11:26:24Z)
- Inverse Image Frequency for Long-tailed Image Recognition [59.40098825416675]
We propose a novel de-biasing method named Inverse Image Frequency (IIF).
IIF is a multiplicative margin adjustment transformation of the logits in the classification layer of a convolutional neural network.
Our experiments show that IIF surpasses the state of the art on many long-tailed benchmarks.
arXiv Detail & Related papers (2022-09-11T13:31:43Z)
- Simple and Effective Synthesis of Indoor 3D Scenes [78.95697556834536]
We study the problem of synthesizing immersive 3D indoor scenes from one or more images.
Our aim is to generate high-resolution images and videos from novel viewpoints.
We propose an image-to-image GAN that maps directly from reprojections of incomplete point clouds to full high-resolution RGB-D images.
arXiv Detail & Related papers (2022-04-06T17:54:46Z)
- A lightweight and accurate YOLO-like network for small target detection in Aerial Imagery [94.78943497436492]
We present YOLO-S, a simple, fast and efficient network for small target detection.
YOLO-S exploits a small feature extractor based on Darknet20, as well as skip connection, via both bypass and concatenation.
YOLO-S has 87% fewer parameters than YOLOv3 and requires almost half the FLOPs, making deployment practical for low-power industrial applications.
arXiv Detail & Related papers (2022-04-05T16:29:49Z)
- Deep ensembles based on Stochastic Activation Selection for Polyp Segmentation [82.61182037130406]
This work deals with medical image segmentation and in particular with accurate polyp detection and segmentation during colonoscopy examinations.
The basic architecture in image segmentation consists of an encoder and a decoder.
We compare some variants of the DeepLab architecture obtained by varying the decoder backbone.
arXiv Detail & Related papers (2021-04-02T02:07:37Z)
- RGB2LIDAR: Towards Solving Large-Scale Cross-Modal Visual Localization [20.350871370274238]
We study an important, yet largely unexplored problem of large-scale cross-modal visual localization.
We introduce a new dataset containing over 550K pairs of RGB and aerial LIDAR depth images.
We propose a novel joint embedding based method that effectively combines the appearance and semantic cues from both modalities.
arXiv Detail & Related papers (2020-09-12T01:18:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.