Related papers: Nacala-Roof-Material: Drone Imagery for Roof Detection, Classification, and Segmentation to Support Mosquito-borne Disease Risk Assessment

Nacala-Roof-Material: Drone Imagery for Roof Detection, Classification, and Segmentation to Support Mosquito-borne Disease Risk Assessment

URL: http://arxiv.org/abs/2406.04949v1
Date: Fri, 7 Jun 2024 14:07:23 GMT
Title: Nacala-Roof-Material: Drone Imagery for Roof Detection, Classification, and Segmentation to Support Mosquito-borne Disease Risk Assessment
Authors: Venkanna Babu Guthula, Stefan Oehmcke, Remigio Chilaule, Hui Zhang, Nico Lang, Ankit Kariryaa, Johan Mottelson, Christian Igel,
Abstract summary: We release the Nacala-Roof-Material dataset, which contains high-resolution drone images from Mozambique. The dataset defines a multi-task computer vision problem, comprising object detection, classification, and segmentation. We show that each of the methods has its advantages but none is superior on all tasks, which highlights the potential of our dataset for future research in multi-task learning.
Score: 6.858584004690944
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: As low-quality housing and in particular certain roof characteristics are associated with an increased risk of malaria, classification of roof types based on remote sensing imagery can support the assessment of malaria risk and thereby help prevent the disease. To support research in this area, we release the Nacala-Roof-Material dataset, which contains high-resolution drone images from Mozambique with corresponding labels delineating houses and specifying their roof types. The dataset defines a multi-task computer vision problem, comprising object detection, classification, and segmentation. In addition, we benchmarked various state-of-the-art approaches on the dataset. Canonical U-Nets, YOLOv8, and a custom decoder on pretrained DINOv2 served as baselines. We show that each of the methods has its advantages but none is superior on all tasks, which highlights the potential of our dataset for future research in multi-task learning. While the tasks are closely related, accurate segmentation of objects does not necessarily imply accurate instance separation, and vice versa. We address this general issue by introducing a variant of the deep ordinal watershed (DOW) approach that additionally separates the interior of objects, allowing for improved object delineation and separation. We show that our DOW variant is a generic approach that improves the performance of both U-Net and DINOv2 backbones, leading to a better trade-off between semantic segmentation and instance segmentation.

Related papers

RemoteSAM: Towards Segment Anything for Earth Observation [29.707796048411705]
We aim to develop a robust yet flexible visual foundation model for Earth observation.<n>It should possess strong capabilities in recognizing and localizing diverse visual targets.<n>We present RemoteSAM, a foundation model that establishes new SoTA on several earth observation perception benchmarks.
arXiv Detail & Related papers (2025-05-23T15:27:57Z)
Generalized Semantic Contrastive Learning via Embedding Side Information for Few-Shot Object Detection [52.490375806093745]
The objective of few-shot object detection (FSOD) is to detect novel objects with few training samples. We introduce the side information to alleviate the negative influences derived from the feature space and sample viewpoints. Our model outperforms the previous state-of-the-art methods, significantly improving the ability of FSOD in most shots/splits.
arXiv Detail & Related papers (2025-04-09T17:24:05Z)
A Dataset for Semantic Segmentation in the Presence of Unknowns [49.795683850385956]
Existing datasets allow evaluation of only knowns or unknowns - but not both. We propose a novel anomaly segmentation dataset, ISSU, that features a diverse set of anomaly inputs from cluttered real-world environments. The dataset is twice larger than existing anomaly segmentation datasets.
arXiv Detail & Related papers (2025-03-28T10:31:01Z)
Downstream-Pretext Domain Knowledge Traceback for Active Learning [138.02530777915362]
We propose a downstream-pretext domain knowledge traceback (DOKT) method that traces the data interactions of downstream knowledge and pre-training guidance. DOKT consists of a traceback diversity indicator and a domain-based uncertainty estimator. Experiments conducted on ten datasets show that our model outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-07-20T01:34:13Z)
Learning Feature Inversion for Multi-class Anomaly Detection under General-purpose COCO-AD Benchmark [101.23684938489413]
Anomaly detection (AD) is often focused on detecting anomalies for industrial quality inspection and medical lesion examination. This work first constructs a large-scale and general-purpose COCO-AD dataset by extending COCO to the AD field. Inspired by the metrics in the segmentation field, we propose several more practical threshold-dependent AD-specific metrics.
arXiv Detail & Related papers (2024-04-16T17:38:26Z)
SepVAE: a contrastive VAE to separate pathological patterns from healthy ones [2.619659560375341]
Contrastive Analysis VAE (CA-VAEs) is a family of Variational auto-encoders (VAEs) that aims at separating the common factors of variation between a background dataset (BG) and a target dataset (TG) We show a better performance than previous CA-VAEs methods on three medical applications and a natural images dataset (CelebA)
arXiv Detail & Related papers (2023-07-12T14:52:21Z)
Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner. We design a semantic-guided self-supervised learning model to extract high-level semantic features from images. We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z)
INoD: Injected Noise Discriminator for Self-Supervised Representation Learning in Agricultural Fields [6.891600948991265]
We propose an Injected Noise Discriminator (INoD) which exploits principles of feature replacement and dataset discrimination for self-supervised representation learning. INoD interleaves feature maps from two disjoint datasets during their convolutional encoding and predicts the dataset affiliation of the resultant feature map as a pretext task. Our approach enables the network to learn unequivocal representations of objects seen in one dataset while observing them in conjunction with similar features from the disjoint dataset.
arXiv Detail & Related papers (2023-03-31T14:46:31Z)
Learning Confident Classifiers in the Presence of Label Noise [5.829762367794509]
This paper proposes a probabilistic model for noisy observations that allows us to build a confident classification and segmentation models. Our experiments show that our algorithm outperforms state-of-the-art solutions for the considered classification and segmentation problems.
arXiv Detail & Related papers (2023-01-02T04:27:25Z)
VAESim: A probabilistic approach for self-supervised prototype discovery [0.23624125155742057]
We propose an architecture for image stratification based on a conditional variational autoencoder. We use a continuous latent space to represent the continuum of disorders and find clusters during training, which can then be used for image/patient stratification. We demonstrate that our method outperforms baselines in terms of kNN accuracy measured on a classification task against a standard VAE.
arXiv Detail & Related papers (2022-09-25T17:55:31Z)
A Contrastive Distillation Approach for Incremental Semantic Segmentation in Aerial Images [15.75291664088815]
A major issue concerning current deep neural architectures is known as catastrophic forgetting. We propose a contrastive regularization, where any given input is compared with its augmented version. We show the effectiveness of our solution on the Potsdam dataset, outperforming the incremental baseline in every test.
arXiv Detail & Related papers (2021-12-07T16:44:45Z)
Unsupervised Part Discovery from Contrastive Reconstruction [90.88501867321573]
The goal of self-supervised visual representation learning is to learn strong, transferable image representations. We propose an unsupervised approach to object part discovery and segmentation. Our method yields semantic parts consistent across fine-grained but visually distinct categories.
arXiv Detail & Related papers (2021-11-11T17:59:42Z)
Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations [78.12377360145078]
Contrastive self-supervised learning has outperformed supervised pretraining on many downstream tasks like segmentation and object detection. In this paper, we first study how biases in the dataset affect existing methods. We show that current contrastive approaches work surprisingly well across: (i) object- versus scene-centric, (ii) uniform versus long-tailed and (iii) general versus domain-specific datasets.
arXiv Detail & Related papers (2021-06-10T17:59:13Z)
Unsupervised Domain Adaption of Object Detectors: A Survey [87.08473838767235]
Recent advances in deep learning have led to the development of accurate and efficient models for various computer vision applications. Learning highly accurate models relies on the availability of datasets with a large number of annotated images. Due to this, model performance drops drastically when evaluated on label-scarce datasets having visually distinct images.
arXiv Detail & Related papers (2021-05-27T23:34:06Z)

This list is automatically generated from the titles and abstracts of the papers in this site.