Augmentation Invariance and Adaptive Sampling in Semantic Segmentation
of Agricultural Aerial Images
- URL: http://arxiv.org/abs/2204.07969v1
- Date: Sun, 17 Apr 2022 10:19:07 GMT
- Title: Augmentation Invariance and Adaptive Sampling in Semantic Segmentation
of Agricultural Aerial Images
- Authors: Antonio Tavera, Edoardo Arnaudo, Carlo Masone, Barbara Caputo
- Abstract summary: We investigate the problem of Semantic Segmentation for agricultural aerial imagery.
The existing methods used for this task are designed without considering two characteristics of the aerial data.
We propose a solution based on two ideas: (i) we combine a set of suitable augmentations with a consistency loss to guide the model to learn semantic representations that are invariant to the photometric and geometric shifts typical of the top-down perspective; (ii) we use a sampling method that selects the training images based on the pixel-wise distribution of classes and the actual network confidence.
With an extensive set of experiments conducted on the Agriculture-Vision dataset, we demonstrate that our proposed strategies improve the performance of the current state-of-the-art method.
- Score: 16.101248613062292
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we investigate the problem of Semantic Segmentation for
agricultural aerial imagery. We observe that the existing methods used for this
task are designed without considering two characteristics of the aerial data:
(i) the top-down perspective implies that the model cannot rely on a fixed
semantic structure of the scene, because the same scene may be experienced with
different rotations of the sensor; (ii) there can be a strong imbalance in the
distribution of semantic classes because the relevant objects of the scene may
appear at extremely different scales (e.g., a field of crops and a small
vehicle). We propose a solution to these problems based on two ideas: (i) we
combine a set of suitable augmentations with a consistency loss to guide the
model to learn semantic representations that are invariant to the photometric
and geometric shifts typical of the top-down perspective (Augmentation
Invariance); (ii) we use a sampling method (Adaptive Sampling) that selects the
training images based on a measure of pixel-wise distribution of classes and
actual network confidence. With an extensive set of experiments conducted on
the Agriculture-Vision dataset, we demonstrate that our proposed strategies
improve the performance of the current state-of-the-art method.
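The two ideas in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the exact loss form, the augmentation set, and the sampling formula follow the paper, and the function names (`consistency_loss`, `sampling_weights`) and the rarity/uncertainty weighting shown here are illustrative assumptions.

```python
import numpy as np

def consistency_loss(pred, pred_aug, k):
    """Augmentation Invariance (sketch): penalize disagreement between the
    prediction on the original image and the prediction on a copy rotated by
    k * 90 degrees, after mapping the latter back to the original frame.
    pred, pred_aug: (C, H, W) per-class score maps."""
    # Undo the geometric shift so both maps live in the same frame.
    realigned = np.rot90(pred_aug, k=-k, axes=(1, 2))
    return float(np.mean((pred - realigned) ** 2))

def sampling_weights(class_freqs, confidences, alpha=0.5):
    """Adaptive Sampling (sketch): score each training image higher when it
    contains pixels of globally rare classes and when the network is still
    uncertain about it.
    class_freqs: (N, C) per-image pixel fraction of each class;
    confidences: (N,) mean per-image prediction confidence in [0, 1]."""
    dataset_freq = class_freqs.mean(axis=0)              # global class priors
    # Images rich in rare classes get a larger rarity score.
    rarity = class_freqs @ (1.0 / (dataset_freq + 1e-8))
    uncertainty = 1.0 - confidences
    scores = alpha * rarity + (1.0 - alpha) * uncertainty
    return scores / scores.sum()                         # sampling probabilities
```

For instance, if one image is mostly background and another contains many pixels of an under-represented class, `sampling_weights` assigns the second image a higher probability of being drawn for the next training batch.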
Related papers
- SatSynth: Augmenting Image-Mask Pairs through Diffusion Models for Aerial Semantic Segmentation [69.42764583465508]
We explore the potential of generative image diffusion to address the scarcity of annotated data in earth observation tasks.
To the best of our knowledge, we are the first to generate both images and corresponding masks for satellite segmentation.
arXiv Detail & Related papers (2024-03-25T10:30:22Z) - LeOCLR: Leveraging Original Images for Contrastive Learning of Visual Representations [4.680881326162484]
This paper introduces LeOCLR, a framework that employs a new instance discrimination approach and an adapted loss function to alleviate discarding semantic features during representation learning.
Our approach consistently improves representation learning across different datasets compared to baseline models.
arXiv Detail & Related papers (2024-03-11T15:33:32Z) - Flow Factorized Representation Learning [109.51947536586677]
We introduce a generative model which specifies a distinct set of latent probability paths that define different input transformations.
We show that our model achieves higher likelihoods on standard representation learning benchmarks while simultaneously being closer to approximately equivariant models.
arXiv Detail & Related papers (2023-09-22T20:15:37Z) - COSE: A Consistency-Sensitivity Metric for Saliency on Image
Classification [21.3855970055692]
We present a set of metrics that utilize vision priors to assess the performance of saliency methods on image classification tasks.
We show that although saliency methods are thought to be architecture-independent, most methods explain transformer-based models better than convolution-based models.
arXiv Detail & Related papers (2023-09-20T01:06:44Z) - Improving Diversity in Zero-Shot GAN Adaptation with Semantic Variations [61.132408427908175]
Zero-shot GAN adaptation aims to reuse well-trained generators to synthesize images of an unseen target domain.
With only a single representative text feature instead of real images, the synthesized images gradually lose diversity.
We propose a novel method to find semantic variations of the target text in the CLIP space.
arXiv Detail & Related papers (2023-08-21T08:12:28Z) - Domain Adaptation for Medical Image Segmentation using
Transformation-Invariant Self-Training [7.738197566031678]
We propose a semi-supervised learning strategy for domain adaptation termed transformation-invariant self-training (TI-ST).
The proposed method assesses pixel-wise pseudo-labels' reliability and filters out unreliable detections during self-training.
arXiv Detail & Related papers (2023-07-31T13:42:56Z) - A Hierarchical Transformation-Discriminating Generative Model for Few
Shot Anomaly Detection [93.38607559281601]
We devise a hierarchical generative model that captures the multi-scale patch distribution of each training image.
The anomaly score is obtained by aggregating the patch-based votes of the correct transformation across scales and image regions.
arXiv Detail & Related papers (2021-04-29T17:49:48Z) - Weakly supervised segmentation with cross-modality equivariant
constraints [7.757293476741071]
Weakly supervised learning has emerged as an appealing alternative to alleviate the need for large labeled datasets in semantic segmentation.
We present a novel learning strategy that leverages self-supervision in a multi-modal image scenario to significantly enhance original CAMs.
Our approach outperforms relevant recent literature under the same learning conditions.
arXiv Detail & Related papers (2021-04-06T13:14:20Z) - Unsupervised Discovery of Disentangled Manifolds in GANs [74.24771216154105]
Interpretable generation process is beneficial to various image editing applications.
We propose a framework to discover interpretable directions in the latent space given arbitrary pre-trained generative adversarial networks.
arXiv Detail & Related papers (2020-11-24T02:18:08Z) - Semantic Change Detection with Asymmetric Siamese Networks [71.28665116793138]
Given two aerial images, semantic change detection aims to locate the land-cover variations and identify their change types with pixel-wise boundaries.
This problem is vital in many earth vision related tasks, such as precise urban planning and natural resource management.
We present an asymmetric siamese network (ASN) to locate and identify semantic changes through feature pairs obtained from modules of widely different structures.
arXiv Detail & Related papers (2020-10-12T13:26:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.