Related papers: GMT: Guided Mask Transformer for Leaf Instance Segmentation

URL: http://arxiv.org/abs/2406.17109v2
Date: Wed, 11 Sep 2024 14:32:51 GMT
Title: GMT: Guided Mask Transformer for Leaf Instance Segmentation
Authors: Feng Chen, Sotirios A. Tsaftaris, Mario Valerio Giuffrida
Abstract summary: Leaf instance segmentation is a challenging task, aiming to separate and delineate each leaf in an image of a plant. We propose the Guided Mask Transformer (GMT), which leverages and integrates leaf spatial distribution priors into a Transformer-based segmentor. Our GMT consistently outperforms the state-of-the-art on three public plant datasets.
Score: 14.458970589296554
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Leaf instance segmentation is a challenging multi-instance segmentation task, aiming to separate and delineate each leaf in an image of a plant. Accurate segmentation of each leaf is crucial for plant-related applications such as the fine-grained monitoring of plant growth and crop yield estimation. This task is challenging because of the high similarity (in shape and colour), great size variation, and heavy occlusions among leaf instances. Furthermore, the typically small size of annotated leaf datasets makes it more difficult to learn the distinctive features needed for precise segmentation. We hypothesise that the key to overcoming these challenges lies in the specific spatial patterns of leaf distribution. In this paper, we propose the Guided Mask Transformer (GMT), which leverages and integrates leaf spatial distribution priors into a Transformer-based segmentor. These spatial priors are embedded in a set of guide functions that map leaves at different positions into a more separable embedding space. Our GMT consistently outperforms the state-of-the-art on three public plant datasets.

Related papers

Comprehensive Generative Replay for Task-Incremental Segmentation with Concurrent Appearance and Semantic Forgetting [49.87694319431288]
Generalist segmentation models are increasingly favored for diverse tasks involving various objects from different image sources. We propose a Comprehensive Generative Replay (CGR) framework that restores appearance and semantic knowledge by synthesizing image-mask pairs. Experiments on incremental tasks (cardiac, fundus and prostate segmentation) show its clear advantage for alleviating concurrent appearance and semantic forgetting.
arXiv Detail & Related papers (2024-06-28T10:05:58Z)
Leveraging 2D Information for Long-term Time Series Forecasting with Vanilla Transformers [55.475142494272724]
Time series prediction is crucial for understanding and forecasting complex dynamics in various domains. We introduce GridTST, a model that combines the benefits of two approaches using innovative multi-directional attentions. The model consistently delivers state-of-the-art performance across various real-world datasets.
arXiv Detail & Related papers (2024-05-22T16:41:21Z)
Unsupervised Pre-Training for 3D Leaf Instance Segmentation [34.122575664767915]
This paper addresses the problem of reducing the labeling effort required to perform leaf instance segmentation on 3D point clouds. We propose a novel self-supervised task-specific pre-training approach to initialize the backbone of a network for leaf instance segmentation. We also introduce a novel automatic postprocessing that considers the difficulty of correctly segmenting the points close to the stem.
arXiv Detail & Related papers (2024-01-16T08:11:08Z)
HGFormer: Hierarchical Grouping Transformer for Domain Generalized Semantic Segmentation [113.6560373226501]
This work studies semantic segmentation under the domain generalization setting. We propose a novel hierarchical grouping transformer (HGFormer) to explicitly group pixels to form part-level masks and then whole-level masks. Experiments show that HGFormer yields more robust semantic segmentation results than per-pixel classification methods and flat grouping transformers.
arXiv Detail & Related papers (2023-05-22T13:33:41Z)
Hierarchical Approach for Joint Semantic, Plant Instance, and Leaf Instance Segmentation in the Agricultural Domain [29.647846446064992]
Plant phenotyping is a central task in agriculture, as it describes plants' growth stage, development, and other relevant quantities. In this paper, we address the problem of joint semantic, plant instance, and leaf instance segmentation of crop fields from RGB data. We propose a single convolutional neural network that addresses the three tasks simultaneously, exploiting their underlying hierarchical structure.
arXiv Detail & Related papers (2022-10-14T15:01:08Z)
Statistical shape representations for temporal registration of plant components in 3D [5.349852254138086]
We demonstrate how using shape features improves temporal organ matching. This is essential for robotic crop monitoring, which enables whole-of-lifecycle phenotyping.
arXiv Detail & Related papers (2022-09-23T11:11:10Z)
CloudAttention: Efficient Multi-Scale Attention Scheme For 3D Point Cloud Learning [81.85951026033787]
We set transformers in this work and incorporate them into a hierarchical framework for shape classification and part and scene segmentation. We also compute efficient and dynamic global cross attentions by leveraging sampling and grouping at each iteration. The proposed hierarchical model achieves state-of-the-art shape classification in mean accuracy and yields results on par with the previous segmentation methods.
arXiv Detail & Related papers (2022-07-31T21:39:15Z)
MISSU: 3D Medical Image Segmentation via Self-distilling TransUNet [55.16833099336073]
We propose to self-distill a Transformer-based UNet for medical image segmentation. It simultaneously learns global semantic information and local spatial-detailed features. Our MISSU achieves the best performance over previous state-of-the-art methods.
arXiv Detail & Related papers (2022-06-02T07:38:53Z)
GrowliFlower: An image time series dataset for GROWth analysis of cauLIFLOWER [2.8247971782279615]
This article presents GrowliFlower, an image-based UAV time series dataset of two monitored cauliflower fields of size 0.39 and 0.60 ha acquired in 2020 and 2021. The dataset contains RGB and multispectral orthophotos from which about 14,000 individual plant coordinates are derived and provided. The dataset contains collected phenotypic traits of 740 plants, including the developmental stage as well as plant and cauliflower size.
arXiv Detail & Related papers (2022-04-01T08:56:59Z)
Grasp-Oriented Fine-grained Cloth Segmentation without Real Supervision [66.56535902642085]
This paper tackles the problem of fine-grained region detection in deformed clothes using only a depth image. We define up to 6 semantic regions of varying extent, including edges on the neckline, sleeve cuffs, and hem, plus top and bottom grasping points. We introduce a U-net based network to segment and label these parts. We show that training our network solely with synthetic data and the proposed domain adaptation (DA) yields results competitive with models trained on real data.
arXiv Detail & Related papers (2021-10-06T16:31:20Z)
Semantic Attention and Scale Complementary Network for Instance Segmentation in Remote Sensing Images [54.08240004593062]
We propose an end-to-end multi-category instance segmentation model, which consists of a Semantic Attention (SEA) module and a Scale Complementary Mask Branch (SCMB). The SEA module contains a simple fully convolutional semantic segmentation branch with extra supervision to strengthen the activation of instances of interest on the feature map. SCMB extends the original single mask branch to trident mask branches and introduces complementary mask supervision at different scales.
arXiv Detail & Related papers (2021-07-25T08:53:59Z)
CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image Segmentation [95.51455777713092]
Convolutional neural networks (CNNs) have been the de facto standard for 3D medical image segmentation. We propose a novel framework that efficiently bridges a Convolutional neural network and a Transformer (CoTr) for accurate 3D medical image segmentation.
arXiv Detail & Related papers (2021-03-04T13:34:22Z)
Unsupervised Domain Adaptation For Plant Organ Counting [12.424350934766704]
Counting plant organs for image-based plant phenotyping falls within this category. In this paper, we propose a domain-adversarial learning approach for domain adaptation of density map estimation.
arXiv Detail & Related papers (2020-09-02T13:57:09Z)
Spatial Pyramid Based Graph Reasoning for Semantic Segmentation [67.47159595239798]
We apply graph convolution into the semantic segmentation task and propose an improved Laplacian. The graph reasoning is directly performed in the original feature space organized as a spatial pyramid. We achieve comparable performance with advantages in computational and memory overhead.
arXiv Detail & Related papers (2020-03-23T12:28:07Z)

This list is automatically generated from the titles and abstracts of the papers on this site.