GMT: Guided Mask Transformer for Leaf Instance Segmentation
- URL: http://arxiv.org/abs/2406.17109v2
- Date: Wed, 11 Sep 2024 14:32:51 GMT
- Title: GMT: Guided Mask Transformer for Leaf Instance Segmentation
- Authors: Feng Chen, Sotirios A. Tsaftaris, Mario Valerio Giuffrida
- Abstract summary: Leaf instance segmentation is a challenging task, aiming to separate and delineate each leaf in an image of a plant.
We propose the Guided Mask Transformer (GMT), which leverages and integrates leaf spatial distribution priors into a Transformer-based segmentor.
Our GMT consistently outperforms the state-of-the-art on three public plant datasets.
- Score: 14.458970589296554
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Leaf instance segmentation is a challenging multi-instance segmentation task, aiming to separate and delineate each leaf in an image of a plant. Accurate segmentation of each leaf is crucial for plant-related applications such as the fine-grained monitoring of plant growth and crop yield estimation. This task is challenging because of the high similarity (in shape and colour), great size variation, and heavy occlusions among leaf instances. Furthermore, the typically small size of annotated leaf datasets makes it more difficult to learn the distinctive features needed for precise segmentation. We hypothesise that the key to overcoming these challenges lies in the specific spatial patterns of leaf distribution. In this paper, we propose the Guided Mask Transformer (GMT), which leverages and integrates leaf spatial distribution priors into a Transformer-based segmentor. These spatial priors are embedded in a set of guide functions that map leaves at different positions into a more separable embedding space. Our GMT consistently outperforms the state-of-the-art on three public plant datasets.
Related papers
- Leveraging 2D Information for Long-term Time Series Forecasting with Vanilla Transformers [55.475142494272724]
Time series prediction is crucial for understanding and forecasting complex dynamics in various domains.
We introduce GridTST, a model that combines the benefits of two approaches using innovative multi-directional attentions.
The model consistently delivers state-of-the-art performance across various real-world datasets.
arXiv Detail & Related papers (2024-05-22T16:41:21Z)
- Unsupervised Pre-Training for 3D Leaf Instance Segmentation [34.122575664767915]
This paper addresses the problem of reducing the labeling effort required to perform leaf instance segmentation on 3D point clouds.
We propose a novel self-supervised task-specific pre-training approach to initialize the backbone of a network for leaf instance segmentation.
We also introduce a novel automatic postprocessing that considers the difficulty of correctly segmenting the points close to the stem.
arXiv Detail & Related papers (2024-01-16T08:11:08Z)
- HGFormer: Hierarchical Grouping Transformer for Domain Generalized Semantic Segmentation [113.6560373226501]
This work studies semantic segmentation under the domain generalization setting.
We propose a novel hierarchical grouping transformer (HGFormer) to explicitly group pixels to form part-level masks and then whole-level masks.
Experiments show that HGFormer yields more robust semantic segmentation results than per-pixel classification methods and flat grouping transformers.
arXiv Detail & Related papers (2023-05-22T13:33:41Z)
- Hierarchical Approach for Joint Semantic, Plant Instance, and Leaf Instance Segmentation in the Agricultural Domain [29.647846446064992]
Plant phenotyping is a central task in agriculture, as it describes plants' growth stage, development, and other relevant quantities.
In this paper, we address the problem of joint semantic, plant instance, and leaf instance segmentation of crop fields from RGB data.
We propose a single convolutional neural network that addresses the three tasks simultaneously, exploiting their underlying hierarchical structure.
arXiv Detail & Related papers (2022-10-14T15:01:08Z)
- Statistical shape representations for temporal registration of plant components in 3D [5.349852254138086]
We demonstrate how using shape features improves temporal organ matching.
This is essential for robotic crop monitoring, which enables whole-of-lifecycle phenotyping.
arXiv Detail & Related papers (2022-09-23T11:11:10Z)
- CloudAttention: Efficient Multi-Scale Attention Scheme For 3D Point Cloud Learning [81.85951026033787]
In this work, we adopt transformers and incorporate them into a hierarchical framework for shape classification and for part and scene segmentation.
We also compute efficient and dynamic global cross attentions by leveraging sampling and grouping at each iteration.
The proposed hierarchical model achieves state-of-the-art shape classification in mean accuracy and yields results on par with the previous segmentation methods.
arXiv Detail & Related papers (2022-07-31T21:39:15Z)
- MISSU: 3D Medical Image Segmentation via Self-distilling TransUNet [55.16833099336073]
We propose to self-distill a Transformer-based UNet for medical image segmentation.
It simultaneously learns global semantic information and local spatial-detailed features.
Our MISSU achieves the best performance over previous state-of-the-art methods.
arXiv Detail & Related papers (2022-06-02T07:38:53Z)
- GrowliFlower: An image time series dataset for GROWth analysis of cauLIFLOWER [2.8247971782279615]
This article presents GrowliFlower, an image-based UAV time series dataset of two monitored cauliflower fields of size 0.39 and 0.60 ha acquired in 2020 and 2021.
The dataset contains RGB and multispectral orthophotos from which about 14,000 individual plant coordinates are derived and provided.
The dataset contains collected phenotypic traits of 740 plants, including the developmental stage as well as plant and cauliflower size.
arXiv Detail & Related papers (2022-04-01T08:56:59Z)
- CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image Segmentation [95.51455777713092]
Convolutional neural networks (CNNs) have been the de facto standard for 3D medical image segmentation.
We propose a novel framework that efficiently bridges a Convolutional neural network and a Transformer (CoTr) for accurate 3D medical image segmentation.
arXiv Detail & Related papers (2021-03-04T13:34:22Z)
- Unsupervised Domain Adaptation For Plant Organ Counting [12.424350934766704]
Counting plant organs for image-based plant phenotyping falls within this category.
In this paper, we propose a domain-adversarial learning approach for domain adaptation of density map estimation.
arXiv Detail & Related papers (2020-09-02T13:57:09Z)
- Spatial Pyramid Based Graph Reasoning for Semantic Segmentation [67.47159595239798]
We apply graph convolution to the semantic segmentation task and propose an improved Laplacian.
The graph reasoning is directly performed in the original feature space organized as a spatial pyramid.
We achieve comparable performance with advantages in computational and memory overhead.
arXiv Detail & Related papers (2020-03-23T12:28:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.