GMT: Guided Mask Transformer for Leaf Instance Segmentation
- URL: http://arxiv.org/abs/2406.17109v1
- Date: Mon, 24 Jun 2024 19:52:27 GMT
- Title: GMT: Guided Mask Transformer for Leaf Instance Segmentation
- Authors: Feng Chen, Sotirios A. Tsaftaris, Mario Valerio Giuffrida
- Abstract summary: Leaf instance segmentation is a challenging task, aiming to separate and delineate each leaf in an image of a plant.
We propose Guided Mask Transformer (GMT), which contains three key components, namely Guided Positional Encoding (GPE), Guided Embedding Fusion Module (GEFM) and Guided Dynamic Positional Queries (GDPQ).
The proposed GMT consistently outperforms State-of-the-Art models on three public plant datasets.
- Score: 14.458970589296554
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Leaf instance segmentation is a challenging multi-instance segmentation task, aiming to separate and delineate each leaf in an image of a plant. The delineation of each leaf is a necessary prerequisite for several biology-related applications, such as fine-grained monitoring of plant growth and crop yield estimation. The task is challenging because the self-similarity of instances is high (similar shape and colour) and instances vary greatly in size under heavy occlusion. We believe that the key to overcoming these challenges lies in the specific spatial patterns of leaf distribution. For example, leaves typically grow around the plant's center, with smaller leaves clustering and overlapping near this central point. In this paper, we propose a novel approach named Guided Mask Transformer (GMT), which contains three key components, namely Guided Positional Encoding (GPE), Guided Embedding Fusion Module (GEFM) and Guided Dynamic Positional Queries (GDPQ), to extend the meta-architecture of Mask2Former and incorporate a set of harmonic guide functions. These guide functions are tailored to the pixel positions of instances and trained to separate distinct instances in an embedding space. The proposed GMT consistently outperforms state-of-the-art models on three public plant datasets.
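The abstract describes harmonic guide functions that depend on pixel position and separate instances in an embedding space. The following is a minimal sketch of that idea only, not the authors' implementation: it assumes each guide function is a sinusoid of normalised pixel coordinates, uses fixed hypothetical frequencies and phases (in the paper they are trained), and takes the per-instance mean of the guide values as a stand-in embedding. The names `harmonic_guides` and `instance_embedding` are illustrative and do not come from the paper.

```python
import numpy as np


def harmonic_guides(height, width, freqs, phases):
    """Evaluate sin(fx * x + fy * y + phase) on a normalised pixel grid.

    freqs:  array-like of shape (K, 2), one (fx, fy) pair per guide function.
    phases: array-like of shape (K,).
    Returns an array of shape (K, height, width).
    Assumption: the sinusoidal form and fixed parameters are illustrative only.
    """
    ys, xs = np.meshgrid(
        np.linspace(0.0, 1.0, height),
        np.linspace(0.0, 1.0, width),
        indexing="ij",
    )
    freqs = np.asarray(freqs, dtype=float)
    phases = np.asarray(phases, dtype=float)
    arg = (
        freqs[:, 0, None, None] * xs
        + freqs[:, 1, None, None] * ys
        + phases[:, None, None]
    )
    return np.sin(arg)


def instance_embedding(guides, mask):
    """Average every guide function over the pixels of one boolean instance mask."""
    return guides[:, mask].mean(axis=1)


# Hypothetical parameters; a trained model would learn these.
guides = harmonic_guides(
    32, 32,
    freqs=[[3.0, 0.0], [0.0, 3.0], [5.0, 2.0]],
    phases=[0.0, 0.5, 1.0],
)

# Two toy "leaves" with identical shape at different image positions.
left = np.zeros((32, 32), dtype=bool)
left[8:16, 2:8] = True
right = np.zeros((32, 32), dtype=bool)
right[8:16, 24:30] = True

distance = np.linalg.norm(
    instance_embedding(guides, left) - instance_embedding(guides, right)
)
print("embedding distance between the two toy instances:", distance)
```

Because the guide values depend only on position, two instances with the same shape and colour still receive different embeddings when they sit at different locations, which is the property the abstract relies on for highly self-similar leaves.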
Related papers
- Comprehensive Generative Replay for Task-Incremental Segmentation with Concurrent Appearance and Semantic Forgetting [49.87694319431288]
Generalist segmentation models are increasingly favored for diverse tasks involving various objects from different image sources.
We propose a Comprehensive Generative Replay (CGR) framework that restores appearance and semantic knowledge by synthesizing image-mask pairs.
Experiments on incremental tasks (cardiac, fundus and prostate segmentation) show its clear advantage for alleviating concurrent appearance and semantic forgetting.
arXiv Detail & Related papers (2024-06-28T10:05:58Z) - Unsupervised Pre-Training for 3D Leaf Instance Segmentation [34.122575664767915]
This paper addresses the problem of reducing the labeling effort required to perform leaf instance segmentation on 3D point clouds.
We propose a novel self-supervised task-specific pre-training approach to initialize the backbone of a network for leaf instance segmentation.
We also introduce a novel automatic postprocessing that considers the difficulty of correctly segmenting the points close to the stem.
arXiv Detail & Related papers (2024-01-16T08:11:08Z) - ComPtr: Towards Diverse Bi-source Dense Prediction Tasks via A Simple yet General Complementary Transformer [91.43066633305662]
We propose a novel ComPlementary transformer, ComPtr, for diverse bi-source dense prediction tasks.
ComPtr treats different inputs equally and builds an efficient dense interaction model in the form of sequence-to-sequence on top of the transformer.
arXiv Detail & Related papers (2023-07-23T15:17:45Z) - Position-Guided Point Cloud Panoptic Segmentation Transformer [118.17651196656178]
This work begins by applying this appealing paradigm to LiDAR-based point cloud segmentation and obtains a simple yet effective baseline.
We observe that instances in sparse point clouds are small relative to the whole scene and often have similar geometry but lack distinctive appearance for segmentation, which is rare in the image domain.
The method, named Position-guided Point cloud Panoptic segmentation transFormer (P3Former), outperforms previous state-of-the-art methods by 3.4% and 1.2% on the SemanticKITTI and nuScenes benchmarks, respectively.
arXiv Detail & Related papers (2023-03-23T17:59:02Z) - Hierarchical Approach for Joint Semantic, Plant Instance, and Leaf Instance Segmentation in the Agricultural Domain [29.647846446064992]
Plant phenotyping is a central task in agriculture, as it describes plants' growth stage, development, and other relevant quantities.
In this paper, we address the problem of joint semantic, plant instance, and leaf instance segmentation of crop fields from RGB data.
We propose a single convolutional neural network that addresses the three tasks simultaneously, exploiting their underlying hierarchical structure.
arXiv Detail & Related papers (2022-10-14T15:01:08Z) - MulT: An End-to-End Multitask Learning Transformer [66.52419626048115]
We propose an end-to-end Multitask Learning Transformer framework, named MulT, to simultaneously learn multiple high-level vision tasks.
Our framework encodes the input image into a shared representation and makes predictions for each vision task using task-specific transformer-based decoder heads.
arXiv Detail & Related papers (2022-05-17T13:03:18Z) - Decoupled Multi-task Learning with Cyclical Self-Regulation for Face Parsing [71.19528222206088]
We propose a novel Decoupled Multi-task Learning with Cyclical Self-Regulation for face parsing.
Specifically, DML-CSR designs a multi-task model which comprises face parsing, binary edge, and category edge detection.
Our method achieves new state-of-the-art performance on the Helen, CelebAMask-HQ, and LaPa datasets.
arXiv Detail & Related papers (2022-03-28T02:12:30Z) - LeafMask: Towards Greater Accuracy on Leaf Segmentation [1.0499611180329804]
LeafMask is a new end-to-end model to delineate each leaf region and count the number of leaves.
Our proposed model achieves a BestDice score of 90.09%, outperforming other state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-08T04:57:18Z) - Semantic Attention and Scale Complementary Network for Instance Segmentation in Remote Sensing Images [54.08240004593062]
We propose an end-to-end multi-category instance segmentation model, which consists of a Semantic Attention (SEA) module and a Scale Complementary Mask Branch (SCMB).
The SEA module contains a simple fully convolutional semantic segmentation branch with extra supervision to strengthen the activation of instances of interest on the feature map.
SCMB extends the original single mask branch to trident mask branches and introduces complementary mask supervision at different scales.
arXiv Detail & Related papers (2021-07-25T08:53:59Z) - RDCNet: Instance segmentation with a minimalist recurrent residual network [0.14999444543328289]
We propose a minimalist recurrent network called recurrent dilated convolutional network (RDCNet).
RDCNet consists of a shared stacked dilated convolution (sSDC) layer that iteratively refines its output and thereby generates interpretable intermediate predictions.
We demonstrate its versatility on 3 tasks with different imaging modalities: nuclear segmentation of H&E slides, of 3D anisotropic stacks from light-sheet fluorescence microscopy and leaf segmentation of top-view images of plants.
arXiv Detail & Related papers (2020-10-02T13:36:45Z)