Related papers: Zero-shot Hierarchical Plant Segmentation via Foundation Segmentation Models and Text-to-image Attention

URL: http://arxiv.org/abs/2509.09116v2
Date: Tue, 16 Sep 2025 10:47:41 GMT
Title: Zero-shot Hierarchical Plant Segmentation via Foundation Segmentation Models and Text-to-image Attention
Authors: Junhao Xing, Ryohei Miyakawa, Yang Yang, Xinpeng Liu, Risa Shinoda, Hiroaki Santo, Yosuke Toda, Fumio Okura
Abstract summary: Foundation segmentation models achieve reasonable leaf instance extraction from top-view crop images without training. We introduce ZeroPlantSeg, a zero-shot segmentation for rosette-shaped plant individuals from top-view images.
Score: 19.2882360692347
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Foundation segmentation models achieve reasonable leaf instance extraction from top-view crop images without training (i.e., zero-shot). However, segmenting entire plant individuals with each consisting of multiple overlapping leaves remains challenging. This problem is referred to as a hierarchical segmentation task, typically requiring annotated training datasets, which are often species-specific and require notable human labor. To address this, we introduce ZeroPlantSeg, a zero-shot segmentation for rosette-shaped plant individuals from top-view images. We integrate a foundation segmentation model, extracting leaf instances, and a vision-language model, reasoning about plants' structures to extract plant individuals without additional training. Evaluations on datasets with multiple plant species, growth stages, and shooting environments demonstrate that our method surpasses existing zero-shot methods and achieves better cross-domain performance than supervised methods. Implementations are available at https://github.com/JunhaoXing/ZeroPlantSeg.

Related papers

Unlocking Zero-Shot Plant Segmentation with Pl@ntNet Intelligence [3.7603674895765766]
We present a zero-shot segmentation approach for agricultural imagery. Our method exploits Plantnet's specialized plant representations to identify plant regions. We show consistent performance gains when using Plantnet-fine-tuned DinoV2 over the base DinoV2 model.
arXiv Detail & Related papers (2025-10-14T14:38:32Z)
UnSeg: One Universal Unlearnable Example Generator is Enough against All Image Segmentation [64.01742988773745]
An increasing privacy concern exists regarding training large-scale image segmentation models on unauthorized private data. We exploit the concept of unlearnable examples to make images unusable to model training by generating and adding unlearnable noise into the original images. We empirically verify the effectiveness of UnSeg across 6 mainstream image segmentation tasks, 10 widely used datasets, and 7 different network architectures.
arXiv Detail & Related papers (2024-10-13T16:34:46Z)
GMT: Guided Mask Transformer for Leaf Instance Segmentation [14.458970589296554]
Leaf instance segmentation is a challenging task, aiming to separate and delineate each leaf in an image of a plant. We propose the Guided Mask Transformer (GMT), which leverages and integrates leaf spatial distribution priors into a Transformer-based segmentor. Our GMT consistently outperforms the state-of-the-art on three public plant datasets.
arXiv Detail & Related papers (2024-06-24T19:52:27Z)
Diffusion-based Data Augmentation for Nuclei Image Segmentation [68.28350341833526]
We introduce the first diffusion-based augmentation method for nuclei segmentation. The idea is to synthesize a large number of labeled images to facilitate training the segmentation model. The experimental results show that by augmenting 10% labeled real dataset with synthetic samples, one can achieve comparable segmentation results.
arXiv Detail & Related papers (2023-10-22T06:16:16Z)
Improving Data Efficiency for Plant Cover Prediction with Label Interpolation and Monte-Carlo Cropping [7.993547048820065]
The plant community composition is an essential indicator of environmental changes and is usually analyzed in ecological field studies. We introduce an approach to interpolate the sparse labels in the collected vegetation plot time series down to the intermediate dense and unlabeled images. We also introduce a new method we call Monte-Carlo Cropping to deal with high-resolution images efficiently.
arXiv Detail & Related papers (2023-07-17T15:17:39Z)
Which Pixel to Annotate: a Label-Efficient Nuclei Segmentation Framework [70.18084425770091]
Deep neural networks have been widely applied in nuclei instance segmentation of H&E stained pathology images. It is inefficient and unnecessary to label all pixels for a dataset of nuclei images which usually contain similar and redundant patterns. We propose a novel full nuclei segmentation framework that chooses only a few image patches to be annotated, augments the training set from the selected samples, and achieves nuclei segmentation in a semi-supervised manner.
arXiv Detail & Related papers (2022-12-20T14:53:26Z)
Self-Supervised Leaf Segmentation under Complex Lighting Conditions [14.290827361756108]
Leaf segmentation is an essential prerequisite task in image-based plant phenotyping. We present a self-supervised leaf segmentation framework consisting of a self-supervised semantic segmentation model, a color-based leaf segmentation algorithm, and a self-supervised color correction model. Experimental results on datasets of different plant species demonstrate the potential of the proposed self-supervised framework.
arXiv Detail & Related papers (2022-03-29T22:59:02Z)
SCNet: Enhancing Few-Shot Semantic Segmentation by Self-Contrastive Background Prototypes [56.387647750094466]
Few-shot semantic segmentation aims to segment novel-class objects in a query image with only a few annotated examples. Most advanced solutions exploit a metric learning framework that performs segmentation by matching each pixel to a learned foreground prototype. This framework suffers from biased classification due to incomplete construction of sample pairs with the foreground prototype only.
arXiv Detail & Related papers (2021-04-19T11:21:47Z)
Group-Wise Semantic Mining for Weakly Supervised Semantic Segmentation [49.90178055521207]
This work addresses weakly supervised semantic segmentation (WSSS), with the goal of bridging the gap between image-level annotations and pixel-level segmentation. We formulate WSSS as a novel group-wise learning task that explicitly models semantic dependencies in a group of images to estimate more reliable pseudo ground-truths. In particular, we devise a graph neural network (GNN) for group-wise semantic mining, wherein input images are represented as graph nodes.
arXiv Detail & Related papers (2020-12-09T12:40:13Z)
Unsupervised Domain Adaptation For Plant Organ Counting [12.424350934766704]
Counting plant organs for image-based plant phenotyping falls within this category. In this paper, we propose a domain-adversarial learning approach for domain adaptation of density map estimation.
arXiv Detail & Related papers (2020-09-02T13:57:09Z)
Two-View Fine-grained Classification of Plant Species [66.75915278733197]
We propose a novel method based on a two-view leaf image representation and a hierarchical classification strategy for fine-grained recognition of plant species. A deep metric based on Siamese convolutional neural networks is used to reduce the dependence on a large number of training samples and make the method scalable to new plant species.
arXiv Detail & Related papers (2020-05-18T21:57:47Z)

This list is automatically generated from the titles and abstracts of the papers on this site.