Zero-Shot Segmentation through Prototype-Guidance for Multi-Label Plant Species Identification
- URL: http://arxiv.org/abs/2512.19957v1
- Date: Tue, 23 Dec 2025 01:06:55 GMT
- Title: Zero-Shot Segmentation through Prototype-Guidance for Multi-Label Plant Species Identification
- Authors: Luciano Araujo Dourado Filho, Almir Moreira da Silva Neto, Rodrigo Pereira David, Rodrigo Tripodi Calumby
- Abstract summary: This paper presents an approach developed to address the PlantCLEF 2025 challenge, which consists of fine-grained multi-label species identification. Our solution focused on employing class prototypes obtained from the training dataset as proxy guidance for training a segmentation Vision Transformer (ViT) on the test set images. The proposed approach enabled domain adaptation from multi-class identification of individual species to multi-label classification of high-resolution vegetation plots.
- Score: 0.5249805590164902
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents an approach developed to address the PlantCLEF 2025 challenge, which consists of fine-grained multi-label species identification over high-resolution images. Our solution focused on employing class prototypes obtained from the training dataset as proxy guidance for training a segmentation Vision Transformer (ViT) on the test set images. To obtain these representations, the proposed method extracts features from the training dataset images and clusters them with K-Means, with $K$ equal to the number of classes in the dataset. The segmentation model is a customized narrow ViT, built by replacing the patch embedding layer with a frozen DINOv2 backbone pre-trained on the training dataset for individual species classification. This model is trained to reconstruct the class prototypes of the training dataset from the test dataset images. We then use this model to obtain attention scores that identify and localize areas of interest and consequently guide the classification process. The proposed approach enabled domain adaptation from multi-class identification of individual species to multi-label classification of high-resolution vegetation plots. Our method achieved fifth place in the PlantCLEF 2025 challenge on the private leaderboard, with an F1 score of 0.33331. In absolute terms, our method scored only 0.03 lower than the top-performing submission, suggesting that it achieved competitive performance on the benchmark task. Our code is available at \href{https://github.com/ADAM-UEFS/PlantCLEF2025}{https://github.com/ADAM-UEFS/PlantCLEF2025}.
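The prototype-extraction step described in the abstract can be sketched as follows. This is a minimal NumPy stand-in, not the authors' implementation: random vectors replace the DINOv2 features, a plain Lloyd-style K-Means replaces whatever initialization the paper uses, and all names (`class_prototypes`, the toy dimensions) are illustrative.

```python
import numpy as np

def class_prototypes(features: np.ndarray, n_classes: int,
                     n_iter: int = 20, seed: int = 0) -> np.ndarray:
    """Run K-Means over training-image features with K equal to the
    number of classes; each centroid serves as one class prototype."""
    rng = np.random.default_rng(seed)
    # Initialize centroids from randomly chosen feature vectors.
    centroids = features[rng.choice(len(features), n_classes, replace=False)]
    for _ in range(n_iter):
        # Assign each feature vector to its nearest centroid.
        dists = np.linalg.norm(features[:, None] - centroids[None], axis=-1)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as its cluster mean (skip empty clusters).
        for k in range(n_classes):
            members = features[labels == k]
            if len(members):
                centroids[k] = members.mean(axis=0)
    return centroids

# Toy usage: 200 fake 64-dimensional "backbone" features, 5 species.
feats = np.random.default_rng(1).normal(size=(200, 64))
protos = class_prototypes(feats, n_classes=5)
```

In the paper's pipeline these prototypes are then the reconstruction targets for the narrow segmentation ViT trained on the test-set images.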
Related papers
- Transfer Learning and Mixup for Fine-Grained Few-Shot Fungi Classification [0.0]
This paper presents our approach for the FungiCLEF 2025 competition. It focuses on few-shot fine-grained visual categorization using the FungiTastic Few-Shot dataset.
arXiv Detail & Related papers (2025-07-11T01:21:21Z)
- Multi-Label Plant Species Classification with Self-Supervised Vision Transformers [0.0]
We present a transfer learning approach using a self-supervised Vision Transformer (DINOv2) for the PlantCLEF 2024 competition.
To address the computational challenges of the large-scale dataset, we employ Spark for distributed data processing.
Our results demonstrate the efficacy of combining transfer learning with advanced data processing techniques for multi-label image classification tasks.
arXiv Detail & Related papers (2024-07-08T18:07:33Z)
- Cross-Level Distillation and Feature Denoising for Cross-Domain Few-Shot Classification [49.36348058247138]
We tackle the problem of cross-domain few-shot classification by making a small proportion of unlabeled images in the target domain accessible in the training stage.
We meticulously design a cross-level knowledge distillation method, which can strengthen the ability of the model to extract more discriminative features in the target dataset.
Our approach can surpass the previous state-of-the-art method, Dynamic-Distillation, by 5.44% on 1-shot and 1.37% on 5-shot classification tasks.
arXiv Detail & Related papers (2023-11-04T12:28:04Z)
- Generalized Category Discovery with Clustering Assignment Consistency [56.92546133591019]
Generalized category discovery (GCD) is a recently proposed open-world task.
We propose a co-training-based framework that encourages clustering consistency.
Our method achieves state-of-the-art performance on three generic benchmarks and three fine-grained visual recognition datasets.
arXiv Detail & Related papers (2023-10-30T00:32:47Z)
- Enhancing Self-Supervised Learning for Remote Sensing with Elevation Data: A Case Study with Scarce And High Level Semantic Labels [1.534667887016089]
This work proposes a hybrid unsupervised and supervised learning method to pre-train models applied in Earth observation downstream tasks.
We combine a contrastive approach to pre-train models with a pixel-wise regression pre-text task to predict coarse elevation maps.
arXiv Detail & Related papers (2023-04-13T23:01:11Z)
- High-fidelity Pseudo-labels for Boosting Weakly-Supervised Segmentation [17.804090651425955]
Image-level weakly-supervised segmentation (WSSS) reduces the usually vast data annotation cost by using surrogate segmentation masks during training.
Our work is based on two techniques for improving CAMs; importance sampling, which is a substitute for GAP, and the feature similarity loss.
We reformulate both techniques based on binomial posteriors of multiple independent binary problems.
This has two benefits; their performance is improved and they become more general, resulting in an add-on method that can boost virtually any WSSS method.
arXiv Detail & Related papers (2023-04-05T17:43:57Z)
- CAFS: Class Adaptive Framework for Semi-Supervised Semantic Segmentation [5.484296906525601]
Semi-supervised semantic segmentation learns a model for classifying pixels into specific classes using a few labeled samples and numerous unlabeled images.
We propose a class-adaptive semi-supervision framework for semi-supervised semantic segmentation (CAFS).
CAFS constructs a validation set on a labeled dataset, to leverage the calibration performance for each class.
arXiv Detail & Related papers (2023-03-21T05:56:53Z)
- Cooperative Self-Training for Multi-Target Adaptive Semantic Segmentation [26.79776306494929]
We propose a self-training strategy that employs pseudo-labels to induce cooperation among multiple domain-specific classifiers.
We employ feature stylization as an efficient way to generate image views that form an integral part of self-training.
arXiv Detail & Related papers (2022-10-04T13:03:17Z)
- Novel Class Discovery in Semantic Segmentation [104.30729847367104]
We introduce a new setting of Novel Class Discovery in Semantic Segmentation (NCDSS).
It aims at segmenting unlabeled images containing new classes given prior knowledge from a labeled set of disjoint classes.
In NCDSS, we need to distinguish the objects and background, and to handle the existence of multiple classes within an image.
We propose the Entropy-based Uncertainty Modeling and Self-training (EUMS) framework to overcome noisy pseudo-labels.
arXiv Detail & Related papers (2021-12-03T13:31:59Z)
- No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data [78.69828864672978]
A central challenge in training classification models in the real-world federated system is learning with non-IID data.
We propose a novel and simple algorithm called Classifier Calibration with Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated Gaussian mixture model.
Experimental results demonstrate that CCVR achieves state-of-the-art performance on popular federated learning benchmarks including CIFAR-10, CIFAR-100, and CINIC-10.
arXiv Detail & Related papers (2021-06-09T12:02:29Z)
- Multi-dataset Pretraining: A Unified Model for Semantic Segmentation [97.61605021985062]
We propose a unified framework, termed as Multi-Dataset Pretraining, to take full advantage of the fragmented annotations of different datasets.
This is achieved by first pretraining the network via the proposed pixel-to-prototype contrastive loss over multiple datasets.
In order to better model the relationship among images and classes from different datasets, we extend the pixel level embeddings via cross dataset mixing.
arXiv Detail & Related papers (2021-06-08T06:13:11Z)
- Semi-Supervised Few-Shot Classification with Deep Invertible Hybrid Models [4.189643331553922]
We propose a deep invertible hybrid model which integrates discriminative and generative learning at a latent space level for semi-supervised few-shot classification.
Our main originality lies in our integration of these components at a latent space level, which is effective in preventing overfitting.
arXiv Detail & Related papers (2021-05-22T05:55:16Z)
- Two-View Fine-grained Classification of Plant Species [66.75915278733197]
We propose a novel method based on a two-view leaf image representation and a hierarchical classification strategy for fine-grained recognition of plant species.
A deep metric based on Siamese convolutional neural networks is used to reduce the dependence on a large number of training samples and make the method scalable to new plant species.
arXiv Detail & Related papers (2020-05-18T21:57:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.