Related papers: EnGraf-Net: Multiple Granularity Branch Network with Fine-Coarse Graft Grained for Classification Task

URL: http://arxiv.org/abs/2509.21061v1
Date: Thu, 25 Sep 2025 12:11:42 GMT
Title: EnGraf-Net: Multiple Granularity Branch Network with Fine-Coarse Graft Grained for Classification Task
Authors: Riccardo La Grassa, Ignazio Gallo, Nicola Landro
Abstract summary: Fine-grained classification models are designed to focus on the relevant details necessary to distinguish highly similar classes. Part-based approaches, including automatic cropping methods, suffer from an incomplete representation of local features. We leverage semantic associations structured as a hierarchy (taxonomy) as supervised signals within an end-to-end deep neural network model, termed EnGraf-Net.
Score: 0.8299692647308321
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Fine-grained classification models are designed to focus on the relevant details necessary to distinguish highly similar classes, particularly when intra-class variance is high and inter-class variance is low. Most existing models rely on part annotations such as bounding boxes, part locations, or textual attributes to enhance classification performance, while others employ sophisticated techniques to automatically extract attention maps. We posit that part-based approaches, including automatic cropping methods, suffer from an incomplete representation of local features, which are fundamental for distinguishing similar objects. While fine-grained classification aims to recognize the leaves of a hierarchical structure, humans recognize objects by also forming semantic associations. In this paper, we leverage semantic associations structured as a hierarchy (taxonomy) as supervised signals within an end-to-end deep neural network model, termed EnGraf-Net. Extensive experiments on three well-known datasets CIFAR-100, CUB-200-2011, and FGVC-Aircraft demonstrate the superiority of EnGraf-Net over many existing fine-grained models, showing competitive performance with the most recent state-of-the-art approaches, without requiring cropping techniques or manual annotations.

Related papers

FGDCC: Fine-Grained Deep Cluster Categorization -- A Framework for Intra-Class Variability Problems in Plant Classification [0.6445605125467574]
This paper proposes a novel method that aims at leveraging classification performance in Fine-Grained Visual Categorization tasks. Our goal is to apply clustering over each class individually, which allows discovering pseudo-labels that encode a latent degree of similarity between images. Our method still achieves state-of-the-art performance on the PlantNet300k dataset even though some of its components haven't been shown to be fully optimized.
arXiv Detail & Related papers (2025-12-23T01:14:06Z)
Hierarchical Representation Matching for CLIP-based Class-Incremental Learning [80.2317078787969]
Class-Incremental Learning (CIL) aims to endow models with the ability to continuously adapt to evolving data streams. Recent advances in pre-trained vision-language models (e.g., CLIP) provide a powerful foundation for this task. We introduce HiErarchical Representation MAtchiNg (HERMAN) for CLIP-based CIL.
arXiv Detail & Related papers (2025-09-26T17:59:51Z)
Just Say the Word: Annotation-Free Fine-Grained Object Counting [22.31750687552324]
Fine-grained object counting remains a major challenge for class-agnostic counting models. We propose an alternative paradigm: Given a category name, tune a compact concept embedding from the prompt using synthetic images and pseudo-labels. This embedding conditions a specialization module that refines raw overcounts from any frozen counter into accurate, category-specific estimates.
arXiv Detail & Related papers (2025-04-16T02:05:47Z)
Generalized Category Discovery with Clustering Assignment Consistency [56.92546133591019]
Generalized category discovery (GCD) is a recently proposed open-world task. We propose a co-training-based framework that encourages clustering consistency. Our method achieves state-of-the-art performance on three generic benchmarks and three fine-grained visual recognition datasets.
arXiv Detail & Related papers (2023-10-30T00:32:47Z)
Fine-Grained Visual Classification using Self Assessment Classifier [12.596520707449027]
Extracting discriminative features plays a crucial role in the fine-grained visual classification task. In this paper, we introduce a Self Assessment Classifier, which simultaneously leverages the representation of the image and top-k prediction classes. We show that our method achieves new state-of-the-art results on CUB200-2011, Stanford Dog, and FGVC Aircraft datasets.
arXiv Detail & Related papers (2022-05-21T07:41:27Z)
Semantic Representation and Dependency Learning for Multi-Label Image Recognition [76.52120002993728]
We propose a novel and effective semantic representation and dependency learning (SRDL) framework to learn category-specific semantic representation for each category. Specifically, we design a category-specific attentional regions (CAR) module to generate channel/spatial-wise attention matrices to guide the model. We also design an object erasing (OE) module to implicitly learn semantic dependency among categories by erasing semantic-aware regions.
arXiv Detail & Related papers (2022-04-08T00:55:15Z)
SCARF: Self-Supervised Contrastive Learning using Random Feature Corruption [72.35532598131176]
We propose SCARF, a technique for contrastive learning, where views are formed by corrupting a random subset of features. We show that SCARF complements existing strategies and outperforms alternatives like autoencoders.
arXiv Detail & Related papers (2021-06-29T08:08:33Z)
GAN for Vision, KG for Relation: a Two-stage Deep Network for Zero-shot Action Recognition [33.23662792742078]
We propose a two-stage deep neural network for zero-shot action recognition. In the sampling stage, we utilize a generative adversarial network (GAN) trained on action features and word vectors of seen classes. In the classification stage, we construct a knowledge graph based on the relationship between word vectors of action classes and related objects.
arXiv Detail & Related papers (2021-05-25T09:34:42Z)
Attribute Propagation Network for Graph Zero-shot Learning [57.68486382473194]
We introduce the attribute propagation network (APNet), which is composed of 1) a graph propagation model generating an attribute vector for each class and 2) a parameterized nearest neighbor (NN) classifier. APNet achieves either compelling performance or new state-of-the-art results in experiments with two zero-shot learning settings and five benchmark datasets.
arXiv Detail & Related papers (2020-09-24T16:53:40Z)
Fine-Grained Visual Classification with Efficient End-to-end Localization [49.9887676289364]
We present an efficient localization module that can be fused with a classification network in an end-to-end setup. We evaluate the new model on the three benchmark datasets CUB200-2011, Stanford Cars and FGVC-Aircraft.
arXiv Detail & Related papers (2020-05-11T14:07:06Z)
Group Based Deep Shared Feature Learning for Fine-grained Image Classification [31.84610555517329]
We present a new deep network architecture that explicitly models shared features and removes their effect to achieve enhanced classification results. We call this framework Group based deep Shared Feature Learning (GSFL) and the resulting learned network as GSFL-Net. A key benefit of our specialized autoencoder is that it is versatile and can be combined with state-of-the-art fine-grained feature extraction models and trained together with them to improve their performance directly.
arXiv Detail & Related papers (2020-04-04T00:01:11Z)
Weakly Supervised Attention Pyramid Convolutional Neural Network for Fine-Grained Visual Classification [71.96618723152487]
We introduce Attention Pyramid Convolutional Neural Network (AP-CNN). AP-CNN learns both high-level semantic and low-level detailed feature representations. It can be trained end-to-end, without the need for additional bounding box/part annotations.
arXiv Detail & Related papers (2020-02-09T12:33:23Z)

This list is automatically generated from the titles and abstracts of the papers on this site.