Related papers: Tree Structure-Aware Few-Shot Image Classification via Hierarchical Aggregation

URL: http://arxiv.org/abs/2207.06989v1
Date: Thu, 14 Jul 2022 15:17:19 GMT
Title: Tree Structure-Aware Few-Shot Image Classification via Hierarchical Aggregation
Authors: Min Zhang and Siteng Huang and Wenbin Li and Donglin Wang
Abstract summary: We focus on how to learn additional feature representations for few-shot image classification through pretext tasks. This additional knowledge can further improve the performance of few-shot learning. We present a plug-in Hierarchical Tree Structure-aware (HTS) method, which learns the relationship of FSL and pretext tasks.
Score: 27.868736254566397
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this paper, we mainly focus on the problem of how to learn additional feature representations for few-shot image classification through pretext tasks (e.g., rotation or color permutation). This additional knowledge generated by pretext tasks can further improve the performance of few-shot learning (FSL) as it differs from human-annotated supervision (i.e., class labels of FSL tasks). To solve this problem, we present a plug-in Hierarchical Tree Structure-aware (HTS) method, which not only learns the relationship of FSL and pretext tasks, but more importantly, can adaptively select and aggregate feature representations generated by pretext tasks to maximize the performance of FSL tasks. A hierarchical tree constructing component and a gated selection aggregating component are introduced to construct the tree structure and find richer transferable knowledge that can rapidly adapt to novel classes with a few labeled images. Extensive experiments show that our HTS can significantly enhance multiple few-shot methods to achieve new state-of-the-art performance on four benchmark datasets. The code is available at: https://github.com/remiMZ/HTS-ECCV22.

Related papers

Learning and Evaluating Hierarchical Feature Representations [3.770103075126785]
We propose a novel framework, Hierarchical Composition of Orthogonal Subspaces (Hier-COS). Hier-COS learns to map deep feature embeddings into a vector space that is, by design, consistent with the structure of a given taxonomy tree. We demonstrate that Hier-COS achieves state-of-the-art hierarchical performance across all the datasets while simultaneously beating top-1 accuracy in all but one case.
arXiv Detail & Related papers (2025-03-10T20:59:41Z)
Learning Visual Hierarchies with Hyperbolic Embeddings [28.35250955426006]
We introduce a learning paradigm that can encode user-defined multi-level visual hierarchies in hyperbolic space without requiring explicit hierarchical labels. We show significant improvements in hierarchical retrieval tasks, demonstrating the capability of our model in capturing visual hierarchies.
arXiv Detail & Related papers (2024-11-26T14:58:06Z)
SAN: Structure-Aware Network for Complex and Long-tailed Chinese Text Recognition [9.190324058948987]
We propose a structure-aware network utilizing the hierarchical composition information to improve the recognition performance of complex characters. Experiments demonstrate that the proposed approach can significantly improve the performances of complex characters and tail characters, yielding a better overall performance.
arXiv Detail & Related papers (2024-11-10T07:41:00Z)
HGCLIP: Exploring Vision-Language Models with Graph Representations for Hierarchical Understanding [18.95003393925676]
When classifying categories at different hierarchy levels, traditional uni-modal approaches focus primarily on image features, revealing limitations in complex scenarios. Recent studies integrating Vision-Language Models (VLMs) with class hierarchies have shown promise, yet they fall short of fully exploiting the hierarchical relationships. We propose a novel framework that effectively combines CLIP with a deeper exploitation of the Hierarchical class structure via Graph representation learning.
arXiv Detail & Related papers (2023-11-23T15:42:42Z)
Many or Few Samples? Comparing Transfer, Contrastive and Meta-Learning in Encrypted Traffic Classification [68.19713459228369]
We compare transfer learning, meta-learning and contrastive learning against reference Machine Learning (ML) tree-based and monolithic DL models. We show that (i) using large datasets we can obtain more general representations, and (ii) contrastive learning is the best methodology. While tree-based ML models cannot handle large tasks but fit small tasks well, DL methods, by reusing learned representations, reach the performance of tree-based models even on small tasks.
arXiv Detail & Related papers (2023-05-21T11:20:49Z)
Deep Hierarchical Semantic Segmentation [76.40565872257709]
Hierarchical semantic segmentation (HSS) aims at structured, pixel-wise description of visual observation in terms of a class hierarchy. HSSN casts HSS as a pixel-wise multi-label classification task, only bringing minimal architecture change to current segmentation models. With hierarchy-induced margin constraints, HSSN reshapes the pixel embedding space, so as to generate well-structured pixel representations.
arXiv Detail & Related papers (2022-03-27T15:47:44Z)
Multi-level Second-order Few-shot Learning [111.0648869396828]
We propose a Multi-level Second-order (MlSo) few-shot learning network for supervised or unsupervised few-shot image classification and few-shot action recognition. We leverage so-called power-normalized second-order base learner streams combined with features that express multiple levels of visual abstraction. We demonstrate respectable results on standard datasets such as Omniglot, mini-ImageNet, tiered-ImageNet, Open MIC, fine-grained datasets such as CUB Birds, Stanford Dogs and Cars, and action recognition datasets such as HMDB51, UCF101, and mini-MIT.
arXiv Detail & Related papers (2022-01-15T19:49:00Z)
Attribute Propagation Network for Graph Zero-shot Learning [57.68486382473194]
We introduce the attribute propagation network (APNet), which is composed of 1) a graph propagation model generating an attribute vector for each class and 2) a parameterized nearest neighbor (NN) classifier. APNet achieves either compelling performance or new state-of-the-art results in experiments with two zero-shot learning settings and five benchmark datasets.
arXiv Detail & Related papers (2020-09-24T16:53:40Z)
HOSE-Net: Higher Order Structure Embedded Network for Scene Graph Generation [20.148175528691905]
This paper presents a novel structure-aware embedding-to-classifier (SEC) module to incorporate both local and global structural information of relationships into the output space. We also propose a hierarchical semantic aggregation (HSA) module to reduce the number of subspaces by introducing higher order structural information. The proposed HOSE-Net achieves state-of-the-art performance on two popular benchmarks, Visual Genome and VRD.
arXiv Detail & Related papers (2020-08-12T07:58:13Z)
Group Based Deep Shared Feature Learning for Fine-grained Image Classification [31.84610555517329]
We present a new deep network architecture that explicitly models shared features and removes their effect to achieve enhanced classification results. We call this framework Group based deep Shared Feature Learning (GSFL) and the resulting learned network GSFL-Net. A key benefit of our specialized autoencoder is that it is versatile and can be combined with state-of-the-art fine-grained feature extraction models and trained together with them to improve their performance directly.
arXiv Detail & Related papers (2020-04-04T00:01:11Z)
Adversarial Continual Learning [99.56738010842301]
We propose a hybrid continual learning framework that learns a disjoint representation for task-invariant and task-specific features. Our model combines architecture growth to prevent forgetting of task-specific skills and an experience replay approach to preserve shared skills.
arXiv Detail & Related papers (2020-03-21T02:08:17Z)
TAFSSL: Task-Adaptive Feature Sub-Space Learning for few-shot classification [50.358839666165764]
We show that Task-Adaptive Feature Sub-Space Learning (TAFSSL) can significantly boost performance in Few-Shot Learning scenarios. Specifically, we show that on the challenging miniImageNet and tieredImageNet benchmarks, TAFSSL can improve the current state-of-the-art in both transductive and semi-supervised FSL settings by more than 5%.
arXiv Detail & Related papers (2020-03-14T16:59:17Z)

This list is automatically generated from the titles and abstracts of the papers on this site.