Hybrid Feature Collaborative Reconstruction Network for Few-Shot Fine-Grained Image Classification
- URL: http://arxiv.org/abs/2407.02123v1
- Date: Tue, 2 Jul 2024 10:14:00 GMT
- Title: Hybrid Feature Collaborative Reconstruction Network for Few-Shot Fine-Grained Image Classification
- Authors: Shulei Qiu, Wanqi Yang, Ming Yang,
- Abstract summary: We design a new Hybrid Feature Collaborative Reconstruction Network (HFCR-Net) for few-shot fine-grained image classification.
We fuse the channel features and the spatial features to increase the inter-class differences.
Our experiments on three widely used fine-grained datasets demonstrate the effectiveness and superiority of our approach.
- Score: 6.090855292102877
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Our research focuses on few-shot fine-grained image classification, which faces two major challenges: appearance similarity of fine-grained objects and limited number of samples. To preserve the appearance details of images, traditional feature reconstruction networks usually enhance the representation ability of key features by spatial feature reconstruction and minimizing the reconstruction error. However, we find that relying solely on a single type of feature is insufficient for accurately capturing inter-class differences of fine-grained objects in scenarios with limited samples. In contrast, the introduction of channel features provides additional information dimensions, aiding in better understanding and distinguishing the inter-class differences of fine-grained objects. Therefore, in this paper, we design a new Hybrid Feature Collaborative Reconstruction Network (HFCR-Net) for few-shot fine-grained image classification, which includes a Hybrid Feature Fusion Process (HFFP) and a Hybrid Feature Reconstruction Process (HFRP). In HFRP, we fuse the channel features and the spatial features. Through dynamic weight adjustment, we aggregate the spatial dependencies between arbitrary two positions and the correlations between different channels of each image to increase the inter-class differences. Additionally, we introduce the reconstruction of channel dimension in HFRP. Through the collaborative reconstruction of channel dimension and spatial dimension, the inter-class differences are further increased in the process of support-to-query reconstruction, while the intra-class differences are reduced in the process of query-to-support reconstruction. Ultimately, our extensive experiments on three widely used fine-grained datasets demonstrate the effectiveness and superiority of our approach.
Related papers
- WTDUN: Wavelet Tree-Structured Sampling and Deep Unfolding Network for Image Compressed Sensing [51.94493817128006]
We propose a novel wavelet-domain deep unfolding framework named WTDUN, which operates directly on the multi-scale wavelet subbands.
Our method utilizes the intrinsic sparsity and multi-scale structure of wavelet coefficients to achieve a tree-structured sampling and reconstruction.
arXiv Detail & Related papers (2024-11-25T12:31:03Z) - Deep Diversity-Enhanced Feature Representation of Hyperspectral Images [87.47202258194719]
We rectify 3D convolution by modifying its topology to enhance the rank upper-bound.
We also propose a novel diversity-aware regularization (DA-Reg) term that acts on the feature maps to maximize independence among elements.
To demonstrate the superiority of the proposed Re$3$-ConvSet and DA-Reg, we apply them to various HS image processing and analysis tasks.
arXiv Detail & Related papers (2023-01-15T16:19:18Z) - PC-GANs: Progressive Compensation Generative Adversarial Networks for
Pan-sharpening [50.943080184828524]
We propose a novel two-step model for pan-sharpening that sharpens the MS image through the progressive compensation of the spatial and spectral information.
The whole model is composed of triple GANs, and based on the specific architecture, a joint compensation loss function is designed to enable the triple GANs to be trained simultaneously.
arXiv Detail & Related papers (2022-07-29T03:09:21Z) - Transformer-empowered Multi-scale Contextual Matching and Aggregation
for Multi-contrast MRI Super-resolution [55.52779466954026]
Multi-contrast super-resolution (SR) reconstruction is promising to yield SR images with higher quality.
Existing methods lack effective mechanisms to match and fuse these features for better reconstruction.
We propose a novel network to address these problems by developing a set of innovative Transformer-empowered multi-scale contextual matching and aggregation techniques.
arXiv Detail & Related papers (2022-03-26T01:42:59Z) - Cross-SRN: Structure-Preserving Super-Resolution Network with Cross
Convolution [64.76159006851151]
It is challenging to restore low-resolution (LR) images to super-resolution (SR) images with correct and clear details.
Existing deep learning works almost neglect the inherent structural information of images.
We design a hierarchical feature exploitation network to probe and preserve structural information.
arXiv Detail & Related papers (2022-01-05T05:15:01Z) - Hierarchical Deep CNN Feature Set-Based Representation Learning for
Robust Cross-Resolution Face Recognition [59.29808528182607]
Cross-resolution face recognition (CRFR) is important in intelligent surveillance and biometric forensics.
Existing shallow learning-based and deep learning-based methods focus on mapping the HR-LR face pairs into a joint feature space.
In this study, we desire to fully exploit the multi-level deep convolutional neural network (CNN) feature set for robust CRFR.
arXiv Detail & Related papers (2021-03-25T14:03:42Z) - Multi-Scale Cascading Network with Compact Feature Learning for
RGB-Infrared Person Re-Identification [35.55895776505113]
Multi-Scale Part-Aware Cascading framework (MSPAC) is formulated by aggregating multi-scale fine-grained features from part to global.
Cross-modality correlations can thus be efficiently explored on salient features for distinctive modality-invariant feature learning.
arXiv Detail & Related papers (2020-12-12T15:39:11Z) - Learning Disentangled Feature Representation for Hybrid-distorted Image
Restoration [41.99534893489878]
We introduce the concept of Disentangled Feature Learning to achieve the feature-level divide-and-conquer of hybrid distortions.
Specifically, we propose the feature disentanglement module (FDM) to distribute feature representations of different distortions into different channels.
We also propose a feature aggregation module (FAM) with channel-wise attention to adaptively filter out the distortion representations.
arXiv Detail & Related papers (2020-07-22T13:43:40Z) - AdaptiveWeighted Attention Network with Camera Spectral Sensitivity
Prior for Spectral Reconstruction from RGB Images [22.26917280683572]
We propose a novel adaptive weighted attention network (AWAN) for spectral reconstruction.
AWCA and PSNL modules are developed to reallocate channel-wise feature responses.
In the NTIRE 2020 Spectral Reconstruction Challenge, our entries obtain the 1st ranking on the Clean track and the 3rd place on the Real World track.
arXiv Detail & Related papers (2020-05-19T09:21:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.