Transductive Few-Shot Classification on the Oblique Manifold
- URL: http://arxiv.org/abs/2108.04009v1
- Date: Mon, 9 Aug 2021 13:01:03 GMT
- Title: Transductive Few-Shot Classification on the Oblique Manifold
- Authors: Guodong Qi, Huimin Yu, Zhaohui Lu, Shuzhao Li
- Abstract summary: Few-shot learning attempts to learn with limited data.
In this work, we perform feature extraction in Euclidean space and measure geodesic distance on the Oblique Manifold (OM).
We also propose a non-parametric Region Self-attention with Spatial Pyramid Pooling.
- Score: 5.115651633703363
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot learning (FSL) attempts to learn with limited data. In this work, we
perform feature extraction in Euclidean space and apply the geodesic distance
metric on the Oblique Manifold (OM). Specifically, for better feature extraction,
we propose a non-parametric Region Self-attention with Spatial Pyramid Pooling
(RSSPP), which realizes a trade-off between the generalization and the
discriminative ability of the single image feature. Then, we embed the feature
to OM as a point. Furthermore, we design an Oblique Distance-based Classifier
(ODC) that achieves classification in the tangent spaces which better
approximate OM locally by learnable tangency points. Finally, we introduce a
new method for parameter initialization and a novel loss function in the
transductive setting. Extensive experiments demonstrate the effectiveness of
our algorithm, which outperforms state-of-the-art methods on the popular
benchmarks: mini-ImageNet, tiered-ImageNet, and Caltech-UCSD Birds-200-2011
(CUB).
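The abstract does not give formulas, so the following is only a minimal sketch of the geometry it refers to, assuming the standard definition of the oblique manifold as the set of matrices whose columns each have unit L2 norm (a product of unit spheres). The function names `embed_on_oblique`, `geodesic_distance`, and `log_map` are hypothetical and are not taken from the paper; the paper's RSSPP features, ODC classifier, and loss are not reproduced here.

```python
import numpy as np


def embed_on_oblique(features, eps=1e-12):
    """Map a (dim, regions) feature matrix to the oblique manifold
    by scaling every column to unit L2 norm (assumed embedding)."""
    norms = np.linalg.norm(features, axis=0, keepdims=True)
    return features / np.maximum(norms, eps)


def geodesic_distance(x, y):
    """Geodesic distance between two points on the oblique manifold:
    the root of the summed squared great-circle angles per column."""
    cosines = np.clip(np.sum(x * y, axis=0), -1.0, 1.0)
    angles = np.arccos(cosines)
    return float(np.sqrt(np.sum(angles ** 2)))


def log_map(p, x, eps=1e-12):
    """Logarithmic map of x into the tangent space at tangency point p,
    computed column by column with the unit-sphere formula."""
    cosines = np.clip(np.sum(p * x, axis=0, keepdims=True), -1.0, 1.0)
    angles = np.arccos(cosines)
    residual = x - cosines * p  # component of x orthogonal to p
    res_norm = np.linalg.norm(residual, axis=0, keepdims=True)
    # Columns where x coincides with p map to the zero tangent vector.
    scale = np.where(res_norm > eps, angles / np.maximum(res_norm, eps), 0.0)
    return scale * residual


rng = np.random.default_rng(0)
p = embed_on_oblique(rng.normal(size=(8, 5)))  # hypothetical tangency point
x = embed_on_oblique(rng.normal(size=(8, 5)))  # hypothetical image feature
tangent = log_map(p, x)
print(geodesic_distance(p, x), np.linalg.norm(tangent))
```

In this sketch the Frobenius norm of the tangent vector equals the geodesic distance from the tangency point, which is why Euclidean operations in a tangent space can stand in for manifold distances near that point.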
Related papers
- Class-Imbalanced Semi-Supervised Learning for Large-Scale Point Cloud Semantic Segmentation via Decoupling Optimization [64.36097398869774]
Semi-supervised learning (SSL) has been an active research topic for large-scale 3D scene understanding.
The existing SSL-based methods suffer from severe training bias due to class imbalance and long-tail distributions of the point cloud data.
We introduce a new decoupling optimization framework, which disentangles feature representation learning and classifier training in an alternating optimization manner to shift the biased decision boundary effectively.
arXiv Detail & Related papers (2024-01-13T04:16:40Z)
- FocDepthFormer: Transformer with latent LSTM for Depth Estimation from Focal Stack [11.433602615992516]
We present a novel Transformer-based network, FocDepthFormer, which integrates a Transformer with an LSTM module and a CNN decoder.
By incorporating the LSTM, FocDepthFormer can be pre-trained on large-scale monocular RGB depth estimation datasets.
Our model outperforms state-of-the-art approaches across multiple evaluation metrics.
arXiv Detail & Related papers (2023-10-17T11:53:32Z) - Small Object Detection via Coarse-to-fine Proposal Generation and
Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z) - Improving Pixel-based MIM by Reducing Wasted Modeling Capability [77.99468514275185]
We propose a new method that explicitly utilizes low-level features from shallow layers to aid pixel reconstruction.
To the best of our knowledge, we are the first to systematically investigate multi-level feature fusion for isotropic architectures.
Our method yields significant performance gains, such as 1.2% on fine-tuning, 2.8% on linear probing, and 2.6% on semantic segmentation.
arXiv Detail & Related papers (2023-08-01T03:44:56Z) - De-coupling and De-positioning Dense Self-supervised Learning [65.56679416475943]
Dense Self-Supervised Learning (SSL) methods address the limitations of using image-level feature representations when handling images with multiple objects.
We show that they suffer from coupling and positional bias, which arise from the receptive field increasing with layer depth and zero-padding.
We demonstrate the benefits of our method on COCO and on a new challenging benchmark, OpenImage-MINI, for object classification, semantic segmentation, and object detection.
arXiv Detail & Related papers (2023-03-29T18:07:25Z) - Federated Representation Learning via Maximal Coding Rate Reduction [109.26332878050374]
We propose a methodology to learn low-dimensional representations from a dataset that is distributed among several clients.
Our proposed method, which we refer to as FLOW, utilizes MCR2 as the objective of choice, hence resulting in representations that are both between-class discriminative and within-class compressible.
arXiv Detail & Related papers (2022-10-01T15:43:51Z) - Efficient Deep Feature Calibration for Cross-Modal Joint Embedding
Learning [14.070841236184439]
This paper introduces a two-phase deep feature calibration framework for efficient learning of semantics enhanced text-image cross-modal joint embedding.
In preprocessing, we perform deep feature calibration by combining deep feature engineering with semantic context features derived from raw text-image input data.
In joint embedding learning, we perform deep feature calibration by optimizing the batch-hard triplet loss function with soft-margin and double negative sampling.
arXiv Detail & Related papers (2021-08-02T08:16:58Z) - Facilitate the Parametric Dimension Reduction by Gradient Clipping [1.9671123873378715]
We extend a well-known dimension reduction method, t-distributed stochastic neighbor embedding (t-SNE), from non-parametric to parametric by training neural networks.
Our method achieves an embedding quality that is comparable to the non-parametric t-SNE while retaining the ability to generalize.
arXiv Detail & Related papers (2020-09-30T01:21:22Z) - Scan-based Semantic Segmentation of LiDAR Point Clouds: An Experimental
Study [2.6205925938720833]
State of the art methods use deep neural networks to predict semantic classes for each point in a LiDAR scan.
A powerful and efficient way to process LiDAR measurements is to use two-dimensional, image-like projections.
We demonstrate various techniques to boost the performance and to improve runtime as well as memory constraints.
arXiv Detail & Related papers (2020-04-06T11:08:12Z) - DeepEMD: Differentiable Earth Mover's Distance for Few-Shot Learning [122.51237307910878]
We develop methods for few-shot image classification from a new perspective of optimal matching between image regions.
We employ the Earth Mover's Distance (EMD) as a metric to compute a structural distance between dense image representations.
To generate the important weights of elements in the formulation, we design a cross-reference mechanism.
arXiv Detail & Related papers (2020-03-15T08:13:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.