Few-shot Classification with Hypersphere Modeling of Prototypes
- URL: http://arxiv.org/abs/2211.05319v1
- Date: Thu, 10 Nov 2022 03:46:02 GMT
- Title: Few-shot Classification with Hypersphere Modeling of Prototypes
- Authors: Ning Ding, Yulin Chen, Ganqu Cui, Xiaobin Wang, Hai-Tao Zheng, Zhiyuan
Liu, Pengjun Xie
- Abstract summary: Metric-based meta-learning is one of the de facto standards in few-shot learning.
We use tensor fields (``areas'') to model classes from the geometrical perspective for few-shot learning.
We present a simple and effective method, dubbed hypersphere prototypes (HyperProto).
- Score: 45.211350826691856
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Metric-based meta-learning is one of the de facto standards in few-shot
learning. It is composed of representation learning and metric calculation
designs. Previous works construct class representations in different ways,
varying from mean output embedding to covariance and distributions. However,
using point embeddings in space lacks expressivity and cannot capture class
information robustly, while complex statistical modeling makes metric design
difficult. In this work, we use tensor fields (``areas'') to model classes
from the geometrical perspective for few-shot learning. We present a simple and
effective method, dubbed hypersphere prototypes (HyperProto), where class
information is represented by hyperspheres with dynamic sizes, defined by two
sets of learnable parameters: the hypersphere's center and its radius. Extending from
points to areas, hyperspheres are much more expressive than embeddings.
Moreover, it is more convenient to perform metric-based classification with
hypersphere prototypes than with statistical modeling, as we only need to calculate
the distance from a data point to the surface of the hypersphere. Following
this idea, we also develop two variants of prototypes under other measurements.
Extensive experiments and analysis on few-shot learning tasks across NLP and CV
and comparison with 20+ competitive baselines demonstrate the effectiveness of
our approach.
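The abstract above outlines the core mechanism: each class is modeled as a hypersphere with a learnable center and radius, and a query is classified by its distance to each sphere's surface. The sketch below is only an illustration of that idea, not the authors' released code; the module name HypersphereProtoHead, the use of |‖x − c‖ − r| as the surface distance, and all tensor shapes are assumptions made for this example.

```python
# Minimal sketch of hypersphere prototypes (HyperProto-style), assuming a
# fixed-size embedding from some encoder and one learnable (center, radius)
# pair per class. Illustrative only; not the authors' reference implementation.
import torch
import torch.nn as nn


class HypersphereProtoHead(nn.Module):
    def __init__(self, num_classes: int, embedding_dim: int):
        super().__init__()
        # Learnable hypersphere centers, one per class.
        self.centers = nn.Parameter(torch.randn(num_classes, embedding_dim))
        # Unconstrained radius parameters; softplus keeps the radii positive.
        self.raw_radii = nn.Parameter(torch.zeros(num_classes))

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
        # embeddings: (batch, embedding_dim)
        radii = nn.functional.softplus(self.raw_radii)          # (num_classes,)
        dist_to_center = torch.cdist(embeddings, self.centers)  # (batch, num_classes)
        # Distance from each query point to each hypersphere's surface.
        dist_to_surface = (dist_to_center - radii).abs()
        # Smaller surface distance means a better match, so negate for logits.
        return -dist_to_surface


if __name__ == "__main__":
    head = HypersphereProtoHead(num_classes=5, embedding_dim=64)
    queries = torch.randn(8, 64)   # 8 query embeddings from an encoder
    logits = head(queries)         # shape (8, 5)
    print(logits.argmax(dim=-1))   # nearest-hypersphere prediction per query
```

In a few-shot episode the centers would plausibly be initialized from support-set mean embeddings and the radii tuned during meta-training, but those details are not specified by the abstract alone.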
Related papers
- Rethinking Few-shot 3D Point Cloud Semantic Segmentation [62.80639841429669]
This paper revisits few-shot 3D point cloud semantic segmentation (FS-PCS)
We focus on two significant issues in the state-of-the-art: foreground leakage and sparse point distribution.
To address these issues, we introduce a standardized FS-PCS setting, upon which a new benchmark is built.
arXiv Detail & Related papers (2024-03-01T15:14:47Z)
- Scaling Riemannian Diffusion Models [68.52820280448991]
We show that our method enables us to scale to high-dimensional tasks on nontrivial manifolds.
We model QCD densities on $SU(n)$ lattices and contrastively learned embeddings on high-dimensional hyperspheres.
arXiv Detail & Related papers (2023-10-30T21:27:53Z)
- The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute [66.84421705029624]
We introduce an experimental protocol that enables model comparisons based on equivalent compute, measured in accelerator hours.
We pre-process an existing large, diverse, and high-quality dataset of books that surpasses existing academic benchmarks in quality, diversity, and document length.
This work also provides two baseline models: a feed-forward model derived from the GPT-2 architecture and a recurrent model in the form of a novel LSTM with ten-fold throughput.
arXiv Detail & Related papers (2023-09-20T10:31:17Z)
- HMSN: Hyperbolic Self-Supervised Learning by Clustering with Ideal Prototypes [7.665392786787577]
We use hyperbolic representation space for self-supervised representation learning for prototype-based clustering approaches.
We extend the Masked Siamese Networks to operate on the Poincaré ball model of hyperbolic space.
Unlike previous methods we project to the hyperbolic space at the output of the encoder network and utilise a hyperbolic projection head to ensure that the representations used for downstream tasks remain hyperbolic.
arXiv Detail & Related papers (2023-05-18T12:38:40Z)
- Rethinking Semantic Segmentation: A Prototype View [126.59244185849838]
We present a nonparametric semantic segmentation model based on non-learnable prototypes.
Our framework yields compelling results over several datasets.
We expect this work will provoke a rethink of the current de facto semantic segmentation model design.
arXiv Detail & Related papers (2022-03-28T21:15:32Z)
- Hyperbolic Vision Transformers: Combining Improvements in Metric Learning [116.13290702262248]
We propose a new hyperbolic-based model for metric learning.
At the core of our method is a vision transformer with output embeddings mapped to hyperbolic space.
We evaluate the proposed model with six different formulations on four datasets.
arXiv Detail & Related papers (2022-03-21T09:48:23Z)
- Learn to Learn Metric Space for Few-Shot Segmentation of 3D Shapes [17.217954254022573]
We introduce a meta-learning-based method for few-shot 3D shape segmentation where only a few labeled samples are provided for the unseen classes.
We demonstrate the superior performance of our proposed method on the ShapeNet part dataset under the few-shot scenario, compared with well-established baselines and state-of-the-art semi-supervised methods.
arXiv Detail & Related papers (2021-07-07T01:47:00Z)
- Hyperbolic Busemann Learning with Ideal Prototypes [14.525985704735055]
In this work, we propose Hyperbolic Busemann Learning for representation learning of arbitrary data.
To be able to compute proximities to ideal prototypes, we introduce the penalised Busemann loss.
Empirically, we show that our approach provides a natural interpretation of classification confidence, while outperforming recent hyperspherical and hyperbolic prototype approaches.
arXiv Detail & Related papers (2021-06-28T08:36:59Z)
- A Fully Hyperbolic Neural Model for Hierarchical Multi-Class Classification [7.8176853587105075]
Hyperbolic spaces offer a mathematically appealing approach for learning hierarchical representations of symbolic data.
This work proposes a fully hyperbolic model for multi-class multi-label classification, which performs all operations in hyperbolic space.
A thorough analysis sheds light on the impact of each component in the final prediction and showcases its ease of integration with Euclidean layers.
arXiv Detail & Related papers (2020-10-05T14:42:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.