A Coding-Theoretic Analysis of Hyperspherical Prototypical Learning Geometry
- URL: http://arxiv.org/abs/2407.07664v1
- Date: Wed, 10 Jul 2024 13:44:19 GMT
- Title: A Coding-Theoretic Analysis of Hyperspherical Prototypical Learning Geometry
- Authors: Martin Lindström, Borja Rodríguez-Gálvez, Ragnar Thobaben, Mikael Skoglund
- Abstract summary: Hyperspherical Prototypical Learning (HPL) is a supervised approach to representation learning that designs class prototypes on the unit hypersphere.
Previous approaches to HPL have either of the following shortcomings: (i) they follow an unprincipled optimisation procedure; or (ii) they are theoretically sound, but are constrained to only one possible latent dimension.
- Score: 25.514947992281378
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Hyperspherical Prototypical Learning (HPL) is a supervised approach to representation learning that designs class prototypes on the unit hypersphere. The prototypes bias the representations to class separation in a scale invariant and known geometry. Previous approaches to HPL have either of the following shortcomings: (i) they follow an unprincipled optimisation procedure; or (ii) they are theoretically sound, but are constrained to only one possible latent dimension. In this paper, we address both shortcomings. To address (i), we present a principled optimisation procedure whose solution we show is optimal. To address (ii), we construct well-separated prototypes in a wide range of dimensions using linear block codes. Additionally, we give a full characterisation of the optimal prototype placement in terms of achievable and converse bounds, showing that our proposed methods are near-optimal.
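As a rough, self-contained sketch of the code-based construction (not the authors' implementation), the snippet below maps the 16 codewords of a [7, 4] Hamming code to unit vectors via c ↦ (1 − 2c)/√n, so a Hamming distance of d between codewords becomes an inner product of 1 − 2d/n between prototypes; the code's minimum distance of 3 then caps the worst-case cosine similarity at 1/7.
```python
import numpy as np
from itertools import product

# Generator matrix of the [7, 4] Hamming code (one standard choice).
G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 1, 0, 1],
              [0, 0, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])

# Enumerate all 2^4 messages and encode them over GF(2).
messages = np.array(list(product([0, 1], repeat=G.shape[0])))
codewords = messages @ G % 2                      # shape (16, 7)

# Map codewords in {0, 1}^n to unit vectors in R^n.
prototypes = (1 - 2 * codewords) / np.sqrt(G.shape[1])

# The code's minimum distance is 3, so the maximum pairwise cosine
# similarity is 1 - 2*3/7 = 1/7 ~ 0.1429.
gram = prototypes @ prototypes.T
np.fill_diagonal(gram, -np.inf)
print(f"max pairwise cosine similarity: {gram.max():.4f}")
```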
Related papers
- Prototype Optimization with Neural ODE for Few-Shot Learning [41.743442773121444]
Few-Shot Learning is a challenging task that aims to recognize novel classes from only a few examples.
Due to data scarcity, mean-based prototypes are usually biased.
We propose a novel prototype optimization framework that rectifies prototypes by introducing a meta-optimizer to optimize them.
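A hand-wavy illustration of this rectification-as-a-flow idea follows; the actual method learns the dynamics with a meta-optimizer, so the dynamics function here is a hypothetical stand-in.
```python
import numpy as np

def flow(prototypes, support, labels):
    """Hypothetical dynamics f: pull each prototype toward the mean of its
    class's support features (the paper learns f with a meta-optimizer)."""
    deltas = np.zeros_like(prototypes)
    for c in range(prototypes.shape[0]):
        deltas[c] = support[labels == c].mean(axis=0) - prototypes[c]
    return deltas

rng = np.random.default_rng(0)
support = rng.normal(size=(25, 8))            # 5 classes x 5 shots, dim 8
labels = np.repeat(np.arange(5), 5)
# Mean-based initial prototypes, perturbed to simulate prototype bias.
p = np.stack([support[labels == c].mean(axis=0) for c in range(5)])
p += rng.normal(scale=0.3, size=p.shape)

dt, steps = 0.1, 20                           # Euler integration of dp/dt = f(p)
for _ in range(steps):
    p = p + dt * flow(p, support, labels)
```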
arXiv Detail & Related papers (2024-11-19T06:17:25Z)
- Improving generalization in large language models by learning prefix subspaces [5.911540700785975]
This article focuses on fine-tuning large language models (LLMs) in the scarce-data regime (also known as the "few-shot" learning setting).
We propose a method to increase the generalization capabilities of LLMs based on neural network subspaces.
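A minimal sketch of the subspace idea, under the assumption that the method trains on random convex combinations of K learned prefix parameter sets; the names and shapes below are illustrative, not the paper's API.
```python
import numpy as np

rng = np.random.default_rng(0)
K, prefix_len, d_model = 3, 10, 64
vertices = rng.normal(size=(K, prefix_len, d_model))  # K prefix parameter sets

def sample_prefix(rng):
    """Draw a random point of the simplex spanned by the K vertices."""
    alpha = rng.dirichlet(np.ones(K))                 # convex weights
    return np.tensordot(alpha, vertices, axes=1)      # weighted combination

train_prefix = sample_prefix(rng)     # a fresh sample per training step
eval_prefix = vertices.mean(axis=0)   # e.g. the simplex centre at inference
```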
arXiv Detail & Related papers (2023-10-24T12:44:09Z)
- Object Representations as Fixed Points: Training Iterative Refinement Algorithms with Implicit Differentiation [88.14365009076907]
Iterative refinement is a useful paradigm for representation learning.
We develop an implicit differentiation approach that improves the stability and tractability of training.
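A compact toy example of the underlying trick, implicit differentiation at a fixed point, is given below; a linear refinement step is used so the Jacobians are available in closed form. For x* = f(x*, θ), the implicit function theorem gives dx*/dθ = (I − ∂f/∂x)⁻¹ ∂f/∂θ, so gradients need only the converged x*, not the unrolled iterations.
```python
import numpy as np

A = np.array([[0.5, 0.2], [0.1, 0.3]])   # contraction: spectral radius < 1
b = np.array([1.0, -2.0])

def f(x, theta):
    return A @ x + theta * b              # toy refinement step

theta = 0.7
x = np.zeros(2)
for _ in range(100):                      # iterate to the fixed point
    x = f(x, theta)

# Implicit gradient: here df/dx = A and df/dtheta = b.
dx_dtheta = np.linalg.solve(np.eye(2) - A, b)
print(x, dx_dtheta)
```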
arXiv Detail & Related papers (2022-07-02T10:00:35Z)
- Rethinking Semantic Segmentation: A Prototype View [126.59244185849838]
We present a nonparametric semantic segmentation model based on non-learnable prototypes.
Our framework yields compelling results on several datasets.
We expect this work will provoke a rethink of the current de facto semantic segmentation model design.
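A toy sketch of nearest-prototype pixel classification in this spirit: the paper derives several non-learnable prototypes per class from training-pixel statistics, whereas the prototypes below are random placeholders.
```python
import numpy as np

rng = np.random.default_rng(0)
C, K, D = 3, 4, 16                        # classes, prototypes per class, dim
prototypes = rng.normal(size=(C, K, D))
prototypes /= np.linalg.norm(prototypes, axis=-1, keepdims=True)

pixels = rng.normal(size=(32 * 32, D))    # flattened pixel embeddings
pixels /= np.linalg.norm(pixels, axis=-1, keepdims=True)

# Cosine similarity to every prototype, then take each class's best prototype.
sims = pixels @ prototypes.reshape(C * K, D).T       # (HW, C*K)
class_scores = sims.reshape(-1, C, K).max(axis=-1)   # (HW, C)
pred = class_scores.argmax(axis=-1).reshape(32, 32)  # per-pixel class map
```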
arXiv Detail & Related papers (2022-03-28T21:15:32Z)
- Dual Prototypical Contrastive Learning for Few-shot Semantic Segmentation [55.339405417090084]
We propose a dual prototypical contrastive learning approach tailored to the few-shot semantic segmentation (FSS) task.
The main idea is to make the prototypes more discriminative by increasing inter-class distance while reducing intra-class distance in the prototype feature space.
We demonstrate that the proposed dual contrastive learning approach outperforms state-of-the-art FSS methods on PASCAL-5i and COCO-20i datasets.
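An illustrative objective in this spirit is sketched below; the paper's exact FSS losses differ, so the margin and weighting here are assumptions.
```python
import numpy as np

def dual_prototype_loss(features, labels, prototypes, margin=1.0):
    # Intra-class: mean squared distance of each feature to its prototype.
    intra = np.mean(np.sum((features - prototypes[labels]) ** 2, axis=1))
    # Inter-class: hinge on pairwise prototype distances below the margin.
    diffs = prototypes[:, None, :] - prototypes[None, :, :]
    dists = np.sqrt(np.sum(diffs ** 2, axis=-1))
    off = ~np.eye(len(prototypes), dtype=bool)
    inter = np.mean(np.maximum(0.0, margin - dists[off]) ** 2)
    return intra + inter

rng = np.random.default_rng(0)
feats = rng.normal(size=(20, 8))
labs = rng.integers(0, 3, size=20)
protos = rng.normal(size=(3, 8))
print(dual_prototype_loss(feats, labs, protos))
```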
arXiv Detail & Related papers (2021-11-09T08:14:50Z)
- Hyperbolic Busemann Learning with Ideal Prototypes [14.525985704735055]
In this work, we propose Hyperbolic Busemann Learning for representation learning of arbitrary data.
To be able to compute proximities to ideal prototypes, we introduce the penalised Busemann loss.
Empirically, we show that our approach provides a natural interpretation of classification confidence, while outperforming recent hyperspherical and hyperbolic prototype approaches.
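Below is a sketch of the Busemann function on the Poincaré ball toward an ideal (boundary) prototype, using the standard formula b_p(z) = log(‖p − z‖² / (1 − ‖z‖²)); the penalty term and its weight are illustrative rather than the paper's exact form.
```python
import numpy as np

def busemann(z, p):
    """b_p(z) = log(||p - z||^2 / (1 - ||z||^2)) for ||p|| = 1, ||z|| < 1."""
    return np.log(np.sum((p - z) ** 2) / (1.0 - np.sum(z ** 2)))

def penalised_busemann_loss(z, p, lam=1.5):
    # Minimising b_p alone drives z to the boundary at p; the (illustrative)
    # penalty below keeps embeddings inside the ball for large enough lam.
    return busemann(z, p) - lam * np.log(1.0 - np.sum(z ** 2))

p = np.array([1.0, 0.0])    # ideal prototype on the boundary circle
z = np.array([0.3, 0.1])    # embedding inside the Poincare disk
print(penalised_busemann_loss(z, p))
```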
arXiv Detail & Related papers (2021-06-28T08:36:59Z)
- MetaNODE: Prototype Optimization as a Neural ODE for Few-Shot Learning [15.03769312691378]
Few-Shot Learning is a challenging task: recognizing novel classes from only a few examples.
In this paper, we diminish this prototype bias by framing rectification as a prototype optimization problem.
We propose a novel prototype optimization-based meta-learning framework, called MetaNODE.
arXiv Detail & Related papers (2021-03-26T09:16:46Z)
- An AI-Assisted Design Method for Topology Optimization Without Pre-Optimized Training Data [68.8204255655161]
An AI-assisted design method based on topology optimization is presented that obtains optimized designs directly.
Designs are produced by an artificial neural network, the predictor, from boundary conditions and the degree of filling given as input data.
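A schematic of the predictor's input-output contract, with hypothetical shapes and a random, untrained network, purely to show boundary conditions plus fill degree going in and a material density field coming out.
```python
import numpy as np

rng = np.random.default_rng(0)
n_bc, grid = 12, (16, 16)   # boundary-condition features, design grid

W1 = rng.normal(scale=0.1, size=(n_bc + 1, 64))
W2 = rng.normal(scale=0.1, size=(64, grid[0] * grid[1]))

def predictor(bc, fill):
    """Map boundary conditions and target fill fraction to a density field."""
    h = np.tanh(np.concatenate([bc, [fill]]) @ W1)
    rho = 1.0 / (1.0 + np.exp(-(h @ W2)))   # densities in (0, 1)
    return rho.reshape(grid)

design = predictor(rng.normal(size=n_bc), fill=0.4)
```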
arXiv Detail & Related papers (2020-12-11T14:33:27Z)
- Prototypical Contrastive Learning of Unsupervised Representations [171.3046900127166]
Prototypical Contrastive Learning (PCL) is an unsupervised representation learning method.
PCL implicitly encodes semantic structures of the data into the learned embedding space.
PCL outperforms state-of-the-art instance-wise contrastive learning methods on multiple benchmarks.
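An illustrative ProtoNCE-style step is shown below; PCL alternates clustering with a contrastive loss against cluster prototypes, but momentum encoders and multi-granularity clustering are omitted, and the cluster assignments here are fixed stand-ins for a k-means E-step.
```python
import numpy as np

def proto_nce(z, assignments, prototypes, tau=0.1):
    """Cross-entropy of each embedding against its cluster prototype."""
    logits = z @ prototypes.T / tau                      # (N, K) similarities
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(z)), assignments].mean()

rng = np.random.default_rng(0)
z = rng.normal(size=(100, 32))
z /= np.linalg.norm(z, axis=1, keepdims=True)            # unit embeddings
# E-step stand-in: fixed assignments; PCL would run k-means on z here.
assign = np.arange(100) % 8
protos = np.stack([z[assign == k].mean(axis=0) for k in range(8)])
protos /= np.linalg.norm(protos, axis=1, keepdims=True)
print(proto_nce(z, assign, protos))
```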
arXiv Detail & Related papers (2020-05-11T09:53:36Z)
- Convex Representation Learning for Generalized Invariance in Semi-Inner-Product Space [32.442549424823355]
In this work, we develop an algorithm for a variety of generalized invariances modeled in semi-inner-product spaces, for which representer theorems hold and approximation bounds are established.
This allows the representations to be learned efficiently and effectively, as confirmed by accurate predictions in our experiments.
arXiv Detail & Related papers (2020-04-25T18:54:37Z) - Distributed Averaging Methods for Randomized Second Order Optimization [54.51566432934556]
We consider distributed optimization problems where forming the Hessian is computationally challenging and communication is a bottleneck.
We develop unbiased parameter averaging methods for randomized second order optimization that employ sampling and sketching of the Hessian.
We also extend the framework of second order averaging methods to introduce an unbiased distributed optimization framework for heterogeneous computing systems.
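A bare-bones sketch of the setting for least squares appears below; note that plain averaging of sketched Newton steps is biased (E[H̃⁻¹g] ≠ H⁻¹g even when E[H̃] = H), which is precisely what the paper's unbiased weighting corrects, though the correction itself is omitted here.
```python
import numpy as np

rng = np.random.default_rng(0)
n, d, workers, m = 2000, 10, 8, 400
A = rng.normal(size=(n, d))
b = A @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

x = np.zeros(d)
grad = A.T @ (A @ x - b)          # full gradient at the current iterate
steps = []
for _ in range(workers):
    rows = rng.choice(n, size=m, replace=False)
    H_sketch = A[rows].T @ A[rows] * (n / m)   # subsampled Hessian estimate
    steps.append(np.linalg.solve(H_sketch, grad))
x = x - np.mean(steps, axis=0)    # server averages the local Newton steps
```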
arXiv Detail & Related papers (2020-02-16T09:01:18Z)