MetaNODE: Prototype Optimization as a Neural ODE for Few-Shot Learning
- URL: http://arxiv.org/abs/2103.14341v1
- Date: Fri, 26 Mar 2021 09:16:46 GMT
- Title: MetaNODE: Prototype Optimization as a Neural ODE for Few-Shot Learning
- Authors: Baoquan Zhang, Xutao Li, Yunming Ye, Shanshan Feng, Rui Ye
- Abstract summary: Few-Shot Learning is a challenging task: how can we recognize novel classes with only a few examples?
In this paper, we diminish the prototype bias by regarding it as a prototype optimization problem.
We propose a novel prototype optimization-based meta-learning framework, called MetaNODE.
- Score: 15.03769312691378
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-Shot Learning (FSL) is a challenging task: how can we recognize
novel classes with only a few examples? Pre-training based methods effectively
tackle the problem by pre-training a feature extractor and then predicting novel classes via
a nearest neighbor classifier with mean-based prototypes. Nevertheless, due to
the data scarcity, the mean-based prototypes are usually biased. In this paper,
we diminish the bias by regarding it as a prototype optimization problem.
Although the existing meta-optimizers can also be applied to this optimization,
they all overlook a crucial gradient bias issue, i.e., the mean-based gradient
estimation is also biased on scarce data. Consequently, we regard the gradient
itself as meta-knowledge and then propose a novel prototype optimization-based
meta-learning framework, called MetaNODE. Specifically, we first regard the
mean-based prototypes as initial prototypes, and then model the process of
prototype optimization as continuous-time dynamics specified by a Neural
Ordinary Differential Equation (Neural ODE). A gradient flow inference network
is carefully designed to learn to estimate the continuous gradients for
prototype dynamics. Finally, the optimal prototypes can be obtained by solving
the Neural ODE using the Runge-Kutta method. Extensive experiments demonstrate
that our proposed method obtains superior performance over the previous
state-of-the-art methods. Our code will be publicly available upon acceptance.
Related papers
- A Coding-Theoretic Analysis of Hyperspherical Prototypical Learning Geometry [25.514947992281378]
Hyperspherical Prototypical Learning (HPL) is a supervised approach to representation learning that designs class prototypes on the unit hypersphere.
Previous approaches to HPL suffer from one of the following shortcomings: (i) they follow an unprincipled optimisation procedure; or (ii) they are theoretically sound, but are constrained to only one possible latent dimension.
arXiv Detail & Related papers (2024-07-10T13:44:19Z)
- Gradient Guidance for Diffusion Models: An Optimization Perspective [45.6080199096424]
This paper studies a form of gradient guidance for adapting a pre-trained diffusion model towards optimizing user-specified objectives.
We establish a mathematical framework for guided diffusion to systematically study its optimization theory and algorithmic design.
arXiv Detail & Related papers (2024-04-23T04:51:02Z)
- Towards Theoretically Inspired Neural Initialization Optimization [66.04735385415427]
We propose a differentiable quantity, named GradCosine, with theoretical insights to evaluate the initial state of a neural network.
We show that both the training and test performance of a network can be improved by maximizing GradCosine under norm constraint.
Generalized from the sample-wise analysis to the real batch setting, the resulting Neural Initialization Optimization (NIO) algorithm is able to automatically look for a better initialization with negligible cost.
arXiv Detail & Related papers (2022-10-12T06:49:16Z)
- MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation [132.77005365032468]
We propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE).
In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity.
Experiments on public datasets validate the effectiveness of MACE in terms of validity, sparsity, and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z)
- Prototype Completion for Few-Shot Learning [13.63424509914303]
Few-shot learning aims to recognize novel classes with few examples.
Pre-training based methods effectively tackle the problem by pre-training a feature extractor and then fine-tuning it through the nearest centroid based meta-learning.
We propose a novel prototype completion based meta-learning framework.
arXiv Detail & Related papers (2021-08-11T03:44:00Z)
- Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box Optimization Framework [100.36569795440889]
This work studies zeroth-order (ZO) optimization, which does not require first-order gradient information.
We show that with a graceful design in coordinate importance sampling, the proposed ZO optimization method is efficient in terms of both iteration complexity and function query cost.
arXiv Detail & Related papers (2020-12-21T17:29:58Z)
- An AI-Assisted Design Method for Topology Optimization Without Pre-Optimized Training Data [68.8204255655161]
An AI-assisted design method based on topology optimization is presented, which is able to obtain optimized designs in a direct way.
Designs are provided by an artificial neural network, the predictor, on the basis of boundary conditions and degree of filling as input data.
arXiv Detail & Related papers (2020-12-11T14:33:27Z)
- Optimal 1-NN Prototypes for Pathological Geometries [13.70633147306388]
Using prototype methods to reduce the size of training datasets can drastically reduce the computational cost of classification.
We show that finding the optimal prototypes for a given dataset is computationally difficult, so approximation algorithms are used instead.
We propose an algorithm for finding nearly-optimal classifier prototypes in this setting, and use it to empirically validate the theoretical results.
arXiv Detail & Related papers (2020-10-31T10:15:08Z)
- Prototype Completion with Primitive Knowledge for Few-Shot Learning [20.449056536438658]
Few-shot learning is a challenging task, which aims to learn a classifier for novel classes with few examples.
Pre-training based meta-learning methods effectively tackle the problem by pre-training a feature extractor and then fine-tuning it through the nearest centroid based meta-learning.
We propose a novel prototype completion based meta-learning framework.
arXiv Detail & Related papers (2020-09-10T16:09:34Z)
- A Primer on Zeroth-Order Optimization in Signal Processing and Machine Learning [95.85269649177336]
ZO optimization iteratively performs three major steps: gradient estimation, descent direction computation, and solution update (a minimal sketch of this loop follows this entry).
We demonstrate promising applications of ZO optimization, such as evaluating and generating explanations from black-box deep learning models, and efficient online sensor management.
arXiv Detail & Related papers (2020-06-11T06:50:35Z)
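The three-step ZO loop above can be illustrated with the classic two-point Gaussian-smoothing gradient estimator. This is a generic sketch, not the primer's specific algorithm; the names zo_gradient_estimate and zo_sgd are made up for illustration:

```python
import numpy as np

def zo_gradient_estimate(f, x, mu=1e-3, n_samples=20, rng=None):
    """Step 1 -- gradient estimation: average two-point finite differences
    along random Gaussian directions, querying only function values of f."""
    rng = rng or np.random.default_rng(0)
    grad = np.zeros_like(x)
    for _ in range(n_samples):
        u = rng.standard_normal(x.shape)
        grad += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return grad / n_samples

def zo_sgd(f, x0, lr=0.1, steps=100):
    """Run the full three-step ZO loop on a black-box objective f."""
    x = x0.copy()
    for _ in range(steps):
        g = zo_gradient_estimate(f, x)  # step 1: gradient estimation
        d = -g                          # step 2: descent direction
        x = x + lr * d                  # step 3: solution update
    return x

# Usage: minimize a black-box quadratic with no analytic gradients.
x_opt = zo_sgd(lambda x: float(np.sum((x - 3.0) ** 2)), x0=np.zeros(5))
```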
- Dynamic Scale Training for Object Detection [111.33112051962514]
We propose a Dynamic Scale Training paradigm (abbreviated as DST) to mitigate the scale variation challenge in object detection.
Experimental results demonstrate the efficacy of our proposed DST towards scale variation handling.
It does not introduce inference overhead and could serve as a free lunch for general detection configurations.
arXiv Detail & Related papers (2020-04-26T16:48:17Z)