MetaNODE: Prototype Optimization as a Neural ODE for Few-Shot Learning
- URL: http://arxiv.org/abs/2103.14341v1
- Date: Fri, 26 Mar 2021 09:16:46 GMT
- Title: MetaNODE: Prototype Optimization as a Neural ODE for Few-Shot Learning
- Authors: Baoquan Zhang, Xutao Li, Yunming Ye, Shanshan Feng, Rui Ye
- Abstract summary: Few-Shot Learning is a challenging task: how can we recognize novel classes with only a few examples?
In this paper, we diminish the prototype bias by regarding it as a prototype optimization problem.
We propose a novel prototype optimization-based meta-learning framework, called MetaNODE.
- Score: 15.03769312691378
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-Shot Learning (FSL) is a challenging task: how can we recognize
novel classes with only a few examples? Pre-training based methods effectively
tackle the problem by pre-training a feature extractor and then predicting novel classes via
a nearest neighbor classifier with mean-based prototypes. Nevertheless, due to
the data scarcity, the mean-based prototypes are usually biased. In this paper,
we diminish the bias by regarding it as a prototype optimization problem.
Although the existing meta-optimizers can also be applied to this optimization,
they all overlook a crucial gradient bias issue, i.e., the mean-based gradient
estimation is also biased on scarce data. Consequently, we regard the gradient
itself as meta-knowledge and then propose a novel prototype optimization-based
meta-learning framework, called MetaNODE. Specifically, we first regard the
mean-based prototypes as initial prototypes, and then model the process of
prototype optimization as continuous-time dynamics specified by a Neural
Ordinary Differential Equation (Neural ODE). A gradient flow inference network
is carefully designed to learn to estimate the continuous gradients for
prototype dynamics. Finally, the optimal prototypes can be obtained by solving
the Neural ODE using the Runge-Kutta method. Extensive experiments demonstrate
that our proposed method obtains superior performance over the previous
state-of-the-art methods. Our code will be publicly available upon acceptance.
Related papers
- A Coding-Theoretic Analysis of Hyperspherical Prototypical Learning Geometry [25.514947992281378]
Hyperspherical Prototypical Learning (HPL) is a supervised approach to representation learning that designs class prototypes on the unit hypersphere.
Previous approaches to HPL suffer from one of the following shortcomings: (i) they follow an unprincipled optimisation procedure; or (ii) they are theoretically sound, but are constrained to only one possible latent dimension.
arXiv Detail & Related papers (2024-07-10T13:44:19Z)
- Gradient Guidance for Diffusion Models: An Optimization Perspective [45.6080199096424]
This paper studies a form of gradient guidance for adapting a pre-trained diffusion model towards optimizing user-specified objectives.
We establish a mathematical framework for guided diffusion to systematically study its optimization theory and algorithmic design.
arXiv Detail & Related papers (2024-04-23T04:51:02Z)
- Towards Theoretically Inspired Neural Initialization Optimization [66.04735385415427]
We propose a differentiable quantity, named GradCosine, with theoretical insights to evaluate the initial state of a neural network.
We show that both the training and test performance of a network can be improved by maximizing GradCosine under norm constraint.
Generalized from the sample-wise analysis to the real batch setting, the resulting Neural Initialization Optimization (NIO) algorithm is able to automatically look for a better initialization with negligible cost.
arXiv Detail & Related papers (2022-10-12T06:49:16Z)
- MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation [132.77005365032468]
We propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE).
In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity.
Experiments on public datasets validate the effectiveness of MACE in terms of validity, sparsity, and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z)
- Prototype Completion for Few-Shot Learning [13.63424509914303]
Few-shot learning aims to recognize novel classes with few examples.
Pre-training based methods effectively tackle the problem by pre-training a feature extractor and then fine-tuning it through the nearest centroid based meta-learning.
We propose a novel prototype completion based meta-learning framework.
arXiv Detail & Related papers (2021-08-11T03:44:00Z)
- Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box Optimization Framework [100.36569795440889]
This work studies zeroth-order (ZO) optimization, which does not require first-order gradient information.
We show that with a graceful design in coordinate importance sampling, the proposed ZO optimization method is efficient in terms of both iteration complexity and function query cost.
arXiv Detail & Related papers (2020-12-21T17:29:58Z)
- An AI-Assisted Design Method for Topology Optimization Without Pre-Optimized Training Data [68.8204255655161]
An AI-assisted design method based on topology optimization is presented, which is able to obtain optimized designs in a direct way.
Designs are provided by an artificial neural network, the predictor, on the basis of boundary conditions and degree of filling as input data.
arXiv Detail & Related papers (2020-12-11T14:33:27Z)
- Optimal 1-NN Prototypes for Pathological Geometries [13.70633147306388]
Using prototype methods to reduce the size of training datasets can drastically reduce the computational cost of classification.
We show that finding the optimal prototypes for a given dataset is computationally difficult, so approximation algorithms are used instead.
We propose an algorithm for finding nearly-optimal classifier prototypes in this setting, and use it to empirically validate the theoretical results.
arXiv Detail & Related papers (2020-10-31T10:15:08Z)
- Prototype Completion with Primitive Knowledge for Few-Shot Learning [20.449056536438658]
Few-shot learning is a challenging task, which aims to learn a classifier for novel classes with few examples.
Pre-training based meta-learning methods effectively tackle the problem by pre-training a feature extractor and then fine-tuning it through the nearest centroid based meta-learning.
We propose a novel prototype completion based meta-learning framework.
arXiv Detail & Related papers (2020-09-10T16:09:34Z)
- A Primer on Zeroth-Order Optimization in Signal Processing and Machine Learning [95.85269649177336]
ZO optimization iteratively performs three major steps: gradient estimation, descent direction computation, and solution update (a minimal sketch of this loop follows this entry).
We demonstrate promising applications of ZO optimization, such as evaluating and generating explanations from black-box deep learning models, and efficient online sensor management.
arXiv Detail & Related papers (2020-06-11T06:50:35Z)
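The three-step ZO loop above can be illustrated with the classic two-point Gaussian-smoothing gradient estimator. This is a generic sketch, not the primer's specific algorithm; the names zo_gradient_estimate and zo_sgd are made up for illustration:

```python
import numpy as np

def zo_gradient_estimate(f, x, mu=1e-3, n_samples=20, rng=None):
    """Step 1 -- gradient estimation: average two-point finite differences
    along random Gaussian directions, querying only function values of f."""
    rng = rng or np.random.default_rng(0)
    grad = np.zeros_like(x)
    for _ in range(n_samples):
        u = rng.standard_normal(x.shape)
        grad += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return grad / n_samples

def zo_sgd(f, x0, lr=0.1, steps=100):
    """Run the full three-step ZO loop on a black-box objective f."""
    x = x0.copy()
    for _ in range(steps):
        g = zo_gradient_estimate(f, x)  # step 1: gradient estimation
        d = -g                          # step 2: descent direction
        x = x + lr * d                  # step 3: solution update
    return x

# Usage: minimize a black-box quadratic with no analytic gradients.
x_opt = zo_sgd(lambda x: float(np.sum((x - 3.0) ** 2)), x0=np.zeros(5))
```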
- Dynamic Scale Training for Object Detection [111.33112051962514]
We propose a Dynamic Scale Training paradigm (abbreviated as DST) to mitigate the scale variation challenge in object detection.
Experimental results demonstrate the efficacy of our proposed DST towards scale variation handling.
It does not introduce inference overhead and could serve as a free lunch for general detection configurations.
arXiv Detail & Related papers (2020-04-26T16:48:17Z)