Prototype Optimization with Neural ODE for Few-Shot Learning
- URL: http://arxiv.org/abs/2411.12259v1
- Date: Tue, 19 Nov 2024 06:17:25 GMT
- Title: Prototype Optimization with Neural ODE for Few-Shot Learning
- Authors: Baoquan Zhang, Shanshan Feng, Bingqi Shan, Xutao Li, Yunming Ye, Yew-Soon Ong
- Abstract summary: Few-Shot Learning is a challenging task, which aims to recognize novel classes with few examples.
Due to data scarcity, mean-based prototypes are usually biased.
We propose a novel prototype optimization framework to rectify prototypes, i.e., introducing a meta-optimizer to optimize prototypes.
- Score: 41.743442773121444
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Few-Shot Learning (FSL) is a challenging task, which aims to recognize novel classes with few examples. Pre-training based methods effectively tackle the problem by pre-training a feature extractor and then performing class prediction via a cosine classifier with mean-based prototypes. Nevertheless, due to data scarcity, the mean-based prototypes are usually biased. In this paper, we attempt to diminish the prototype bias by regarding it as a prototype optimization problem. To this end, we propose a novel prototype optimization framework to rectify prototypes, i.e., introducing a meta-optimizer to optimize prototypes. Although the existing meta-optimizers can also be adapted to our framework, they all overlook a crucial gradient bias issue, i.e., the mean-based gradient estimation is also biased on sparse data. To address this issue, we regard the gradient and its flow as meta-knowledge and then propose a novel Neural Ordinary Differential Equation (ODE)-based meta-optimizer to optimize prototypes, called MetaNODE. Although MetaNODE has shown superior performance, it suffers from a huge computational burden. To further improve its computational efficiency, we conduct a detailed analysis of MetaNODE and then design an effective and efficient extension of MetaNODE (called E2MetaNODE). It consists of two novel modules: E2GradNet and E2Solver, which aim to estimate accurate gradient flows and solve for optimal prototypes in an effective and efficient manner, respectively. Extensive experiments show that 1) our methods achieve superior performance over previous FSL methods and 2) our E2MetaNODE significantly improves computational efficiency without performance degradation.
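
The idea in the abstract (start from mean-based prototypes, then refine them by following a gradient flow) can be illustrated with a minimal sketch. This is not the authors' MetaNODE: the paper learns the gradient estimator and uses a Neural ODE solver, whereas the sketch below assumes a plain finite-difference gradient of a cross-entropy loss over scaled cosine logits and integrates dp/dt = -grad L(p) with explicit Euler steps. All function names, the toy features, the logit scale, and the step size are hypothetical choices made for illustration.

```python
import math

def mean_prototypes(support, labels, n_classes):
    """Mean-based prototypes: average the support features of each class."""
    dim = len(support[0])
    sums = [[0.0] * dim for _ in range(n_classes)]
    counts = [0] * n_classes
    for x, y in zip(support, labels):
        counts[y] += 1
        for d in range(dim):
            sums[y][d] += x[d]
    return [[s / counts[c] for s in sums[c]] for c in range(n_classes)]

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-12)

def predict(protos, x):
    """Cosine classifier: pick the class whose prototype is most similar."""
    sims = [cosine(p, x) for p in protos]
    return max(range(len(protos)), key=lambda c: sims[c])

def loss(protos, feats, labels, scale=10.0):
    """Mean cross-entropy over scaled cosine logits (scale is an assumption)."""
    total = 0.0
    for x, y in zip(feats, labels):
        logits = [scale * cosine(p, x) for p in protos]
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_z - logits[y]
    return total / len(feats)

def numeric_grad(protos, feats, labels, eps=1e-5):
    """Central finite-difference gradient; stands in for a learned estimator."""
    grad = [[0.0] * len(p) for p in protos]
    for c in range(len(protos)):
        for d in range(len(protos[c])):
            old = protos[c][d]
            protos[c][d] = old + eps
            up = loss(protos, feats, labels)
            protos[c][d] = old - eps
            down = loss(protos, feats, labels)
            protos[c][d] = old
            grad[c][d] = (up - down) / (2 * eps)
    return grad

def euler_flow(protos, feats, labels, step=0.01, n_steps=50):
    """Integrate dp/dt = -grad L(p) with explicit Euler steps."""
    protos = [list(p) for p in protos]  # copy so the input is not modified
    for _ in range(n_steps):
        g = numeric_grad(protos, feats, labels)
        for c in range(len(protos)):
            for d in range(len(protos[c])):
                protos[c][d] -= step * g[c][d]
    return protos

# Toy 2-way, 2-shot episode with 2-D features (made-up numbers).
support = [[1.0, 0.2], [0.8, 0.6], [0.1, 1.0], [0.5, 0.9]]
labels = [0, 0, 1, 1]
p0 = mean_prototypes(support, labels, n_classes=2)
p1 = euler_flow(p0, support, labels)
loss_before = loss(p0, support, labels)
loss_after = loss(p1, support, labels)
```

Comparing `loss_before` with `loss_after` shows how far the refined prototypes moved from the mean-based ones. Swapping `euler_flow` for a higher-order solver, or `numeric_grad` for a learned network, would be closer in spirit to what the abstract describes, but those parts are not specified here.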
 
      
        Related papers
        - Revisiting the Initial Steps in Adaptive Gradient Descent Optimization [6.468625143772815]
 Adaptive gradient optimization methods, such as Adam, are prevalent in training deep neural networks across diverse machine learning tasks.
These methods often suffer from suboptimal generalization compared to stochastic gradient descent (SGD) and exhibit instability.
We introduce simple yet effective solutions: initializing the second-order moment estimation with non-zero values.
 arXiv  Detail & Related papers  (2024-12-03T04:28:14Z)
- Improving Instance Optimization in Deformable Image Registration with Gradient Projection [7.6061804149819885]
 Deformable image registration is inherently a multi-objective optimization problem.
These conflicting objectives often lead to poor optimization outcomes.
Deep learning methods have recently gained popularity in this domain due to their efficiency in processing large datasets.
 arXiv  Detail & Related papers  (2024-10-21T08:27:13Z)
- ELRA: Exponential learning rate adaption gradient descent optimization method [83.88591755871734]
 We present a novel, fast (exponential rate), ab initio (hyper-free) gradient based adaption.
The main idea of the method is to adapt the learning rate $\alpha$ by situational awareness.
It can be applied to problems of any dimension n and scales only linearly.
 arXiv  Detail & Related papers  (2023-09-12T14:36:13Z)
- Bidirectional Looking with A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers [109.52244418498974]
 We propose a novel Admeta (A Double exponential Moving averagE to Adaptive and non-adaptive momentum) framework.
We provide two implementations, AdmetaR and AdmetaS, the former based on RAdam and the latter based on SGDM.
 arXiv  Detail & Related papers  (2023-07-02T18:16:06Z)
- End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
 This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures.
We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
 arXiv  Detail & Related papers  (2023-05-25T10:58:46Z)
- Conservative Objective Models for Effective Offline Model-Based Optimization [78.19085445065845]
 Computational design problems arise in a number of settings, from synthetic biology to computer architectures.
We propose a method that learns a model of the objective function that lower bounds the actual value of the ground-truth objective on out-of-distribution inputs.
COMs are simple to implement and outperform a number of existing methods on a wide range of MBO problems.
 arXiv  Detail & Related papers  (2021-07-14T17:55:28Z)
- SHINE: SHaring the INverse Estimate from the forward pass for bi-level optimization and implicit models [15.541264326378366]
 In recent years, implicit deep learning has emerged as a method to increase the depth of deep neural networks.
The training is performed as a bi-level problem, and its computational complexity is partially driven by the iterative inversion of a huge Jacobian matrix.
We propose a novel strategy to tackle this computational bottleneck from which many bi-level problems suffer.
 arXiv  Detail & Related papers  (2021-06-01T15:07:34Z)
- Meta Hamiltonian Learning [0.0]
 We use a machine learning technique known as meta-learning to learn a more efficient drifting for this task.
We observe that the meta-optimizer outperforms other optimization methods in average loss over test samples.
 arXiv  Detail & Related papers  (2021-04-09T16:01:34Z)
- MetaNODE: Prototype Optimization as a Neural ODE for Few-Shot Learning [15.03769312691378]
 Few-Shot Learning is a challenging task, i.e., how to recognize novel classes with few examples.
In this paper, we diminish the bias by regarding it as a prototype optimization problem.
We propose a novel prototype optimization-based meta-learning framework, called MetaNODE.
 arXiv  Detail & Related papers  (2021-03-26T09:16:46Z)
- Meta Learning Black-Box Population-Based Optimizers [0.0]
 We propose the use of meta-learning to infer population-based blackbox generalizations.
We show that the meta-loss function encourages a learned algorithm to alter its search behavior so that it can easily fit into a new context.
 arXiv  Detail & Related papers  (2021-03-05T08:13:25Z)
- Meta-Learning with Neural Tangent Kernels [58.06951624702086]
 We propose the first meta-learning paradigm in the Reproducing Kernel Hilbert Space (RKHS) induced by the meta-model's Neural Tangent Kernel (NTK).
Within this paradigm, we introduce two meta-learning algorithms, which no longer need a sub-optimal iterative inner-loop adaptation as in the MAML framework.
We achieve this goal by 1) replacing the adaptation with a fast-adaptive regularizer in the RKHS; and 2) solving the adaptation analytically based on the NTK theory.
 arXiv  Detail & Related papers  (2021-02-07T20:53:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
       
     