MetaDiff: Meta-Learning with Conditional Diffusion for Few-Shot Learning
- URL: http://arxiv.org/abs/2307.16424v2
- Date: Mon, 8 Jan 2024 10:47:20 GMT
- Title: MetaDiff: Meta-Learning with Conditional Diffusion for Few-Shot Learning
- Authors: Baoquan Zhang, Chuyao Luo, Demin Yu, Huiwei Lin, Xutao Li, Yunming Ye,
Bowen Zhang
- Abstract summary: Gradient-based meta-learning approaches effectively address the challenge by learning how to learn novel tasks.
We present a novel task-conditional diffusion-based meta-learning method, called MetaDiff, that effectively models the optimization process of model weights.
Experimental results show that our MetaDiff outperforms the state-of-the-art gradient-based meta-learning family in few-shot learning tasks.
- Score: 19.57633448737394
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Equipping a deep model with the ability of few-shot learning, i.e., learning
quickly from only a few examples, is a core challenge for artificial
intelligence. Gradient-based meta-learning approaches effectively address the
challenge by learning how to learn novel tasks. Their key idea is to learn a deep
model in a bi-level optimization manner, where the outer-loop process learns a
shared gradient descent algorithm (i.e., its hyperparameters), while the
inner-loop process leverages it to optimize a task-specific model using only a
few labeled examples. Although these existing methods have shown superior
performance, the outer-loop process requires calculating second-order
derivatives along the inner optimization path, which imposes considerable
memory burdens and the risk of vanishing gradients. Drawing inspiration from
recent progress in diffusion models, we find that the inner-loop gradient
descent process can actually be viewed as the reverse (i.e., denoising) process
of diffusion, where the target of denoising is the model weights rather than the
original data. Based on this observation, in this paper, we propose to model the gradient
descent optimizer as a diffusion model and then present a novel
task-conditional diffusion-based meta-learning method, called MetaDiff, that
effectively models the optimization process of model weights from Gaussian
noise to target weights in a denoising manner. Thanks to the training
efficiency of diffusion models, our MetaDiff does not need to differentiate
through the inner-loop path, so the memory burden and the risk of
vanishing gradients are effectively alleviated. Experimental results show that
our MetaDiff outperforms the state-of-the-art gradient-based meta-learning
family in few-shot learning tasks.
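For context, the bi-level scheme the abstract contrasts against can be sketched in a few lines. The snippet below is a minimal MAML-style illustration in PyTorch, assuming a single linear task head; the task tuple layout and the task_loss helper are hypothetical and not taken from the paper.

```python
import torch
import torch.nn.functional as F

def task_loss(weights, x, y):
    # Hypothetical task model: one linear classifier applied functionally,
    # so the loss stays differentiable w.r.t. the (fast) weight tensors.
    w, b = weights
    return F.cross_entropy(x @ w + b, y)

def maml_style_outer_step(meta_params, meta_opt, tasks, inner_lr=0.01, inner_steps=5):
    # One outer-loop update in a MAML-style bi-level scheme (illustrative only).
    # The inner loop adapts a copy of the shared weights on each task's support
    # set; the outer loop then backpropagates the query loss through the whole
    # inner optimization path, which needs second-order derivatives and keeps
    # every intermediate weight tensor alive. This is the memory and
    # vanishing-gradient cost the abstract attributes to existing methods.
    meta_opt.zero_grad()
    for support_x, support_y, query_x, query_y in tasks:
        fast_weights = [p.clone() for p in meta_params]
        for _ in range(inner_steps):
            loss = task_loss(fast_weights, support_x, support_y)
            grads = torch.autograd.grad(loss, fast_weights, create_graph=True)
            fast_weights = [w - inner_lr * g for w, g in zip(fast_weights, grads)]
        # The query loss is differentiated back to meta_params through the inner loop.
        task_loss(fast_weights, query_x, query_y).backward()
    meta_opt.step()
```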
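The abstract casts the gradient-descent optimizer itself as a task-conditional diffusion model that denoises Gaussian noise into target weights. Since the paper's architecture and training details are not given in the abstract, the sketch below only illustrates that general idea under standard DDPM conventions; the module names, dimensions, the support-set-embedding conditioning, and the source of the target weights w0 are all illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class WeightDenoiser(nn.Module):
    # Illustrative conditional denoiser: predicts the noise that was added to a
    # flattened vector of task-specific weights, conditioned on a support-set
    # embedding and the diffusion timestep. The real MetaDiff architecture is
    # not described in the abstract, so this module is only an assumption.
    def __init__(self, weight_dim, cond_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(weight_dim + cond_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, weight_dim),
        )

    def forward(self, w_t, cond, t):
        return self.net(torch.cat([w_t, cond, t], dim=-1))

def diffusion_train_step(denoiser, opt, w0, cond, alphas_bar):
    # Standard DDPM-style training: noise the target weights w0, predict the
    # noise, regress. Nothing is backpropagated through a sampling chain, which
    # is the efficiency argument the abstract makes. How the target weights w0
    # are obtained is not specified in the abstract and is left as an input.
    t = torch.randint(0, len(alphas_bar), (w0.shape[0],))
    a_bar = alphas_bar[t].unsqueeze(-1)
    eps = torch.randn_like(w0)
    w_t = a_bar.sqrt() * w0 + (1 - a_bar).sqrt() * eps   # forward (noising) process
    t_in = t.float().unsqueeze(-1) / len(alphas_bar)
    loss = ((denoiser(w_t, cond, t_in) - eps) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

@torch.no_grad()
def sample_task_weights(denoiser, cond, alphas, alphas_bar):
    # Reverse (denoising) process: start from Gaussian noise and iteratively
    # denoise toward task-specific weights, conditioned on the task embedding.
    # alphas / alphas_bar are the usual 1-D DDPM schedule tensors.
    w = torch.randn(cond.shape[0], denoiser.net[-1].out_features)
    for t in reversed(range(len(alphas))):
        t_in = torch.full((cond.shape[0], 1), t / len(alphas))
        eps_hat = denoiser(w, cond, t_in)
        w = (w - (1 - alphas[t]) / (1 - alphas_bar[t]).sqrt() * eps_hat) / alphas[t].sqrt()
        if t > 0:
            w = w + (1 - alphas[t]).sqrt() * torch.randn_like(w)
    return w
```

Under this reading, meta-training fits the denoiser across many tasks, and few-shot adaptation at test time reduces to running sample_task_weights with a new task's support-set embedding, with no backpropagation through an inner-loop path.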
Related papers
- Unified Gradient-Based Machine Unlearning with Remain Geometry Enhancement [29.675650285351768]
Machine unlearning (MU) has emerged to enhance the privacy and trustworthiness of deep neural networks.
Approximate MU is a practical method for large-scale models.
We propose a fast-slow parameter update strategy to implicitly approximate the up-to-date salient unlearning direction.
arXiv Detail & Related papers (2024-09-29T15:17:33Z) - An Effective Dynamic Gradient Calibration Method for Continual Learning [11.555822066922508]
Continual learning (CL) is a fundamental topic in machine learning, where the goal is to train a model with continuously incoming data and tasks.
Due to the memory limit, we cannot store all the historical data, and therefore confront the "catastrophic forgetting" problem.
We develop an effective algorithm to calibrate the gradient in each updating step of the model.
arXiv Detail & Related papers (2024-07-30T16:30:09Z) - ExposureDiffusion: Learning to Expose for Low-light Image Enhancement [87.08496758469835]
This work addresses the issue by seamlessly integrating a diffusion model with a physics-based exposure model.
Our method obtains significantly improved performance and reduced inference time compared with vanilla diffusion models.
The proposed framework works with real-paired datasets, SOTA noise models, and different backbone networks.
arXiv Detail & Related papers (2023-07-15T04:48:35Z) - BOOT: Data-free Distillation of Denoising Diffusion Models with
Bootstrapping [64.54271680071373]
Diffusion models have demonstrated excellent potential for generating diverse images.
Knowledge distillation has been recently proposed as a remedy that can reduce the number of inference steps to one or a few.
We present a novel technique called BOOT that overcomes these limitations with an efficient data-free distillation algorithm.
arXiv Detail & Related papers (2023-06-08T20:30:55Z) - Variational Gaussian Process Diffusion Processes [17.716059928867345]
Diffusion processes are a class of stochastic differential equations (SDEs) providing a rich family of expressive models.
Probabilistic inference and learning under generative models with latent processes endowed with a non-linear diffusion process prior are intractable problems.
We build upon work within variational inference, approximating the posterior process as a linear diffusion process, and point out pathologies in the approach.
arXiv Detail & Related papers (2023-06-03T09:43:59Z) - A Novel Noise Injection-based Training Scheme for Better Model
Robustness [9.749718440407811]
Noise injection-based methods have been shown to improve the robustness of artificial neural networks.
In this work, we propose a novel noise injection-based training scheme for better model robustness.
Experimental results show that our proposed method achieves much better adversarial robustness and slightly better original accuracy.
arXiv Detail & Related papers (2023-02-17T02:50:25Z) - Continuous-Time Meta-Learning with Forward Mode Differentiation [65.26189016950343]
We introduce Continuous-Time Meta-Learning (COMLN), a meta-learning algorithm where adaptation follows the dynamics of a gradient vector field.
Treating the learning process as an ODE offers the notable advantage that the length of the trajectory is now continuous.
We show empirically its efficiency in terms of runtime and memory usage, and we illustrate its effectiveness on a range of few-shot image classification problems.
arXiv Detail & Related papers (2022-03-02T22:35:58Z) - Improving Deep Learning Interpretability by Saliency Guided Training [36.782919916001624]
Saliency methods have been widely used to highlight important input features in model predictions.
Most existing methods use backpropagation on a modified gradient function to generate saliency maps.
We introduce a saliency guided training procedure for neural networks to reduce noisy gradients used in predictions.
arXiv Detail & Related papers (2021-11-29T06:05:23Z) - Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z) - Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks [78.76880041670904]
In neural networks with binary activations and/or binary weights, training by gradient descent is complicated.
We propose a new method for this estimation problem combining sampling and analytic approximation steps.
We experimentally show higher accuracy in gradient estimation and demonstrate a more stable and better performing training in deep convolutional models.
arXiv Detail & Related papers (2020-06-04T21:51:21Z) - Rectified Meta-Learning from Noisy Labels for Robust Image-based Plant
Disease Diagnosis [64.82680813427054]
Plant diseases are one of the main threats to food security and crop production.
One popular approach is to cast this problem as a leaf image classification task, which can be addressed by powerful convolutional neural networks (CNNs).
We propose a novel framework that incorporates a rectified meta-learning module into the common CNN paradigm to train a noise-robust deep network without using extra supervision information.
arXiv Detail & Related papers (2020-03-17T09:51:30Z)