Neural Fine-Tuning Search for Few-Shot Learning
- URL: http://arxiv.org/abs/2306.09295v1
- Date: Thu, 15 Jun 2023 17:20:35 GMT
- Title: Neural Fine-Tuning Search for Few-Shot Learning
- Authors: Panagiotis Eustratiadis, Łukasz Dudziak, Da Li, Timothy Hospedales
- Abstract summary: In few-shot recognition, a classifier is required to rapidly adapt and generalize to a disjoint, novel set of classes.
Recent studies have shown the efficacy of fine-tuning with carefully crafted adaptation architectures.
We study the question of how to design the optimal adaptation strategy through the lens of neural architecture search (NAS).
- Score: 10.194808064624771
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In few-shot recognition, a classifier that has been trained on one set of
classes is required to rapidly adapt and generalize to a disjoint, novel set of
classes. To that end, recent studies have shown the efficacy of fine-tuning
with carefully crafted adaptation architectures. However, this raises the
question: how does one design the optimal adaptation strategy? In this paper,
we study this question through the lens of neural architecture search (NAS).
Given a pre-trained neural network, our algorithm discovers the optimal
arrangement of adapters, which layers to keep frozen and which to fine-tune. We
demonstrate the generality of our NAS method by applying it to both residual
networks and vision transformers and report state-of-the-art performance on
Meta-Dataset and Meta-Album.
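As a rough illustration of the kind of per-layer decision space the abstract describes, the sketch below enumerates {freeze, fine-tune, adapter} choices over a toy backbone and scores candidates with plain random search. The bottleneck adapter, the toy stack of linear blocks, and the `evaluate` scoring hook are illustrative assumptions, not the paper's method.

```python
# Minimal sketch (not the paper's algorithm): per-layer {freeze, fine-tune, adapter}
# decisions over a pre-trained backbone, explored with plain random search.
import random
import torch.nn as nn

class Adapter(nn.Module):
    """Residual bottleneck adapter attached after a frozen block (assumed form)."""
    def __init__(self, dim, hidden=16):
        super().__init__()
        self.down, self.up = nn.Linear(dim, hidden), nn.Linear(hidden, dim)

    def forward(self, x):
        return x + self.up(self.down(x).relu())

def apply_config(backbone, config):
    """Freeze, fine-tune, or freeze-and-adapt each block according to `config`.
    Assumes the backbone is a stack of nn.Linear blocks; flags are set in place."""
    blocks = []
    for block, choice in zip(backbone, config):
        for p in block.parameters():
            p.requires_grad = (choice == "finetune")
        if choice == "adapter":                       # frozen block + trainable adapter
            block = nn.Sequential(block, Adapter(block.out_features))
        blocks.append(block)
    return nn.Sequential(*blocks)

def random_search(backbone, evaluate, n_trials=50):
    """Toy NAS loop: sample per-layer decisions and keep the best-scoring config."""
    choices, best_config, best_score = ("freeze", "finetune", "adapter"), None, float("-inf")
    for _ in range(n_trials):
        config = [random.choice(choices) for _ in range(len(backbone))]
        score = evaluate(apply_config(backbone, config))  # e.g. few-shot validation accuracy
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

# Example usage (toy backbone; `evaluate` is a user-supplied scorer):
# backbone = nn.Sequential(nn.Linear(64, 64), nn.Linear(64, 64), nn.Linear(64, 64))
# config, score = random_search(backbone, evaluate=lambda m: 0.0)
```

The paper itself applies its NAS procedure to residual networks and vision transformers; the sketch only conveys the shape of the search space.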
Related papers
- Neural Metamorphosis [72.88137795439407]
This paper introduces a new learning paradigm termed Neural Metamorphosis (NeuMeta), which aims to build self-morphable neural networks.
NeuMeta directly learns the continuous weight manifold of neural networks.
It sustains full-size performance even at a 75% compression rate.
arXiv Detail & Related papers (2024-10-10T14:49:58Z)
- Improved Convergence Guarantees for Shallow Neural Networks [91.3755431537592]
We prove convergence of depth 2 neural networks, trained via gradient descent, to a global minimum.
Our model has the following features: regression with a quadratic loss function, a fully connected feedforward architecture, ReLU activations, Gaussian data instances, and adversarial labels.
Our results strongly suggest that, at least in our model, the convergence phenomenon extends well beyond the NTK regime.
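For concreteness, a standard way to write the setting described above (assumed notation, not the paper's own) is a depth-2 fully connected ReLU network trained on a quadratic loss by gradient descent:

```latex
% Depth-2 fully connected ReLU network (width m), quadratic loss, gradient descent.
f_\theta(x) = \sum_{j=1}^{m} a_j \,\sigma\!\bigl(w_j^\top x\bigr),
\qquad \sigma(t) = \max(t, 0),
\qquad
L(\theta) = \tfrac{1}{2} \sum_{i=1}^{n} \bigl(f_\theta(x_i) - y_i\bigr)^2,
\qquad
\theta_{t+1} = \theta_t - \eta\, \nabla L(\theta_t).
```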
arXiv Detail & Related papers (2022-12-05T14:47:52Z)
- Adaptive Convolutional Dictionary Network for CT Metal Artifact Reduction [62.691996239590125]
We propose an adaptive convolutional dictionary network (ACDNet) for metal artifact reduction.
Our ACDNet can automatically learn the prior for artifact-free CT images via training data and adaptively adjust the representation kernels for each input CT image.
Our method inherits the clear interpretability of model-based methods and maintains the powerful representation ability of learning-based methods.
arXiv Detail & Related papers (2022-05-16T06:49:36Z)
- Do We Really Need a Learnable Classifier at the End of Deep Neural Network? [118.18554882199676]
We study the potential of learning a neural network for classification with the classifier randomly initialized as an equiangular tight frame (ETF) and fixed during training.
Our experimental results show that our method achieves performance similar to that of a learnable classifier on image classification with balanced datasets.
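As a rough illustration of such a fixed, non-learnable head (the construction below is a common simplex-ETF recipe and an assumption, not taken from the abstract):

```python
# Rough illustration (assumed construction): a simplex equiangular tight frame (ETF)
# used as a fixed classifier head; only the feature extractor underneath is trained.
import torch
import torch.nn as nn

def simplex_etf(num_classes: int, feat_dim: int) -> torch.Tensor:
    """Return a (num_classes, feat_dim) weight matrix whose rows form a simplex ETF."""
    assert feat_dim >= num_classes
    # Random matrix with orthonormal columns, shape (feat_dim, num_classes).
    u, _ = torch.linalg.qr(torch.randn(feat_dim, num_classes))
    center = torch.eye(num_classes) - torch.ones(num_classes, num_classes) / num_classes
    scale = (num_classes / (num_classes - 1)) ** 0.5
    return (scale * u @ center).T  # rows are unit-norm per-class prototypes

class FixedETFClassifier(nn.Module):
    """Linear head whose weights are frozen to a simplex ETF."""
    def __init__(self, feat_dim: int, num_classes: int):
        super().__init__()
        self.weight = nn.Parameter(simplex_etf(num_classes, feat_dim), requires_grad=False)

    def forward(self, features):           # features: (batch, feat_dim)
        return features @ self.weight.T    # logits: (batch, num_classes)
```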
arXiv Detail & Related papers (2022-03-17T04:34:28Z)
- Increasing Depth of Neural Networks for Life-long Learning [2.0305676256390934]
We propose a novel method for continual learning based on the increasing depth of neural networks.
This work explores whether extending neural network depth may be beneficial in a life-long learning setting.
arXiv Detail & Related papers (2022-02-22T11:21:41Z)
- Towards Disentangling Information Paths with Coded ResNeXt [11.884259630414515]
We take a novel approach to enhance the transparency of the function of the whole network.
We propose a neural network architecture for classification, in which the information that is relevant to each class flows through specific paths.
arXiv Detail & Related papers (2022-02-10T21:45:49Z)
- Implementing a foveal-pit inspired filter in a Spiking Convolutional Neural Network: a preliminary study [0.0]
We have presented a Spiking Convolutional Neural Network (SCNN) that incorporates retinal foveal-pit inspired Difference of Gaussian filters and rank-order encoding.
The model is trained using a variant of the backpropagation algorithm adapted to work with spiking neurons, as implemented in the Nengo library.
The network has achieved up to 90% accuracy, where loss is calculated using the cross-entropy function.
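A minimal sketch of the Difference-of-Gaussians (center-surround) filter underlying the foveal-pit model is shown below; the kernel size and sigmas are assumptions, and the paper's Nengo/spiking and rank-order-encoding machinery is not reproduced.

```python
# Illustrative sketch only: a Difference-of-Gaussians (DoG) kernel of the kind used
# to model retinal foveal-pit (center-surround) receptive fields.
import numpy as np

def gaussian_kernel(size: int, sigma: float) -> np.ndarray:
    """2-D Gaussian kernel normalised to sum to 1."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return k / k.sum()

def dog_kernel(size: int = 7, sigma_center: float = 0.8, sigma_surround: float = 1.6) -> np.ndarray:
    """Center-surround filter: narrow 'center' Gaussian minus wider 'surround' Gaussian."""
    return gaussian_kernel(size, sigma_center) - gaussian_kernel(size, sigma_surround)

def filter_image(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Valid-mode 2-D correlation of a grayscale image with the kernel."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out
```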
arXiv Detail & Related papers (2021-05-29T15:28:30Z)
- Auto-tuning of Deep Neural Networks by Conflicting Layer Removal [0.0]
We introduce a novel methodology to identify layers that decrease the test accuracy of trained models.
Conflicting layers are detected as early as the beginning of training.
We show that around 60% of the layers of trained residual networks can be completely removed from the architecture.
arXiv Detail & Related papers (2021-03-07T11:51:55Z)
- Firefly Neural Architecture Descent: a General Approach for Growing Neural Networks [50.684661759340145]
Firefly neural architecture descent is a general framework for progressively and dynamically growing neural networks.
We show that firefly descent can flexibly grow networks both wider and deeper, and can be applied to learn accurate but resource-efficient neural architectures.
In particular, it learns networks that are smaller in size but have higher average accuracy than those learned by the state-of-the-art methods.
arXiv Detail & Related papers (2021-02-17T04:47:18Z)
- Partial Is Better Than All: Revisiting Fine-tuning Strategy for Few-shot Learning [76.98364915566292]
A common practice is to train a model on the base set first and then transfer to novel classes through fine-tuning.
We propose to transfer partial knowledge by freezing or fine-tuning particular layer(s) in the base model.
We conduct extensive experiments on CUB and mini-ImageNet to demonstrate the effectiveness of our proposed method.
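A hedged sketch of this freeze-or-fine-tune idea in PyTorch follows; the torchvision ResNet-18 backbone and the particular split (freeze everything except the last residual stage and a new head) are illustrative assumptions, not the paper's prescribed recipe.

```python
# Hedged sketch of transferring partial knowledge by freezing some layers of a
# base model and fine-tuning the rest on novel classes.
import torch.nn as nn
from torchvision.models import resnet18

def partial_finetune_model(num_novel_classes: int) -> nn.Module:
    model = resnet18(weights="IMAGENET1K_V1")        # pre-trained backbone (assumed choice)
    for name, param in model.named_parameters():
        # Freeze all layers except the last residual stage ("layer4").
        param.requires_grad = name.startswith("layer4")
    # Replace the classifier head for the novel classes; it is trainable by default.
    model.fc = nn.Linear(model.fc.in_features, num_novel_classes)
    return model

# Only parameters with requires_grad=True are passed to the optimizer, e.g.
# optim.SGD([p for p in model.parameters() if p.requires_grad], lr=1e-2)
```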
arXiv Detail & Related papers (2021-02-08T03:27:05Z)
- Conflicting Bundles: Adapting Architectures Towards the Improved Training of Deep Neural Networks [1.7188280334580195]
We introduce a novel theory and metric to identify layers that decrease the test accuracy of the trained models.
We identify those layers that worsen the performance because they produce conflicting training bundles.
Based on these findings, a novel algorithm is introduced to remove performance-decreasing layers automatically.
arXiv Detail & Related papers (2020-11-05T16:41:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.