Meta-ticket: Finding optimal subnetworks for few-shot learning within
randomly initialized neural networks
- URL: http://arxiv.org/abs/2205.15619v1
- Date: Tue, 31 May 2022 09:03:57 GMT
- Title: Meta-ticket: Finding optimal subnetworks for few-shot learning within
randomly initialized neural networks
- Authors: Daiki Chijiwa, Shin'ya Yamaguchi, Atsutoshi Kumagai, Yasutoshi Ida
- Abstract summary: Few-shot learning for neural networks (NNs) is an important problem that aims to train NNs with only a few data points.
We propose a novel meta-learning approach, called Meta-ticket, to find optimal sparse subnetworks for few-shot learning within randomly initialized NNs.
- Score: 16.146036509065247
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot learning for neural networks (NNs) is an important problem that aims
to train NNs with only a few data points. The main challenge is avoiding
overfitting, since over-parameterized NNs can easily overfit to such small
datasets. Previous work (e.g. MAML by Finn et al. 2017) tackles this challenge
with meta-learning, which learns how to learn from a few data points by
training across various tasks. On the other hand, a conventional approach to
avoiding overfitting is to restrict the hypothesis space by imposing sparse NN
structures, such as convolution layers in computer vision. However, although
such manually designed sparse structures are sample-efficient for sufficiently
large datasets, they are still insufficient for few-shot learning. Two
questions then naturally arise: (1) Can we find sparse structures effective for
few-shot learning by meta-learning? (2) What benefits would they bring in terms
of meta-generalization? In this work, we propose a novel meta-learning
approach, called Meta-ticket, to find optimal sparse subnetworks for few-shot
learning within randomly initialized NNs. We empirically validated that
Meta-ticket successfully discovers sparse subnetworks that can learn features
specialized to each given task. Due to this task-wise adaptation ability,
Meta-ticket achieves superior meta-generalization compared with MAML-based
methods, especially with large NNs.
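The core mechanism the abstract describes, keeping the weights frozen at their random initialization and meta-learning only which of them to keep, can be sketched compactly. Below is a minimal, illustrative PyTorch sketch assuming an edge-popup-style parameterization (a trainable score per weight with a straight-through top-k mask) and a Reptile-style first-order meta-update; the paper's actual algorithm, sparsity choices, and architectures may differ.

```python
# Illustrative sketch only: per-weight scores are meta-learned while the
# weights stay frozen at their random initialization. The straight-through
# top-k mask and the Reptile-style outer update are assumptions for this
# sketch, not the paper's exact algorithm.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMask(torch.autograd.Function):
    @staticmethod
    def forward(ctx, scores, sparsity):
        k = int((1.0 - sparsity) * scores.numel())   # number of weights to keep
        mask = torch.zeros_like(scores)
        mask.view(-1)[scores.flatten().topk(k).indices] = 1.0
        return mask

    @staticmethod
    def backward(ctx, grad_out):
        return grad_out, None                        # straight-through estimator

class MaskedLinear(nn.Module):
    def __init__(self, in_f, out_f, sparsity=0.5):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_f, in_f) / in_f ** 0.5,
                                   requires_grad=False)   # frozen random init
        self.scores = nn.Parameter(torch.randn(out_f, in_f) * 0.01)
        self.sparsity = sparsity

    def forward(self, x):
        mask = TopKMask.apply(self.scores, self.sparsity)
        return F.linear(x, self.weight * mask)            # subnetwork forward

def sample_task():
    # Placeholder episode generator; a real setup samples N-way K-shot
    # episodes (with support and query splits) from a meta-training dataset.
    w = torch.randn(16)
    x = torch.randn(10, 16)
    y = (x @ w > 0).long()
    return x, y

net = nn.Sequential(MaskedLinear(16, 64), nn.ReLU(), MaskedLinear(64, 2))
meta_lr, inner_lr = 0.1, 0.01
for step in range(1000):
    xs, ys = sample_task()
    learner = copy.deepcopy(net)                          # task-specific copy
    opt = torch.optim.SGD((p for p in learner.parameters() if p.requires_grad),
                          lr=inner_lr)
    for _ in range(5):                                    # adapt scores on support
        opt.zero_grad()
        F.cross_entropy(learner(xs), ys).backward()
        opt.step()
    for p, q in zip(net.parameters(), learner.parameters()):
        if p.requires_grad:                               # Reptile-style meta-step
            p.data += meta_lr * (q.data - p.data)
```

Only the scores receive gradients here; a MAML-style variant would instead differentiate a query-set loss through the inner adaptation.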
Related papers
- CPT: Competence-progressive Training Strategy for Few-shot Node Classification [11.17199104891692]
Graph Neural Networks (GNNs) have made significant advancements in node classification, but their success relies on sufficient labeled nodes per class in the training data.
Traditional episodic meta-learning approaches have shown promise in this domain, but they face an inherent limitation.
We introduce CPT, a novel two-stage curriculum learning method that aligns task difficulty with the meta-learner's progressive competence.
arXiv Detail & Related papers (2024-02-01T09:36:56Z)
- An Initialization Schema for Neuronal Networks on Tabular Data [0.9155684383461983]
We show that a binomial neural network can be used effectively on tabular data.
The proposed schema is a simple but effective way to initialize the first hidden layer in neural networks.
We evaluate our approach on multiple public datasets and showcase the improved performance compared to other neural network-based approaches.
arXiv Detail & Related papers (2023-11-07T13:52:35Z)
- Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks [69.38572074372392]
We present the first results proving that feature learning occurs during training with a nonlinear model on multiple tasks.
Our key insight is that multi-task pretraining induces a pseudo-contrastive loss that favors representations that align points that typically have the same label across tasks.
arXiv Detail & Related papers (2023-07-13T16:39:08Z)
- What Can We Learn from Unlearnable Datasets? [107.12337511216228]
Unlearnable datasets have the potential to protect data privacy by preventing deep neural networks from generalizing.
It is widely believed that neural networks trained on unlearnable datasets only learn shortcuts, simpler rules that are not useful for generalization.
In contrast, we find that networks can in fact learn useful features that can be reweighted for high test performance, suggesting that image protection is not assured.
arXiv Detail & Related papers (2023-05-30T17:41:35Z)
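The reweighting result above amounts to a linear probe on frozen features. A minimal sketch, where `encoder` and the data loaders are placeholders rather than the paper's code:

```python
# Minimal linear-probe sketch: freeze a network trained on an unlearnable
# dataset, extract its penultimate-layer features, and fit a linear
# classifier on clean data. High probe accuracy would mean the "protected"
# features are still useful. `encoder` and the loaders are placeholders.
import torch
from sklearn.linear_model import LogisticRegression

@torch.no_grad()
def extract(encoder, loader):
    feats, labels = [], []
    for x, y in loader:
        feats.append(encoder(x))        # assumes encoder returns (N, d) features
        labels.append(y)
    return torch.cat(feats).numpy(), torch.cat(labels).numpy()

def linear_probe(encoder, clean_loader, test_loader):
    Xtr, ytr = extract(encoder, clean_loader)
    Xte, yte = extract(encoder, test_loader)
    probe = LogisticRegression(max_iter=1000).fit(Xtr, ytr)  # reweights features
    return probe.score(Xte, yte)
```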
- Learning to Learn with Indispensable Connections [6.040904021861969]
We propose a novel meta-learning method, called Meta-LTH, that includes indispensable (necessary) connections.
Our method improves classification accuracy by approximately 2% (20-way 1-shot task setting) on the Omniglot dataset.
arXiv Detail & Related papers (2023-04-06T04:53:13Z)
- You Can Have Better Graph Neural Networks by Not Training Weights at All: Finding Untrained GNNs Tickets [105.24703398193843]
Untrained subnetworks in graph neural networks (GNNs) still remain mysterious.
We show that the untrained subnetworks found can substantially mitigate the GNN over-smoothing problem.
We also observe that such sparse untrained subnetworks have appealing performance in out-of-distribution detection and robustness to input perturbations.
arXiv Detail & Related papers (2022-11-28T14:17:36Z)
- Rethinking Nearest Neighbors for Visual Classification [56.00783095670361]
k-NN is a lazy learning method that aggregates the distances between a test image and its top-k nearest neighbors in a training set.
We adopt k-NN with pre-trained visual representations produced by either supervised or self-supervised methods in two steps.
Via extensive experiments on a wide range of classification tasks, our study reveals the generality and flexibility of k-NN integration.
arXiv Detail & Related papers (2021-12-15T20:15:01Z)
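The two-step recipe above (pre-trained features, then k-NN) is easy to reproduce in outline. A sketch with an illustrative backbone and k; the paper evaluates many encoders and tasks:

```python
# Sketch of k-NN classification on pre-trained visual features. The
# backbone choice, feature dimension, and k below are illustrative.
import torch
import torchvision.models as models
from sklearn.neighbors import KNeighborsClassifier

encoder = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
encoder.fc = torch.nn.Identity()        # expose the 2048-d pooled features
encoder.eval()

@torch.no_grad()
def features(x):                        # x: (N, 3, 224, 224), ImageNet-normalized
    return encoder(x).numpy()

# Placeholders: train_x, train_y, test_x come from a real dataset.
# knn = KNeighborsClassifier(n_neighbors=20).fit(features(train_x), train_y)
# predictions = knn.predict(features(test_x))
```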
- Generating meta-learning tasks to evolve parametric loss for classification learning [1.1355370218310157]
In existing meta-learning approaches, learning tasks for training meta-models are usually collected from public datasets.
We propose a meta-learning approach that uses randomly generated meta-learning tasks to obtain a parametric loss for classification learning on big data.
arXiv Detail & Related papers (2021-11-20T13:07:55Z)
- It's Hard for Neural Networks To Learn the Game of Life [4.061135251278187]
Recent findings suggest that neural networks rely on lucky random initial weights of "lottery tickets" that converge quickly to a solution.
We examine small convolutional networks that are trained to predict n steps of the two-dimensional cellular automaton Conway's Game of Life.
We find that networks of this architecture trained on this task rarely converge.
arXiv Detail & Related papers (2020-09-03T00:47:08Z)
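For reference, the target function in the paper above is tiny: one Game of Life step is a fixed local rule, so a small CNN can represent it exactly in principle. A sketch of the rule and of data generation (the paper's training setup is not reproduced here):

```python
# One step of Conway's Game of Life, the target function from the paper
# above. Networks are trained to reproduce n applications of this rule;
# the paper reports such training rarely converges from random inits.
import numpy as np

def life_step(grid):
    """Apply one Game of Life update to a 2D binary array (toroidal edges)."""
    neighbors = sum(np.roll(np.roll(grid, i, 0), j, 1)
                    for i in (-1, 0, 1) for j in (-1, 0, 1)
                    if (i, j) != (0, 0))
    # A cell is alive next step iff it has 3 live neighbors,
    # or 2 live neighbors and is currently alive.
    return ((neighbors == 3) | ((neighbors == 2) & (grid == 1))).astype(grid.dtype)

rng = np.random.default_rng(0)
boards = rng.integers(0, 2, size=(1024, 32, 32))      # random training boards
targets = np.stack([life_step(b) for b in boards])    # one-step labels
```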
- Meta Cyclical Annealing Schedule: A Simple Approach to Avoiding Meta-Amortization Error [50.83356836818667]
We develop a novel meta-regularization objective using a cyclical annealing schedule and the maximum mean discrepancy (MMD) criterion.
The experimental results show that our approach substantially outperforms standard meta-learning algorithms.
arXiv Detail & Related papers (2020-03-04T04:43:16Z)
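A generic cyclical annealing schedule of the kind named above ramps a regularization weight from 0 to 1 within each cycle; how the paper couples this with the MMD criterion is not shown here. A minimal sketch:

```python
# Generic cyclical annealing: the weight beta ramps linearly from 0 to 1
# over the first half of each cycle, then plateaus at 1. The cycle count
# and ramp fraction are illustrative defaults, not the paper's settings.
def cyclical_beta(step, total_steps, n_cycles=4, ramp_fraction=0.5):
    period = total_steps / n_cycles
    phase = (step % period) / period        # position within the current cycle
    return min(1.0, phase / ramp_fraction)  # linear ramp, then plateau at 1

betas = [cyclical_beta(t, 1000) for t in range(1000)]
```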
- Side-Tuning: A Baseline for Network Adaptation via Additive Side Networks [95.51368472949308]
Adaptation can be useful in cases when training data is scarce, or when one wishes to encode priors in the network.
In this paper, we propose side-tuning, a straightforward alternative to fine-tuning.
arXiv Detail & Related papers (2019-12-31T18:52:32Z)
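Side-tuning itself is compact enough to sketch: a frozen base network is blended additively with a small trainable side network. The learned sigmoid blend below is a simplification of the paper's treatment of the blending weight:

```python
# Minimal side-tuning sketch: freeze the base network and train only a
# small side network plus a blending weight. Base and side must produce
# same-shaped outputs; the blending scheme here is a simplification.
import torch
import torch.nn as nn

class SideTuned(nn.Module):
    def __init__(self, base: nn.Module, side: nn.Module):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                  # base stays frozen
        self.side = side                             # small trainable network
        self.alpha = nn.Parameter(torch.tensor(0.0)) # learned blend logit

    def forward(self, x):
        a = torch.sigmoid(self.alpha)                # keep the blend in (0, 1)
        return a * self.base(x) + (1 - a) * self.side(x)

# model = SideTuned(base=pretrained_encoder, side=SmallNet())  # placeholders
# Only `side` and `alpha` receive gradients during adaptation.
```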