Learning a Universal Template for Few-shot Dataset Generalization
- URL: http://arxiv.org/abs/2105.07029v1
- Date: Fri, 14 May 2021 18:46:06 GMT
- Title: Learning a Universal Template for Few-shot Dataset Generalization
- Authors: Eleni Triantafillou, Hugo Larochelle, Richard Zemel and Vincent Dumoulin
- Abstract summary: Few-shot dataset generalization is a challenging variant of the well-studied few-shot classification problem.
We propose to utilize a diverse training set to construct a universal template that can define a wide array of dataset-specialized models.
Our approach is more parameter-efficient, scalable and adaptable compared to previous methods, and achieves the state-of-the-art on the challenging Meta-Dataset benchmark.
- Score: 25.132729497191047
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot dataset generalization is a challenging variant of the well-studied
few-shot classification problem where a diverse training set of several
datasets is given, for the purpose of training an adaptable model that can then
learn classes from new datasets using only a few examples. To this end, we
propose to utilize the diverse training set to construct a universal template:
a partial model that can define a wide array of dataset-specialized models, by
plugging in appropriate components. For each new few-shot classification
problem, our approach therefore only requires inferring a small number of
parameters to insert into the universal template. We design a separate network
that produces an initialization of those parameters for each given task, and we
then fine-tune its proposed initialization via a few steps of gradient descent.
Our approach is more parameter-efficient, scalable and adaptable compared to
previous methods, and achieves the state-of-the-art on the challenging
Meta-Dataset benchmark.
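The pipeline described in the abstract (a frozen shared template, a small set of task-specific parameters, an initialization proposed by a separate network, then a few gradient steps) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the identity feature map `TEMPLATE_W`, the per-feature scale and shift (`gamma`, `beta`) standing in for the plug-in components, the hypothetical `BANK` of per-dataset parameters, and the hypothetical `propose_init` function, which stands in for the initialization network with a simple similarity-weighted blend.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Frozen "template": a fixed feature map shared by every task (toy identity weights).
TEMPLATE_W = [[1.0, 0.0], [0.0, 1.0]]

def template_features(x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in TEMPLATE_W]

def predict(x, gamma, beta):
    # Task-specific scale and shift modulate the frozen features; the head is a fixed sum.
    h = template_features(x)
    return sigmoid(sum(g * hj + b for g, hj, b in zip(gamma, h, beta)))

def loss(support, gamma, beta):
    # Mean binary cross-entropy over the support set.
    total = 0.0
    for x, y in support:
        p = predict(x, gamma, beta)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(support)

# Hypothetical bank of parameters, one entry per training dataset (toy values).
BANK = [
    {"proto": [1.0, 0.0], "gamma": [1.0, -1.0], "beta": [0.0, 0.0]},
    {"proto": [0.0, 1.0], "gamma": [-1.0, 1.0], "beta": [0.0, 0.0]},
]

def propose_init(support):
    # Stand-in for the initialization network: blend the bank entries with
    # softmax weights based on distance from the support mean to each prototype.
    n = len(support)
    mean = [sum(x[i] for x, _ in support) / n for i in range(2)]
    logits = [-sum((m - p) ** 2 for m, p in zip(mean, d["proto"])) for d in BANK]
    z = sum(math.exp(l) for l in logits)
    weights = [math.exp(l) / z for l in logits]
    gamma = [sum(w * d["gamma"][j] for w, d in zip(weights, BANK)) for j in range(2)]
    beta = [sum(w * d["beta"][j] for w, d in zip(weights, BANK)) for j in range(2)]
    return gamma, beta, weights

def fine_tune(support, gamma, beta, steps=5, lr=0.5):
    # A few steps of plain gradient descent on gamma and beta only;
    # TEMPLATE_W is never updated.
    gamma, beta = list(gamma), list(beta)
    n = len(support)
    for _ in range(steps):
        g_gamma, g_beta = [0.0, 0.0], [0.0, 0.0]
        for x, y in support:
            err = predict(x, gamma, beta) - y  # d(loss)/d(logit) for cross-entropy
            h = template_features(x)
            for j in range(2):
                g_gamma[j] += err * h[j] / n
                g_beta[j] += err / n
        gamma = [g - lr * d for g, d in zip(gamma, g_gamma)]
        beta = [b - lr * d for b, d in zip(beta, g_beta)]
    return gamma, beta

# Toy two-class support set: (input, label) pairs.
support = [([1.0, 0.2], 1), ([0.9, 0.1], 1), ([0.1, 1.0], 0), ([0.2, 0.8], 0)]

gamma0, beta0, weights = propose_init(support)
gamma, beta = fine_tune(support, gamma0, beta0)
print("loss before:", loss(support, gamma0, beta0))
print("loss after: ", loss(support, gamma, beta))
```

Only four numbers (`gamma` and `beta`) are inferred per task in this sketch, which mirrors the parameter-efficiency argument in the abstract. A real system would use a deep network as the template and a learned network for the initialization.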
Related papers
- Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) to tackle this data heterogeneity.
arXiv Detail & Related papers (2023-12-22T02:12:08Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- EASY: Ensemble Augmented-Shot Y-shaped Learning: State-Of-The-Art Few-Shot Classification with Simple Ingredients [2.0935101589828244]
Few-shot learning aims at leveraging knowledge learned by one or more deep learning models, in order to obtain good classification performance on new problems.
We propose a simple methodology that reaches or even beats state-of-the-art performance on multiple standardized benchmarks of the field.
arXiv Detail & Related papers (2022-01-24T14:08:23Z)
- Hierarchical Few-Shot Generative Models [18.216729811514718]
We study a latent variables approach that extends the Neural Statistician to a fully hierarchical approach with an attention-based point to set-level aggregation.
Our results show that the hierarchical formulation better captures the intrinsic variability within the sets in the small data regime.
arXiv Detail & Related papers (2021-10-23T19:19:39Z)
- Single-dataset Experts for Multi-dataset Question Answering [6.092171111087768]
We train a network on multiple datasets to generalize and transfer better to new datasets.
Our approach is to model multi-dataset question answering with a collection of single-dataset experts.
Simple methods based on parameter-averaging lead to better zero-shot generalization and few-shot transfer performance.
arXiv Detail & Related papers (2021-09-28T17:08:22Z)
- Data Summarization via Bilevel Optimization [48.89977988203108]
A simple yet powerful approach is to operate on small subsets of data.
In this work, we propose a generic coreset framework that formulates the coreset selection as a cardinality-constrained bilevel optimization problem.
arXiv Detail & Related papers (2021-09-26T09:08:38Z)
- One-shot Learning with Absolute Generalization [23.77607345586489]
We propose a set of definitions to explain what kind of datasets can support one-shot learning.
Based on these definitions, we propose a method to build an absolutely generalizable classifier.
Experiments demonstrate that the proposed method is superior to the baseline on one-shot learning datasets and artificial datasets.
arXiv Detail & Related papers (2021-05-28T02:52:52Z)
- Fuzzy Simplicial Networks: A Topology-Inspired Model to Improve Task Generalization in Few-shot Learning [1.0062040918634414]
Few-shot learning algorithms are designed to generalize well to new tasks with limited data.
We introduce a new few-shot model called Fuzzy Simplicial Networks (FSN) which leverages a construction from topology to more flexibly represent each class from limited data.
arXiv Detail & Related papers (2020-09-23T17:01:09Z)
- Pre-Trained Models for Heterogeneous Information Networks [57.78194356302626]
We propose a self-supervised pre-training and fine-tuning framework, PF-HIN, to capture the features of a heterogeneous information network.
PF-HIN consistently and significantly outperforms state-of-the-art alternatives on each of these tasks, on four datasets.
arXiv Detail & Related papers (2020-07-07T03:36:28Z)
- Feature Transformation Ensemble Model with Batch Spectral Regularization for Cross-Domain Few-Shot Classification [66.91839845347604]
We propose an ensemble prediction model by performing diverse feature transformations after a feature extraction network.
We use a batch spectral regularization term to suppress the singular values of the feature matrix during pre-training to improve the generalization ability of the model.
The proposed model can then be fine-tuned in the target domain to address few-shot classification.
arXiv Detail & Related papers (2020-05-18T05:31:04Z)
- Selecting Relevant Features from a Multi-domain Representation for Few-shot Classification [91.67977602992657]
We propose a new strategy based on feature selection, which is both simpler and more effective than previous feature adaptation approaches.
We show that a simple non-parametric classifier built on top of such features produces high accuracy and generalizes to domains never seen during training.
arXiv Detail & Related papers (2020-03-20T15:44:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.