Dataset Meta-Learning from Kernel Ridge-Regression
- URL: http://arxiv.org/abs/2011.00050v3
- Date: Mon, 22 Mar 2021 19:15:46 GMT
- Title: Dataset Meta-Learning from Kernel Ridge-Regression
- Authors: Timothy Nguyen, Zhourong Chen, Jaehoon Lee
- Abstract summary: Kernel Inducing Points (KIP) can compress datasets by one or two orders of magnitude.
KIP-learned datasets are transferable to the training of finite-width neural networks even beyond the lazy-training regime.
- Score: 18.253682891579402
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: One of the most fundamental aspects of any machine learning algorithm is the
training data used by the algorithm. We introduce the novel concept of
$\epsilon$-approximation of datasets, obtaining datasets which are much smaller
than or are significant corruptions of the original training data while
maintaining similar model performance. We introduce a meta-learning algorithm
called Kernel Inducing Points (KIP) for obtaining such remarkable datasets,
inspired by the recent developments in the correspondence between
infinitely-wide neural networks and kernel ridge-regression (KRR). For KRR
tasks, we demonstrate that KIP can compress datasets by one or two orders of
magnitude, significantly improving previous dataset distillation and subset
selection methods while obtaining state of the art results for MNIST and
CIFAR-10 classification. Furthermore, our KIP-learned datasets are transferable
to the training of finite-width neural networks even beyond the lazy-training
regime, which leads to state of the art results for neural network dataset
distillation with potential applications to privacy-preservation.
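As a concrete illustration of the objective, the following is a minimal KIP-style sketch in JAX, not the authors' implementation: a small support set (inputs and labels) is optimized by gradient descent so that kernel ridge-regression fit on that support set reproduces the labels of the original training data. The RBF kernel, toy data, and hyperparameters are stand-in assumptions; the paper uses analytic infinite-width NTK/NNGP kernels (e.g. via the neural_tangents library).
```python
# Minimal KIP-style sketch in JAX: learn a small synthetic support set whose
# kernel ridge-regression predictions match the original training labels.
# An RBF kernel stands in for the infinite-width NTK/NNGP kernels of the paper.
import jax
import jax.numpy as jnp

def rbf_kernel(x1, x2, gamma=0.1):
    # Pairwise RBF kernel matrix between two batches of flattened inputs.
    sq_dists = jnp.sum((x1[:, None, :] - x2[None, :, :]) ** 2, axis=-1)
    return jnp.exp(-gamma * sq_dists)

def kip_loss(support, x_target, y_target, reg=1e-6):
    # Kernel ridge regression fit on the learned support set, evaluated on the
    # original (target) data: || y_t - K_ts (K_ss + reg I)^{-1} y_s ||^2.
    x_s, y_s = support
    k_ss = rbf_kernel(x_s, x_s)
    k_ts = rbf_kernel(x_target, x_s)
    alpha = jnp.linalg.solve(k_ss + reg * jnp.eye(x_s.shape[0]), y_s)
    preds = k_ts @ alpha
    return jnp.mean((y_target - preds) ** 2)

@jax.jit
def kip_step(support, x_target, y_target, lr=0.1):
    # One gradient step on the support set (both inputs and labels are learned).
    loss, grads = jax.value_and_grad(kip_loss)(support, x_target, y_target)
    support = jax.tree_util.tree_map(lambda p, g: p - lr * g, support, grads)
    return support, loss

# Toy usage: distill 1000 random "training" points into 10 support points.
key = jax.random.PRNGKey(0)
k1, k2, k3 = jax.random.split(key, 3)
x_t = jax.random.normal(k1, (1000, 32))
y_t = jnp.sin(x_t.sum(axis=-1, keepdims=True))
support = (jax.random.normal(k2, (10, 32)), jax.random.normal(k3, (10, 1)))
for _ in range(200):
    support, loss = kip_step(support, x_t, y_t)
```
Swapping the RBF kernel for an infinite-width kernel and the toy data for MNIST or CIFAR-10 roughly recovers the setting studied in the paper, where the learned support set serves as an $\epsilon$-approximation of the full dataset.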
Related papers
- Dataset Quantization [72.61936019738076]
We present dataset quantization (DQ), a new framework to compress large-scale datasets into small subsets.
DQ is the first method that can successfully distill large-scale datasets such as ImageNet-1k with a state-of-the-art compression ratio.
arXiv Detail & Related papers (2023-08-21T07:24:29Z)
- Iterative self-transfer learning: A general methodology for response time-history prediction based on small dataset [0.0]
An iterative self-transfer learning method for training neural networks on small datasets is proposed in this study.
The results show that the proposed method can improve model performance by nearly an order of magnitude on small datasets.
arXiv Detail & Related papers (2023-06-14T18:48:04Z)
- Efficient Dataset Distillation Using Random Feature Approximation [109.07737733329019]
We propose a novel algorithm that uses a random feature approximation (RFA) of the Neural Network Gaussian Process (NNGP) kernel.
Our algorithm provides at least a 100-fold speedup over KIP and can run on a single GPU.
Our new method, termed RFA Distillation (RFAD), performs competitively with KIP and other dataset condensation algorithms in accuracy over a range of large-scale datasets.
arXiv Detail & Related papers (2022-10-21T15:56:13Z)
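To unpack what a "random feature approximation of the NNGP kernel" in the RFAD entry above amounts to, here is a small illustrative sketch, not the RFAD algorithm itself (which targets deeper convolutional kernels): for a one-hidden-layer ReLU network, Monte Carlo features built from a single random layer estimate the infinite-width NNGP kernel, so kernel ridge regression reduces to cheap linear regression in the feature space. The scaling convention and feature count below are assumptions.
```python
# Rough sketch of a random-feature (Monte Carlo) approximation of the NNGP
# kernel of a one-hidden-layer ReLU network: phi(x) . phi(x') estimates
# E_w[relu(w.x) relu(w.x')], whose exact limit is an arc-cosine kernel.
import jax
import jax.numpy as jnp

def random_relu_features(x, key, num_features=4096):
    # Sample first-layer weights once and map inputs through ReLU units;
    # the 1/sqrt(num_features) scaling makes the inner product a Monte Carlo
    # estimate of the infinite-width NNGP kernel (up to scaling conventions).
    w = jax.random.normal(key, (x.shape[-1], num_features)) / jnp.sqrt(x.shape[-1])
    return jax.nn.relu(x @ w) / jnp.sqrt(num_features)

def approx_nngp_kernel(x1, x2, key, num_features=4096):
    f1 = random_relu_features(x1, key, num_features)
    f2 = random_relu_features(x2, key, num_features)
    return f1 @ f2.T  # approximates K_NNGP(x1, x2)

# Toy usage: the approximate kernel matrix for a small batch of inputs.
key = jax.random.PRNGKey(0)
x = jax.random.normal(jax.random.PRNGKey(1), (8, 32))
k_approx = approx_nngp_kernel(x, x, key)
```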
- Dataset Distillation using Neural Feature Regression [32.53291298089172]
We develop an algorithm for dataset distillation using neural Feature Regression with Pooling (FRePo).
FRePo achieves state-of-the-art performance with an order of magnitude less memory requirement and two orders of magnitude faster training than previous methods.
We show that high-quality distilled data can greatly improve various downstream applications, such as continual learning and membership inference defense.
arXiv Detail & Related papers (2022-06-01T19:02:06Z)
- Inducing Gaussian Process Networks [80.40892394020797]
We propose inducing Gaussian process networks (IGN), a simple framework for simultaneously learning the feature space as well as the inducing points.
The inducing points, in particular, are learned directly in the feature space, enabling a seamless representation of complex structured domains.
We report on experimental results for real-world data sets showing that IGNs provide significant advances over state-of-the-art methods.
arXiv Detail & Related papers (2022-04-21T05:27:09Z)
- Graph-based Active Learning for Semi-supervised Classification of SAR Data [8.92985438874948]
We present a novel method for classification of Synthetic Aperture Radar (SAR) data by combining ideas from graph-based learning and neural network methods.
The CNNVAE feature embedding and graph construction require no labeled data, which reduces overfitting.
The method easily incorporates a human-in-the-loop for active learning in the data-labeling process.
arXiv Detail & Related papers (2022-03-31T00:14:06Z)
- Random Features for the Neural Tangent Kernel [57.132634274795066]
We propose an efficient feature map construction for the Neural Tangent Kernel (NTK) of a fully-connected ReLU network.
We show that the dimension of the resulting features is much smaller than that of other baseline feature map constructions while achieving comparable error bounds in both theory and practice.
arXiv Detail & Related papers (2021-04-03T09:08:12Z)
- PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z)
- Collaborative Method for Incremental Learning on Classification and Generation [32.07222897378187]
We introduce a novel algorithm, Incremental Class Learning with Attribute Sharing (ICLAS), for incremental class learning with deep neural networks.
One of its components, incGAN, can generate images with greater variety than the training data.
Under the challenging condition of data deficiency, ICLAS incrementally trains the classification and generation networks.
arXiv Detail & Related papers (2020-10-29T06:34:53Z)
- Dataset Condensation with Gradient Matching [36.14340188365505]
We propose a training set synthesis technique for data-efficient learning, called Dataset Condensation, which learns to condense a large dataset into a small set of informative synthetic samples for training deep neural networks from scratch.
We rigorously evaluate its performance in several computer vision benchmarks and demonstrate that it significantly outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2020-06-10T16:30:52Z)
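For context on the gradient-matching objective named in the entry above, here is a minimal sketch under simplifying assumptions: a tiny softmax classifier and a plain L2 distance between gradients stand in for the paper's networks and layer-wise cosine distance, and the parameter re-sampling along training trajectories is omitted. All names and hyperparameters are illustrative.
```python
# Minimal sketch of the gradient-matching idea behind dataset condensation:
# update synthetic samples so that the gradient a network computes on them
# matches the gradient it computes on real data.
import jax
import jax.numpy as jnp

def model_loss(params, x, y):
    # Simple softmax classifier; params = (weights, bias).
    w, b = params
    log_probs = jax.nn.log_softmax(x @ w + b)
    return -jnp.mean(jnp.sum(y * log_probs, axis=-1))

def grad_match_loss(x_syn, y_syn, params, x_real, y_real):
    # L2 distance between gradients on synthetic and real batches.
    g_syn = jax.grad(model_loss)(params, x_syn, y_syn)
    g_real = jax.grad(model_loss)(params, x_real, y_real)
    pairs = zip(jax.tree_util.tree_leaves(g_syn), jax.tree_util.tree_leaves(g_real))
    return sum(jnp.sum((a - b) ** 2) for a, b in pairs)

@jax.jit
def condense_step(x_syn, y_syn, params, x_real, y_real, lr=0.1):
    # Gradient of the matching loss with respect to the synthetic images.
    loss, g = jax.value_and_grad(grad_match_loss)(x_syn, y_syn, params, x_real, y_real)
    return x_syn - lr * g, loss

# Toy usage with random "real" data and randomly initialized synthetic images.
key = jax.random.PRNGKey(0)
k1, k2, k3, k4 = jax.random.split(key, 4)
x_real = jax.random.normal(k1, (256, 64))
y_real = jax.nn.one_hot(jax.random.randint(k2, (256,), 0, 10), 10)
x_syn = jax.random.normal(k3, (10, 64))
y_syn = jnp.eye(10)  # one synthetic sample per class
params = (jax.random.normal(k4, (64, 10)) * 0.01, jnp.zeros(10))
for _ in range(100):
    x_syn, loss = condense_step(x_syn, y_syn, params, x_real, y_real)
```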
- Large-Scale Gradient-Free Deep Learning with Recursive Local Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources.
Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize.
We propose a neuro-biologically-plausible alternative to backprop that can be used to train deep networks.
arXiv Detail & Related papers (2020-02-10T16:20:02Z)