Dataset Condensation with Distribution Matching
- URL: http://arxiv.org/abs/2110.04181v1
- Date: Fri, 8 Oct 2021 15:02:30 GMT
- Title: Dataset Condensation with Distribution Matching
- Authors: Bo Zhao, Hakan Bilen
- Abstract summary: dataset condensation aims to replace the original large training set with a significantly smaller learned synthetic set.
We propose a simple yet effective dataset condensation technique that requires significantly lower training cost.
Thanks to its efficiency, we apply our method to more realistic and larger datasets with sophisticated neural architectures.
- Score: 30.571335208276246
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Computational cost to train state-of-the-art deep models in many learning
problems is rapidly increasing due to more sophisticated models and larger
datasets. A recent promising direction to reduce training time is dataset
condensation that aims to replace the original large training set with a
significantly smaller learned synthetic set while preserving its information.
While training deep models on the small set of condensed images can be
extremely fast, their synthesis remains computationally expensive due to the
complex bi-level optimization and second-order derivative computation. In this
work, we propose a simple yet effective dataset condensation technique that
requires significantly lower training cost with comparable performance by
matching feature distributions of the synthetic and original training images in
sampled embedding spaces. Thanks to its efficiency, we apply our method to more
realistic and larger datasets with sophisticated neural architectures and
achieve a significant performance boost while using larger synthetic training
set. We also show various practical benefits of our method in continual
learning and neural architecture search.
Related papers
- Koopcon: A new approach towards smarter and less complex learning [13.053285552524052]
In the era of big data, the sheer volume and complexity of datasets pose significant challenges in machine learning.
This paper introduces an innovative Autoencoder-based dataset condensation model backed by Koopman operator theory.
Inspired by the predictive coding mechanisms of the human brain, our model leverages a novel approach to encode and reconstruct data.
arXiv Detail & Related papers (2024-05-22T17:47:14Z) - Exploring Learning Complexity for Efficient Downstream Dataset Pruning [8.990878450631596]
Existing dataset pruning methods require training on the entire dataset.
We propose a straightforward, novel, and training-free hardness score named Distorting-based Learning Complexity (DLC)
Our method is motivated by the observation that easy samples learned faster can also be learned with fewer parameters.
arXiv Detail & Related papers (2024-02-08T02:29:33Z) - Data Distillation Can Be Like Vodka: Distilling More Times For Better
Quality [78.6359306550245]
We argue that using just one synthetic subset for distillation will not yield optimal generalization performance.
PDD synthesizes multiple small sets of synthetic images, each conditioned on the previous sets, and trains the model on the cumulative union of these subsets.
Our experiments show that PDD can effectively improve the performance of existing dataset distillation methods by up to 4.3%.
arXiv Detail & Related papers (2023-10-10T20:04:44Z) - Improved Distribution Matching for Dataset Condensation [91.55972945798531]
We propose a novel dataset condensation method based on distribution matching.
Our simple yet effective method outperforms most previous optimization-oriented methods with much fewer computational resources.
arXiv Detail & Related papers (2023-07-19T04:07:33Z) - Accelerating Dataset Distillation via Model Augmentation [41.3027484667024]
We propose two model augmentation techniques, i.e. using early-stage models and parameter parameters to learn an informative synthetic set with significantly reduced training cost.
Our method achieves up to 20x speedup and comparable performance on par with state-of-the-art methods.
arXiv Detail & Related papers (2022-12-12T07:36:05Z) - Towards Robust Dataset Learning [90.2590325441068]
We propose a principled, tri-level optimization to formulate the robust dataset learning problem.
Under an abstraction model that characterizes robust vs. non-robust features, the proposed method provably learns a robust dataset.
arXiv Detail & Related papers (2022-11-19T17:06:10Z) - Condensing Graphs via One-Step Gradient Matching [50.07587238142548]
We propose a one-step gradient matching scheme, which performs gradient matching for only one single step without training the network weights.
Our theoretical analysis shows this strategy can generate synthetic graphs that lead to lower classification loss on real graphs.
In particular, we are able to reduce the dataset size by 90% while approximating up to 98% of the original performance.
arXiv Detail & Related papers (2022-06-15T18:20:01Z) - Dataset Condensation with Gradient Matching [36.14340188365505]
We propose a training set synthesis technique for data-efficient learning, called dataset Condensation, that learns to condense large dataset into a small set of informative synthetic samples for training deep neural networks from scratch.
We rigorously evaluate its performance in several computer vision benchmarks and demonstrate that it significantly outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2020-06-10T16:30:52Z) - Understanding the Effects of Data Parallelism and Sparsity on Neural
Network Training [126.49572353148262]
We study two factors in neural network training: data parallelism and sparsity.
Despite their promising benefits, understanding of their effects on neural network training remains elusive.
arXiv Detail & Related papers (2020-03-25T10:49:22Z) - Large-Scale Gradient-Free Deep Learning with Recursive Local
Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources.
Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize.
We propose a neuro-biologically-plausible alternative to backprop that can be used to train deep networks.
arXiv Detail & Related papers (2020-02-10T16:20:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.