Dataset Condensation with Contrastive Signals
- URL: http://arxiv.org/abs/2202.02916v1
- Date: Mon, 7 Feb 2022 03:05:32 GMT
- Title: Dataset Condensation with Contrastive Signals
- Authors: Saehyung Lee, Sanghyuk Chun, Sangwon Jung, Sangdoo Yun, Sungroh Yoon
- Abstract summary: Gradient matching-based dataset synthesis, or dataset condensation (DC), methods can achieve state-of-the-art performance when applied to data-efficient learning tasks.
In this study, we prove that the existing DC methods can perform worse than the random selection method when task-irrelevant information forms a significant part of the training dataset.
We propose Dataset Condensation with Contrastive signals (DCC), which modifies the loss function to enable DC methods to effectively capture the differences between classes.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies have demonstrated that gradient matching-based dataset
synthesis, or dataset condensation (DC), methods can achieve state-of-the-art
performance when applied to data-efficient learning tasks. However, in this
study, we prove that the existing DC methods can perform worse than the random
selection method when task-irrelevant information forms a significant part of
the training dataset. We attribute this to the lack of participation of the
contrastive signals between the classes resulting from the class-wise gradient
matching strategy. To address this problem, we propose Dataset Condensation
with Contrastive signals (DCC) by modifying the loss function to enable the DC
methods to effectively capture the differences between classes. In addition, we
analyze the new loss function in terms of training dynamics by tracking the
kernel velocity. Furthermore, we introduce a bi-level warm-up strategy to
stabilize the optimization. Our experimental results indicate that while the
existing methods are ineffective for fine-grained image classification tasks,
the proposed method can successfully generate informative synthetic datasets
for the same tasks. Moreover, we demonstrate that the proposed method
outperforms the baselines even on benchmark datasets such as SVHN, CIFAR-10,
and CIFAR-100. Finally, we demonstrate the high applicability of the proposed
method by applying it to continual learning tasks.
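Since the abstract's key claim is that class-wise gradient matching excludes inter-class (contrastive) signals, a minimal PyTorch-style sketch of the contrast may help. Everything below is an illustrative reconstruction rather than the authors' code: the cosine distance is one common matching metric in DC work, and the per-class data layout is an assumption.

```python
import torch
import torch.nn.functional as F

def loss_grad(model, x, y, create_graph=False):
    """Gradient of the classification loss w.r.t. the model parameters."""
    loss = F.cross_entropy(model(x), y)
    return torch.autograd.grad(loss, tuple(model.parameters()),
                               create_graph=create_graph)

def grad_distance(g_syn, g_real):
    """Layer-wise cosine distance, a common matching metric in DC methods."""
    return sum(1.0 - F.cosine_similarity(a.flatten(), b.flatten(), dim=0)
               for a, b in zip(g_syn, g_real))

def dc_loss_classwise(model, syn, real):
    """Original DC objective: gradients are matched class by class, so no
    term ever compares one class against another."""
    total = 0.0
    for c in syn:  # syn[c] / real[c] = (images, labels) for class c
        g_s = loss_grad(model, *syn[c], create_graph=True)
        g_r = loss_grad(model, *real[c])
        total = total + grad_distance(g_s, g_r)
    return total

def dcc_loss(model, syn, real):
    """DCC-style objective: per-class gradients are summed over all classes
    *before* matching, so inter-class interactions (the contrastive signals
    the abstract refers to) influence the synthetic images."""
    g_s_sum, g_r_sum = None, None
    for c in syn:
        g_s = loss_grad(model, *syn[c], create_graph=True)
        g_r = loss_grad(model, *real[c])
        if g_s_sum is None:
            g_s_sum, g_r_sum = list(g_s), list(g_r)
        else:
            g_s_sum = [a + b for a, b in zip(g_s_sum, g_s)]
            g_r_sum = [a + b for a, b in zip(g_r_sum, g_r)]
    return grad_distance(g_s_sum, g_r_sum)
```

In the class-wise objective, each synthetic class only ever sees its own gradients; in the summed objective, a synthetic image is also optimized against the gradients of other classes, which is the contrastive participation the abstract describes.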
Related papers
- UDD: Dataset Distillation via Mining Underutilized Regions
We propose UDD, a novel approach that identifies and exploits underutilized regions in synthetic images to make them informative and discriminative.
Our method improves the utilization of the synthetic dataset and outperforms the state-of-the-art methods on various datasets.
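The snippet does not spell out how underutilized regions are detected. Purely as an illustration, one way to flag them is by how little gradient the distillation loss sends into each pixel of a synthetic image; the saliency measure and quantile threshold here are hypothetical, not UDD's criterion.

```python
import torch

def underutilized_mask(syn_images, distill_loss, quantile=0.2):
    """Hypothetical detector: pixels receiving little gradient from the
    distillation loss contribute little to training and are flagged as
    underutilized. syn_images must require grad; UDD's actual criterion
    may differ."""
    (grad,) = torch.autograd.grad(distill_loss, syn_images, retain_graph=True)
    saliency = grad.abs().mean(dim=1)          # (B, H, W), averaged over channels
    thresh = torch.quantile(saliency.flatten(1), quantile, dim=1)
    return saliency <= thresh.view(-1, 1, 1)   # True = underutilized pixel
```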
arXiv Detail & Related papers (2024-08-29T05:13:01Z)
- Heterogeneous Learning Rate Scheduling for Neural Architecture Search on Long-Tailed Datasets
We propose a novel adaptive learning rate scheduling strategy tailored for the architecture parameters of DARTS.
Our approach dynamically adjusts the learning rate of the architecture parameters based on the training epoch, preventing the disruption of well-trained representations.
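A hedged sketch of what such a heterogeneous, epoch-dependent schedule for the architecture parameters could look like; the warm-up length, base rate, and cosine shape are assumptions, not the paper's values.

```python
import math

def arch_lr(epoch, total_epochs, base_lr=3e-4, warmup_epochs=15):
    """Hypothetical schedule for the DARTS architecture parameters (alpha):
    hold the architecture LR at zero early so the network weights can form
    useful representations undisturbed, then decay from base_lr along a
    cosine curve."""
    if epoch < warmup_epochs:
        return 0.0  # freeze alpha updates while representations form
    t = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * t))
```

The network weights would keep their own, unmodified schedule; scheduling the two parameter groups differently is what makes the strategy "heterogeneous".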
arXiv Detail & Related papers (2024-06-11T07:32:25Z)
- Adaptive Retention & Correction for Continual Learning
A common problem in continual learning is the classification layer's bias towards the most recent task.
We name our approach Adaptive Retention & Correction (ARC).
ARC achieves average performance increases of 2.7% and 2.6% on the CIFAR-100 and ImageNet-R datasets, respectively.
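The snippet names the recency bias but not the correction rule. Purely as an illustration, one standard way to counter a classification layer biased toward the newest task is to equalize class-weight norms; whether ARC uses this exact rule is not stated here.

```python
import torch

@torch.no_grad()
def equalize_head_norms(fc_weight, task_of_class, current_task):
    """Illustrative correction: rescale the newest task's class weight
    vectors so their mean norm matches the older classes', removing the
    magnitude advantage that makes the head favour recent classes.
    fc_weight: (num_classes, feat_dim); task_of_class: (num_classes,) ints."""
    recent = task_of_class == current_task
    old_mean = fc_weight[~recent].norm(dim=1).mean()
    new_mean = fc_weight[recent].norm(dim=1).mean()
    fc_weight[recent] *= old_mean / new_mean
    return fc_weight
```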
arXiv Detail & Related papers (2024-05-23T08:43:09Z)
- Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation
We propose a novel approach that encourages the optimization algorithm to seek a flat trajectory.
We show that, with regularization towards a flat trajectory, the weights trained on synthetic data are robust against perturbations from accumulated errors.
Our method, called Flat Trajectory Distillation (FTD), is shown to boost the performance of gradient-matching methods by up to 4.7%.
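Since FTD is framed in terms of accumulated trajectory error, a minimal sketch of the trajectory-matching objective such distillation methods build on is shown below; the normalized squared-distance form follows common practice and may not match the paper exactly. FTD's contribution, per the abstract, is regularizing the expert trajectories toward flat regions so this accumulated error stays small.

```python
def trajectory_matching_loss(student_params, expert_start, expert_end):
    """Matching loss used by trajectory-based distillation: after N student
    steps on synthetic data starting from expert_start, penalize the distance
    to expert_end, normalized by how far the expert itself travelled.
    All arguments are lists of parameter tensors."""
    num = sum((s - e).pow(2).sum() for s, e in zip(student_params, expert_end))
    den = sum((s0 - e).pow(2).sum() for s0, e in zip(expert_start, expert_end))
    return num / den
```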
arXiv Detail & Related papers (2022-11-20T15:49:11Z)
- Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER).
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
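A hedged sketch of the general cluster-level pseudo-labelling recipe the snippet describes; the use of k-means, the majority-vote rule, and one cluster per class are assumptions, not necessarily the paper's choices.

```python
import torch
from sklearn.cluster import KMeans

@torch.no_grad()
def cluster_pseudo_labels(features, logits, num_classes):
    """Illustrative cluster-level pseudo-labelling: cluster target-domain
    features, then give every sample in a cluster the label the source
    model predicts most often inside it, smoothing noisy per-sample
    predictions at the cluster level."""
    clusters = KMeans(n_clusters=num_classes, n_init=10).fit_predict(
        features.cpu().numpy())
    preds = logits.argmax(dim=1).cpu()
    pseudo = preds.clone()  # fall back to per-sample predictions
    for k in range(num_classes):
        mask = torch.from_numpy(clusters == k)
        if mask.any():
            pseudo[mask] = preds[mask].mode().values  # majority vote in cluster
    return pseudo
```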
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
- Delving into Effective Gradient Matching for Dataset Condensation
The gradient matching method directly targets the training dynamics by matching the gradients produced when training on the original and synthetic datasets.
We propose to match the multi-level gradients to involve both intra-class and inter-class gradient information.
An overfitting-aware adaptive learning-step strategy is also proposed to trim unnecessary optimization steps and improve algorithmic efficiency.
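Reusing the helpers from the DCC sketch above, the described multi-level matching could combine the two granularities as follows; the weighting lam is an assumed hyperparameter, and the paper's exact formulation may differ.

```python
def multilevel_gradient_loss(model, syn, real, lam=0.5):
    """Sketch: combine per-class (intra-class) gradient matching with
    matching of the class-aggregated gradient (inter-class), reusing
    dc_loss_classwise and dcc_loss from the DCC sketch above."""
    return dc_loss_classwise(model, syn, real) + lam * dcc_loss(model, syn, real)
```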
arXiv Detail & Related papers (2022-07-30T21:31:10Z)
- CAFE: Learning to Condense Dataset by Aligning Features
We propose a novel scheme to Condense dataset by Aligning FEatures (CAFE).
At the heart of our approach is an effective strategy to align features from the real and synthetic data across various scales.
We validate the proposed CAFE across various datasets, and demonstrate that it generally outperforms the state of the art.
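As a hedged sketch of the alignment idea only: matching batch feature statistics between real and synthetic data at several network depths could look like the following. CAFE's full objective also includes a discrimination component and specific layer statistics not detailed in the snippet.

```python
def feature_alignment_loss(real_feats, syn_feats):
    """Illustrative alignment term: real_feats / syn_feats are lists of
    activations captured at several depths ('scales') of the same network.
    Batch feature means are matched layer by layer."""
    loss = 0.0
    for fr, fs in zip(real_feats, syn_feats):
        loss = loss + (fr.mean(dim=0) - fs.mean(dim=0)).pow(2).sum()
    return loss
```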
arXiv Detail & Related papers (2022-03-03T05:58:49Z)
- Unsupervised feature selection via self-paced learning and low-redundant regularization
An unsupervised feature selection method is proposed by integrating the frameworks of self-paced learning and subspace learning.
The convergence of the method is proven theoretically and verified experimentally.
The experimental results show that the proposed method can improve the performance of clustering methods and outperform other compared algorithms.
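For context, the classic hard self-paced weighting rule that such frameworks build on is shown below; the paper's low-redundancy regularizer on the selection matrix is omitted, and the exact self-paced regularizer it integrates may differ.

```python
def self_paced_weights(sample_losses, lam):
    """Classic hard self-paced rule: a sample participates only once its
    loss falls below the age parameter lam, so training proceeds from easy
    to hard samples as lam is raised each outer iteration.
    sample_losses: 1-D torch tensor of per-sample losses."""
    return (sample_losses < lam).float()
```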
arXiv Detail & Related papers (2021-12-14T08:28:19Z)
- DEALIO: Data-Efficient Adversarial Learning for Imitation from Observation
In imitation learning from observation (IfO), a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior, without access to the control signals generated by the demonstrator.
Recent methods based on adversarial imitation learning have led to state-of-the-art performance on IfO problems, but they typically suffer from high sample complexity due to a reliance on data-inefficient, model-free reinforcement learning algorithms.
This issue makes them impractical to deploy in real-world settings, where gathering samples can incur high costs in terms of time, energy, and risk.
We propose a more data-efficient IfO algorithm.
arXiv Detail & Related papers (2021-03-31T23:46:32Z)