Augmentation with Projection: Towards an Effective and Efficient Data
Augmentation Paradigm for Distillation
- URL: http://arxiv.org/abs/2210.11768v1
- Date: Fri, 21 Oct 2022 07:08:31 GMT
- Title: Augmentation with Projection: Towards an Effective and Efficient Data
Augmentation Paradigm for Distillation
- Authors: Ziqi Wang, Yuexin Wu, Frederick Liu, Daogao Liu, Le Hou, Hongkun Yu,
Jing Li, Heng Ji
- Abstract summary: AugPro (Augmentation with Projection) is an effective and efficient data augmentation method for distillation.
Our method builds on top of representation interpolation methods to maintain the diversity of expressions.
Results on multiple GLUE tasks show that our methods can improve distillation performance by a large margin at a low time cost.
- Score: 47.31894017472831
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge distillation is one of the primary methods of transferring
knowledge from large to small models. However, it requires massive
task-specific data, which may not be plausible in many real-world applications.
Data augmentation methods such as representation interpolation, token
replacement, or augmentation with models are applied to tackle this problem.
However, these data augmentation methods either potentially cause shifts in
decision boundaries (representation interpolation), are not expressive enough
(token replacement), or introduce too much computational overhead (augmentation
with models). To this end, we propose AugPro (Augmentation with Projection), an
effective and efficient data augmentation method for distillation. Our method
builds on top of representation interpolation augmentation methods to maintain
the diversity of expressions and converts the augmented data to tokens to avoid
shifting decision boundaries. It uses simple operations that come with little
computational overhead. The results on multiple GLUE tasks show that our
methods can improve distillation performance by a large margin at a low time
cost.
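The core idea above, interpolating in embedding space but projecting the result back to discrete tokens, can be sketched as follows. This is a minimal illustration with invented helper names and a random embedding table, not the paper's implementation: we mix the token embeddings of two examples (as in representation-interpolation methods) and then map each mixed vector to its nearest token in the embedding table, so the augmented example stays a valid token sequence.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size, dim, seq_len = 50, 16, 8
embedding_table = rng.normal(size=(vocab_size, dim))  # one row per token id


def embed(token_ids):
    """Look up the embedding vector for each token id."""
    return embedding_table[token_ids]


def project_to_tokens(vectors):
    """Map each continuous vector to the id of its nearest token embedding."""
    # (seq_len, vocab_size) distance matrix via broadcasting
    dists = np.linalg.norm(vectors[:, None, :] - embedding_table[None, :, :], axis=-1)
    return dists.argmin(axis=-1)


def augpro_step(ids_a, ids_b, lam=0.7):
    """Interpolate two token sequences in embedding space, then re-tokenize."""
    mixed = lam * embed(ids_a) + (1.0 - lam) * embed(ids_b)
    return project_to_tokens(mixed)


ids_a = rng.integers(0, vocab_size, size=seq_len)
ids_b = rng.integers(0, vocab_size, size=seq_len)
aug_ids = augpro_step(ids_a, ids_b)
print(aug_ids)  # a discrete token sequence, usable as ordinary distillation input
```

Because the output is a token sequence rather than a continuous vector, the student sees inputs drawn from the same space as real data, which is the mechanism the abstract credits for avoiding decision-boundary shifts.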
Related papers
- Efficient Dataset Distillation via Diffusion-Driven Patch Selection for Improved Generalization [34.79567392368196]
We propose a novel framework that, unlike existing diffusion-based distillation methods, leverages diffusion models for selection rather than generation.
Our method starts by predicting noise generated by the diffusion model based on input images and text prompts, then calculates the corresponding loss for each pair.
This streamlined framework enables a single-step distillation process, and extensive experiments demonstrate that our approach outperforms state-of-the-art methods across various metrics.
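The selection step described above can be reduced to scoring candidates and keeping the best-scoring subset. The sketch below uses precomputed per-sample scores as a stand-in for the diffusion model's noise-prediction loss; the function name and the choice to keep the lowest-loss samples are assumptions for illustration.

```python
import numpy as np


def select_patches(losses, k):
    """Return the indices of the k candidates with the lowest loss scores."""
    return np.argsort(losses)[:k]


# toy scores standing in for diffusion noise-prediction losses
losses = np.array([0.9, 0.1, 0.5, 0.3])
print(select_patches(losses, 2))  # indices of the two lowest-loss candidates
```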
arXiv Detail & Related papers (2024-12-13T08:34:46Z) - SAFLEX: Self-Adaptive Augmentation via Feature Label Extrapolation [29.598247232905283]
We present a novel, efficient method for data augmentation, effectively bridging the gap between existing augmentation strategies and emerging datasets and learning tasks.
Our findings highlight the potential to adapt existing augmentation pipelines for new data types and tasks, signaling a move towards more adaptable and resilient training frameworks.
arXiv Detail & Related papers (2024-10-03T14:21:49Z) - Data Augmentation for Image Classification using Generative AI [8.74488498507946]
Data augmentation is a promising solution to expanding the dataset size.
Recent approaches use generative AI models to improve dataset diversity.
We propose the Automated Generative Data Augmentation (AGA) framework.
arXiv Detail & Related papers (2024-08-31T21:16:43Z) - Improved Distribution Matching for Dataset Condensation [91.55972945798531]
We propose a novel dataset condensation method based on distribution matching.
Our simple yet effective method outperforms most previous optimization-oriented methods with much fewer computational resources.
arXiv Detail & Related papers (2023-07-19T04:07:33Z) - Explicit and Implicit Knowledge Distillation via Unlabeled Data [5.702176304876537]
We propose an efficient unlabeled sample selection method to replace high computational generators.
We also propose a class-dropping mechanism to suppress the label noise caused by the data domain shifts.
Experimental results show that our method can quickly converge and obtain higher accuracy than other state-of-the-art methods.
arXiv Detail & Related papers (2023-02-17T09:10:41Z) - EquiMod: An Equivariance Module to Improve Self-Supervised Learning [77.34726150561087]
Self-supervised visual representation methods are closing the gap with supervised learning performance.
These methods rely on maximizing the similarity between embeddings of related synthetic inputs created through data augmentations.
We introduce EquiMod, a generic equivariance module that structures the learned latent space.
arXiv Detail & Related papers (2022-11-02T16:25:54Z) - Adversarial Auto-Augment with Label Preservation: A Representation
Learning Principle Guided Approach [95.74102207187545]
We show that a prior-free autonomous data augmentation's objective can be derived from a representation learning principle.
We then propose a practical surrogate to the objective that can be efficiently optimized and integrated seamlessly into existing methods.
arXiv Detail & Related papers (2022-11-02T02:02:51Z) - Invariance Learning in Deep Neural Networks with Differentiable Laplace
Approximations [76.82124752950148]
We develop a convenient gradient-based method for selecting the data augmentation.
We use a differentiable Kronecker-factored Laplace approximation to the marginal likelihood as our objective.
arXiv Detail & Related papers (2022-02-22T02:51:11Z) - FlipDA: Effective and Robust Data Augmentation for Few-Shot Learning [27.871007011425775]
We propose a novel data augmentation method FlipDA that jointly uses a generative model and a classifier to generate label-flipped data.
Experiments show that FlipDA achieves a good tradeoff between effectiveness and robustness: it substantially improves many tasks while not negatively affecting the others.
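The classifier-based filtering that FlipDA's summary describes can be sketched as follows. The helper names and the toy classifier are invented for illustration: a generative model proposes candidates, and only those whose predicted label actually differs from the original label are kept as label-flipped augmentations.

```python
def flipda_filter(candidates, original_label, classify):
    """Keep candidates whose predicted label flipped away from the original.

    classify: callable mapping a text candidate to a predicted label.
    """
    return [c for c in candidates if classify(c) != original_label]


# toy classifier standing in for the real model
classify = lambda text: "pos" if "good" in text else "neg"

# of two generated candidates, only the one whose label flips is kept
kept = flipda_filter(["good day", "bad day"], "neg", classify)
print(kept)  # ["good day"]
```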
arXiv Detail & Related papers (2021-08-13T17:51:31Z) - CADDA: Class-wise Automatic Differentiable Data Augmentation for EEG
Signals [92.60744099084157]
We propose differentiable data augmentation amenable to gradient-based learning.
We demonstrate the relevance of our approach on the clinically relevant sleep staging classification task.
arXiv Detail & Related papers (2021-06-25T15:28:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.