Data Pruning via Moving-one-Sample-out
- URL: http://arxiv.org/abs/2310.14664v2
- Date: Wed, 25 Oct 2023 06:19:05 GMT
- Title: Data Pruning via Moving-one-Sample-out
- Authors: Haoru Tan, Sitong Wu, Fei Du, Yukang Chen, Zhibin Wang, Fan Wang,
Xiaojuan Qi
- Abstract summary: We propose a novel data-pruning approach called moving-one-sample-out (MoSo).
MoSo aims to identify and remove the least informative samples from the training set.
Experimental results demonstrate that MoSo effectively mitigates severe performance degradation at high pruning ratios.
- Score: 61.45441981346064
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose a novel data-pruning approach called
moving-one-sample-out (MoSo), which aims to identify and remove the least
informative samples from the training set. The core insight behind MoSo is to
determine the importance of each sample by assessing its impact on the optimal
empirical risk. This is achieved by measuring the extent to which the empirical
risk changes when a particular sample is excluded from the training set.
Instead of using the computationally expensive leaving-one-out-retraining
procedure, we propose an efficient first-order approximator that only requires
gradient information from different training stages. The key idea behind our
approximation is that samples whose gradients are consistently aligned with the
average gradient of the training set are more informative and should receive
higher scores. Intuitively, if the gradient of a specific sample agrees with
the average gradient vector, then optimizing the network on that sample yields
a similar effect on all remaining samples. Experimental results demonstrate
that MoSo effectively mitigates severe performance degradation at high pruning
ratios and achieves satisfactory performance across various settings.
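The abstract gives no implementation, so the following is a minimal NumPy sketch of a first-order, gradient-alignment score in the spirit of MoSo, assuming per-sample gradients have already been collected at several training-stage checkpoints; the function and variable names (moso_scores, per_sample_grads_by_stage, lrs) are illustrative, not from the paper.

    import numpy as np

    def moso_scores(per_sample_grads_by_stage, lrs):
        """Sketch of a MoSo-style first-order importance score.

        per_sample_grads_by_stage: list over training stages; each element is
        an (N, D) array of per-sample loss gradients at that checkpoint.
        lrs: learning rate in effect at each stage.

        A sample's score accumulates, across stages, the inner product between
        its own gradient and the average gradient of the *remaining* samples;
        higher scores mark more informative samples, lower scores mark
        pruning candidates.
        """
        num_samples = per_sample_grads_by_stage[0].shape[0]
        scores = np.zeros(num_samples)
        for grads, lr in zip(per_sample_grads_by_stage, lrs):
            total = grads.sum(axis=0)
            # Leave-one-out mean: average gradient of all other samples.
            loo_mean = (total[None, :] - grads) / (num_samples - 1)
            # Row-wise dot product between each sample's gradient and the
            # leave-one-out mean, weighted by the stage's learning rate.
            scores += lr * np.einsum('nd,nd->n', grads, loo_mean)
        return scores / len(lrs)

Under this sketch, pruning at ratio r would drop the lowest-scoring fraction of the training set, e.g. keep = np.argsort(scores)[int(r * len(scores)):].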
Related papers
- Dataset Quantization with Active Learning based Adaptive Sampling [11.157462442942775]
We show that maintaining performance is feasible even with uneven sample distributions.
We propose a novel active-learning-based adaptive sampling strategy to optimize sample selection.
Our approach outperforms the state-of-the-art dataset compression methods.
arXiv Detail & Related papers (2024-07-09T23:09:18Z)
- Importance Sampling for Stochastic Gradient Descent in Deep Neural Networks [0.0]
Importance sampling for training deep neural networks has been widely studied.
This paper reviews the challenges inherent to this research area.
We propose a metric for assessing the quality of a given sampling scheme (a generic importance-sampling step is sketched after this list).
arXiv Detail & Related papers (2023-03-29T08:35:11Z)
- ScoreMix: A Scalable Augmentation Strategy for Training GANs with Limited Data [93.06336507035486]
Generative Adversarial Networks (GANs) typically suffer from overfitting when limited training data is available.
We present ScoreMix, a novel and scalable data augmentation approach for various image synthesis tasks.
arXiv Detail & Related papers (2022-10-27T02:55:15Z)
- POODLE: Improving Few-shot Learning via Penalizing Out-of-Distribution Samples [19.311470287767385]
We propose to use out-of-distribution samples, i.e., unlabeled samples coming from outside the target classes, to improve few-shot learning.
Our approach is simple to implement, agnostic to feature extractors, lightweight without any additional cost for pre-training, and applicable to both inductive and transductive settings.
arXiv Detail & Related papers (2022-06-08T18:59:21Z)
- Rethinking InfoNCE: How Many Negative Samples Do You Need? [54.146208195806636]
We study how many negative samples are optimal for InfoNCE in different scenarios via a semi-quantitative theoretical framework.
We estimate the optimal negative sampling ratio using the $K$ value that maximizes the training effectiveness function (a minimal InfoNCE loss with $K$ negatives is sketched after this list).
arXiv Detail & Related papers (2021-05-27T08:38:29Z)
- Reweighting Augmented Samples by Minimizing the Maximal Expected Loss [51.2791895511333]
We construct the maximal expected loss, which is the supremum over any reweighted loss on augmented samples.
Inspired by adversarial training, we minimize this maximal expected loss and obtain a simple and interpretable closed-form solution.
The proposed method can generally be applied on top of any data augmentation methods.
arXiv Detail & Related papers (2021-03-16T09:31:04Z)
- Optimal Importance Sampling for Federated Learning [57.14673504239551]
Federated learning involves a mixture of centralized and decentralized processing tasks.
The sampling of both agents and data is generally uniform; however, in this work we consider non-uniform sampling.
We derive optimal importance sampling strategies for both agent and data selection and show that non-uniform sampling without replacement improves the performance of the original FedAvg algorithm.
arXiv Detail & Related papers (2020-10-26T14:15:33Z)
- Deep Semi-supervised Knowledge Distillation for Overlapping Cervical Cell Instance Segmentation [54.49894381464853]
We propose to leverage both labeled and unlabeled data for instance segmentation with improved accuracy by knowledge distillation.
We propose a novel Mask-guided Mean Teacher framework with Perturbation-sensitive Sample Mining.
Experiments show that the proposed method improves the performance significantly compared with the supervised method learned from labeled data only.
arXiv Detail & Related papers (2020-07-21T13:27:09Z)
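As referenced in the Importance Sampling for Stochastic Gradient Descent entry above, here is a generic sketch of importance-sampled mini-batching for SGD; it is not that paper's proposed quality metric, and all names are illustrative. Samples are drawn in proportion to a per-sample score and re-weighted by inverse probability so the weighted mini-batch gradient stays an unbiased estimate of the full gradient.

    import numpy as np

    def importance_sampled_batch(scores, batch_size, rng):
        """Draw a mini-batch with probability proportional to per-sample scores.

        scores: (N,) non-negative per-sample scores, e.g. recent loss values
        or gradient-norm estimates.
        Returns the sampled indices and the inverse-probability weights that
        make the weighted gradient estimate unbiased.
        """
        probs = scores / scores.sum()
        idx = rng.choice(len(scores), size=batch_size, p=probs)
        weights = 1.0 / (len(scores) * probs[idx])  # unbiasedness correction
        return idx, weights

    # Example: sample 32 of 1000 points, favoring high-loss samples.
    rng = np.random.default_rng(0)
    losses = rng.random(1000)
    idx, weights = importance_sampled_batch(losses, 32, rng)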
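For the Rethinking InfoNCE entry above, a minimal sketch of the InfoNCE loss with $K$ explicit negatives, assuming cosine similarity and a temperature tau; the paper's training effectiveness function for choosing $K$ is not reproduced here, and all names are illustrative.

    import numpy as np

    def info_nce(query, positive, negatives, tau=0.1):
        """InfoNCE loss for one query, one positive, and K negatives.

        query, positive: (D,) embeddings; negatives: (K, D) embeddings.
        Similarities are cosine, scaled by temperature tau; the loss is the
        cross-entropy of picking the positive among the K+1 candidates.
        """
        def cos(a, b):
            return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
        pos = cos(query, positive) / tau
        negs = np.array([cos(query, n) for n in negatives]) / tau
        logits = np.concatenate(([pos], negs))
        # Numerically stable -log softmax of the positive logit.
        return np.logaddexp.reduce(logits) - pos

    # Example with K = 16 negatives of dimension 128.
    rng = np.random.default_rng(0)
    q, k_pos = rng.normal(size=(2, 128))
    k_neg = rng.normal(size=(16, 128))
    loss = info_nce(q, k_pos, k_neg)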
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.