Fast Evaluation of DNN for Past Dataset in Incremental Learning
- URL: http://arxiv.org/abs/2405.06296v1
- Date: Fri, 10 May 2024 07:55:08 GMT
- Title: Fast Evaluation of DNN for Past Dataset in Incremental Learning
- Authors: Naoto Sato,
- Abstract summary: We propose a new method to quickly evaluate the effect on the accuracy for the past dataset.
The gradient of the parameter values for the past dataset is extracted by running the DNN before the training.
The proposed method can estimate the accuracy change by additional training in a constant time.
- Score: 0.0
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: During the operation of a system including a deep neural network (DNN), new input values that were not included in the training dataset are given to the DNN. In such a case, the DNN may be incrementally trained with the new input values; however, that training may reduce the accuracy of the DNN in regard to the dataset that was previously obtained and used for the past training. It is necessary to evaluate the effect of the additional training on the accuracy for the past dataset. However, evaluation by testing all the input values included in the past dataset takes time. Therefore, we propose a new method to quickly evaluate the effect on the accuracy for the past dataset. In the proposed method, the gradient of the parameter values (such as weight and bias) for the past dataset is extracted by running the DNN before the training. Then, after the training, its effect on the accuracy with respect to the past dataset is calculated from the gradient and update differences of the parameter values. To show the usefulness of the proposed method, we present experimental results with several datasets. The results show that the proposed method can estimate the accuracy change by additional training in a constant time.
Related papers
- KAKURENBO: Adaptively Hiding Samples in Deep Neural Network Training [2.8804804517897935]
We propose a method for hiding the least-important samples during the training of deep neural networks.
We adaptively find samples to exclude in a given epoch based on their contribution to the overall learning process.
Our method can reduce total training time by up to 22% impacting accuracy only by 0.4% compared to the baseline.
arXiv Detail & Related papers (2023-10-16T06:19:29Z) - LAVA: Data Valuation without Pre-Specified Learning Algorithms [20.578106028270607]
We introduce a new framework that can value training data in a way that is oblivious to the downstream learning algorithm.
We develop a proxy for the validation performance associated with a training set based on a non-conventional class-wise Wasserstein distance between training and validation sets.
We show that the distance characterizes the upper bound of the validation performance for any given model under certain Lipschitz conditions.
arXiv Detail & Related papers (2023-04-28T19:05:16Z) - Optimizing Data Shapley Interaction Calculation from O(2^n) to O(t n^2)
for KNN models [2.365702128814616]
"STI-KNN" is an innovative algorithm that calculates the exact pair-interaction Shapley values for KNN models in O(t n2) time.
By using STI-KNN, we can efficiently and accurately evaluate the value of individual data points, leading to improved training outcomes and ultimately enhancing the effectiveness of artificial intelligence applications.
arXiv Detail & Related papers (2023-04-02T06:15:19Z) - Confidence-Nets: A Step Towards better Prediction Intervals for
regression Neural Networks on small datasets [0.0]
We propose an ensemble method that attempts to estimate the uncertainty of predictions, increase their accuracy and provide an interval for the expected variation.
The proposed method is tested on various datasets, and a significant improvement in the performance of the neural network model is seen.
arXiv Detail & Related papers (2022-10-31T06:38:40Z) - Dataset Distillation by Matching Training Trajectories [75.9031209877651]
We propose a new formulation that optimize our distilled data to guide networks to a similar state as those trained on real data.
Given a network, we train it for several iterations on our distilled data and optimize the distilled data with respect to the distance between the synthetically trained parameters and the parameters trained on real data.
Our method handily outperforms existing methods and also allows us to distill higher-resolution visual data.
arXiv Detail & Related papers (2022-03-22T17:58:59Z) - Learning to be a Statistician: Learned Estimator for Number of Distinct
Values [54.629042119819744]
Estimating the number of distinct values (NDV) in a column is useful for many tasks in database systems.
In this work, we focus on how to derive accurate NDV estimations from random (online/offline) samples.
We propose to formulate the NDV estimation task in a supervised learning framework, and aim to learn a model as the estimator.
arXiv Detail & Related papers (2022-02-06T15:42:04Z) - DIVA: Dataset Derivative of a Learning Task [108.18912044384213]
We present a method to compute the derivative of a learning task with respect to a dataset.
A learning task is a function from a training set to the validation error, which can be represented by a trained deep neural network (DNN)
The "dataset derivative" is a linear operator, computed around the trained model, that informs how outliers of the weight of each training sample affect the validation error.
arXiv Detail & Related papers (2021-11-18T16:33:12Z) - IADA: Iterative Adversarial Data Augmentation Using Formal Verification
and Expert Guidance [1.599072005190786]
We propose an iterative adversarial data augmentation framework to learn neural network models.
The proposed framework is applied to an artificial 2D dataset, the MNIST dataset, and a human motion dataset.
We show that our training method can improve the robustness and accuracy of the learned model.
arXiv Detail & Related papers (2021-08-16T03:05:53Z) - HYDRA: Hypergradient Data Relevance Analysis for Interpreting Deep
Neural Networks [51.143054943431665]
We propose Hypergradient Data Relevance Analysis, or HYDRA, which interprets predictions made by deep neural networks (DNNs) as effects of their training data.
HYDRA assesses the contribution of training data toward test data points throughout the training trajectory.
In addition, we quantitatively demonstrate that HYDRA outperforms influence functions in accurately estimating data contribution and detecting noisy data labels.
arXiv Detail & Related papers (2021-02-04T10:00:13Z) - Predicting Training Time Without Training [120.92623395389255]
We tackle the problem of predicting the number of optimization steps that a pre-trained deep network needs to converge to a given value of the loss function.
We leverage the fact that the training dynamics of a deep network during fine-tuning are well approximated by those of a linearized model.
We are able to predict the time it takes to fine-tune a model to a given loss without having to perform any training.
arXiv Detail & Related papers (2020-08-28T04:29:54Z) - RIFLE: Backpropagation in Depth for Deep Transfer Learning through
Re-Initializing the Fully-connected LayEr [60.07531696857743]
Fine-tuning the deep convolution neural network(CNN) using a pre-trained model helps transfer knowledge learned from larger datasets to the target task.
We propose RIFLE - a strategy that deepens backpropagation in transfer learning settings.
RIFLE brings meaningful updates to the weights of deep CNN layers and improves low-level feature learning.
arXiv Detail & Related papers (2020-07-07T11:27:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.