HYDRA: Hypergradient Data Relevance Analysis for Interpreting Deep
Neural Networks
- URL: http://arxiv.org/abs/2102.02515v1
- Date: Thu, 4 Feb 2021 10:00:13 GMT
- Title: HYDRA: Hypergradient Data Relevance Analysis for Interpreting Deep
Neural Networks
- Authors: Yuanyuan Chen, Boyang Li, Han Yu, Pengcheng Wu, Chunyan Miao
- Abstract summary: We propose Hypergradient Data Relevance Analysis, or HYDRA, which interprets predictions made by deep neural networks (DNNs) as effects of their training data.
HYDRA assesses the contribution of training data toward test data points throughout the training trajectory.
In addition, we quantitatively demonstrate that HYDRA outperforms influence functions in accurately estimating data contribution and detecting noisy data labels.
- Score: 51.143054943431665
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The behaviors of deep neural networks (DNNs) are notoriously resistant to
human interpretations. In this paper, we propose Hypergradient Data Relevance
Analysis, or HYDRA, which interprets the predictions made by DNNs as effects of
their training data. Existing approaches generally estimate data contributions
around the final model parameters and ignore how the training data shape the
optimization trajectory. By unrolling the hypergradient of test loss w.r.t. the
weights of training data, HYDRA assesses the contribution of training data
toward test data points throughout the training trajectory. In order to
accelerate computation, we remove the Hessian from the calculation and prove
that, under moderate conditions, the approximation error is bounded.
Corroborating this theoretical claim, empirical results indicate the error is
indeed small. In addition, we quantitatively demonstrate that HYDRA outperforms
influence functions in accurately estimating data contribution and detecting
noisy data labels. The source code is available at
https://github.com/cyyever/aaai_hydra_8686.
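As a rough illustration of the Hessian-free approximation (a minimal sketch, not the released implementation at the repository above): once the Hessian terms are dropped, a training example's contribution to the test loss can be estimated by accumulating learning-rate-weighted inner products between the test-loss gradient and that example's training gradient at checkpoints saved along the trajectory. The helper names (`flat_grad`, `data_relevance`, `checkpoints`) are assumptions made for this sketch.

```python
# Illustrative sketch (not the authors' code) of Hessian-free
# hypergradient data relevance: with the Hessian dropped, example i's
# contribution is approximately lr * sum_t <grad L_test(theta_t),
# grad l_i(theta_t)>, accumulated over saved checkpoints theta_t.
import torch
import torch.nn.functional as F

def flat_grad(loss, params):
    """Flatten the gradient of `loss` w.r.t. `params` into one vector."""
    return torch.cat([g.reshape(-1)
                      for g in torch.autograd.grad(loss, params)])

def data_relevance(model, checkpoints, train_set, test_x, test_y, lr):
    """Score each training example's contribution to one test prediction."""
    scores = torch.zeros(len(train_set))
    for state in checkpoints:  # theta_t along the training trajectory
        model.load_state_dict(state)
        params = [p for p in model.parameters() if p.requires_grad]
        test_grad = flat_grad(F.cross_entropy(model(test_x), test_y), params)
        for i, (x, y) in enumerate(train_set):
            train_grad = flat_grad(
                F.cross_entropy(model(x.unsqueeze(0)), y.unsqueeze(0)),
                params)
            # A positive score means this example's gradient pushed the
            # parameters in a direction that lowers the test loss.
            scores[i] += lr * torch.dot(test_grad, train_grad)
    return scores
```

Under this reading, examples with large positive scores consistently lowered the test loss, while strongly negative scores mark examples that pushed it up; the latter is one natural signal for flagging mislabeled points, in the spirit of the paper's noisy-label experiments.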
Related papers
- An Investigation on Machine Learning Predictive Accuracy Improvement and Uncertainty Reduction using VAE-based Data Augmentation [2.517043342442487]
Deep generative learning uses certain ML models to learn the underlying distribution of existing data and generate synthetic samples that resemble the real data.
In this study, our objective is to evaluate the effectiveness of data augmentation using variational autoencoder (VAE)-based deep generative models.
We investigated whether the data augmentation leads to improved accuracy in the predictions of a deep neural network (DNN) model trained using the augmented data.
arXiv Detail & Related papers (2024-10-24T18:15:48Z)
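For intuition, here is a minimal, generic VAE sketch of the mechanism this summary describes (illustrative only; the study's actual architecture and hyperparameters are not given): fit an encoder/decoder pair to the data distribution, then sample the latent prior to produce synthetic training points.

```python
# Generic VAE data-augmentation sketch (illustrative; not this paper's model).
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, x_dim=20, z_dim=4):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, 64), nn.ReLU())
        self.mu = nn.Linear(64, z_dim)       # posterior mean
        self.logvar = nn.Linear(64, z_dim)   # posterior log-variance
        self.dec = nn.Sequential(
            nn.Linear(z_dim, 64), nn.ReLU(), nn.Linear(64, x_dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        return self.dec(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    recon = ((x - x_hat) ** 2).sum(dim=1).mean()              # reconstruction
    kl = -0.5 * (1 + logvar - mu ** 2 - logvar.exp()).sum(dim=1).mean()
    return recon + kl

# After training, draw synthetic samples from the prior to augment the data:
# synthetic = vae.dec(torch.randn(n_new, z_dim))
```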
- DCLP: Neural Architecture Predictor with Curriculum Contrastive Learning [5.2319020651074215]
We propose DCLP, a curriculum-guided contrastive learning framework for neural predictors.
Our method simplifies the contrastive task by designing a novel curriculum that enhances the stability of the unlabeled training data distribution.
We experimentally demonstrate that DCLP has high accuracy and efficiency compared with existing predictors.
arXiv Detail & Related papers (2023-02-25T08:16:21Z)
- Dataset Distillation: A Comprehensive Review [76.26276286545284]
Dataset distillation (DD) aims to derive a much smaller dataset of synthetic samples such that models trained on it achieve performance comparable to models trained on the original dataset.
This paper gives a comprehensive review and summary of recent advances in DD and its application.
arXiv Detail & Related papers (2023-01-17T17:03:28Z)
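For concreteness, one common DD formulation covered by such reviews is gradient matching; the sketch below (illustrative, with assumed names and shapes, not taken from the review) updates a small synthetic set so that the training gradients it induces mimic those induced by real data.

```python
# Illustrative gradient-matching step for dataset distillation (one common
# DD formulation; names and shapes are assumptions, not from the review).
import torch
import torch.nn.functional as F

def distill_step(model, syn_x, syn_y, real_x, real_y, syn_opt):
    params = [p for p in model.parameters() if p.requires_grad]
    # Gradients induced by real data (treated as a fixed target).
    g_real = torch.autograd.grad(
        F.cross_entropy(model(real_x), real_y), params)
    # Gradients induced by synthetic data; keep the graph so the matching
    # loss can be backpropagated into the synthetic images themselves.
    g_syn = torch.autograd.grad(
        F.cross_entropy(model(syn_x), syn_y), params, create_graph=True)
    match = sum(((gs - gr) ** 2).sum() for gs, gr in zip(g_syn, g_real))
    syn_opt.zero_grad()
    match.backward()
    syn_opt.step()
    return match.item()

# Typical setup: syn_x = torch.randn(10, 3, 32, 32, requires_grad=True)
# with fixed labels syn_y, and syn_opt = torch.optim.SGD([syn_x], lr=0.1).
```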
- Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation [151.70234052015948]
We propose a novel approach that encourages the optimization algorithm to seek a flat trajectory.
We show that weights trained on synthetic data are robust against accumulated-error perturbations when the training trajectory is regularized towards flatness.
Our method, called Flat Trajectory Distillation (FTD), is shown to boost the performance of gradient-matching methods by up to 4.7%.
arXiv Detail & Related papers (2022-11-20T15:49:11Z)
- Discovering Invariant Rationales for Graph Neural Networks [104.61908788639052]
Intrinsic interpretability of graph neural networks (GNNs) is to find a small subset of the input graph's features that guides the model prediction.
We propose a new strategy of discovering invariant rationale (DIR) to construct intrinsically interpretable GNNs.
arXiv Detail & Related papers (2022-01-30T16:43:40Z)
- Distributionally Robust Semi-Supervised Learning Over Graphs [68.29280230284712]
Semi-supervised learning (SSL) over graph-structured data emerges in many network science applications.
To efficiently manage learning over graphs, variants of graph neural networks (GNNs) have been developed recently.
Despite their success in practice, most existing methods are unable to handle graphs with uncertain nodal attributes.
Challenges also arise due to distributional uncertainties associated with data acquired by noisy measurements.
A distributionally robust learning framework is developed, where the objective is to train models that exhibit quantifiable robustness against perturbations.
arXiv Detail & Related papers (2021-10-20T14:23:54Z)
- Influence-guided Data Augmentation for Neural Tensor Completion [21.625908410873944]
We propose DAIN, a general data augmentation framework that enhances the prediction accuracy of neural tensor completion methods.
In this paper, we show that DAIN outperforms all data augmentation baselines in enhancing the imputation accuracy of neural tensor completion.
arXiv Detail & Related papers (2021-08-23T15:38:59Z)
- A Theoretical-Empirical Approach to Estimating Sample Complexity of DNNs [11.152761263415046]
This paper focuses on understanding how the generalization error scales with the amount of training data for deep neural networks (DNNs).
We derive estimates of the generalization error that hold for deep networks and do not rely on unattainable capacity measures.
arXiv Detail & Related papers (2021-05-05T05:14:08Z)
- Artificial Neural Networks to Impute Rounded Zeros in Compositional Data [0.0]
Deep learning methods have become increasingly popular in recent years, but they have not yet found their way into compositional data analysis.
This paper shows a new method for imputing rounded zeros based on artificial neural networks.
It can be shown that ANNs are competitive or even perform better when imputing rounded zeros in data sets of moderate size.
arXiv Detail & Related papers (2020-12-18T15:31:23Z)
- Graph Embedding with Data Uncertainty [113.39838145450007]
Spectral-based subspace learning is a common data preprocessing step in many machine learning pipelines.
Most subspace learning methods do not take into consideration possible measurement inaccuracies or artifacts that can lead to data with high uncertainty.
arXiv Detail & Related papers (2020-09-01T15:08:23Z)