Distilling Influences to Mitigate Prediction Churn in Graph Neural Networks
- URL: http://arxiv.org/abs/2310.00946v1
- Date: Mon, 2 Oct 2023 07:37:28 GMT
- Title: Distilling Influences to Mitigate Prediction Churn in Graph Neural Networks
- Authors: Andreas Roth, Thomas Liebig
- Abstract summary: Models with similar performances exhibit significant disagreement in the predictions of individual samples, referred to as prediction churn.
We propose a novel metric called Influence Difference (ID) to quantify the variation in reasons used by nodes across models.
We also consider the differences between nodes with a stable and an unstable prediction, positing that both equally utilize different reasons.
As an efficient approximation, we introduce DropDistillation (DD) that matches the output for a graph perturbed by edge deletions.
- Score: 4.213427823201119
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Models with similar performances exhibit significant disagreement in the
predictions of individual samples, referred to as prediction churn. Our work
explores this phenomenon in graph neural networks by investigating how models
that differ only in their initializations differ in the features they use for
predictions. We propose a novel metric called Influence Difference
(ID) to quantify the variation in reasons used by nodes across models by
comparing their influence distribution. Additionally, we consider the
differences between nodes with a stable and an unstable prediction, positing
that both utilize different reasons to a similar extent and thus provide a
meaningful gradient signal for closely matching two models even when their
predictions are similar. Based on our analysis, we propose to minimize this ID in Knowledge
Distillation, a domain where a new model should closely match an established
one. As an efficient approximation, we introduce DropDistillation (DD) that
matches the output for a graph perturbed by edge deletions. Our empirical
evaluation on six benchmark datasets for node classification validates the
differences in utilized features. DD outperforms previous methods regarding
prediction stability and overall performance in all considered Knowledge
Distillation experiments.
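The abstract describes three computable quantities: prediction churn between two trained models, the Influence Difference (ID) obtained by comparing influence distributions, and the DropDistillation (DD) objective that matches outputs on an edge-dropped graph. The sketch below illustrates how these could be instantiated; it is not the authors' code, and the Jacobian-based notion of influence, the L1 distance, the drop probability, and the loss weighting `alpha` are illustrative assumptions.

```python
# Illustrative sketch only (not the authors' implementation). Assumes a GNN
# callable as `model(x, edge_index)` that returns node logits.
import torch
import torch.nn.functional as F


def prediction_churn(logits_a, logits_b):
    """Fraction of nodes on which two trained models disagree (argmax predictions)."""
    return (logits_a.argmax(-1) != logits_b.argmax(-1)).float().mean().item()


def influence_distribution(model, x, edge_index, target_node):
    """One common notion of influence: L1 norm of the Jacobian of the target
    node's output w.r.t. every node's input features, normalised to a distribution."""
    x = x.clone().requires_grad_(True)
    out = model(x, edge_index)[target_node]
    infl = torch.zeros(x.size(0), device=x.device)
    for c in range(out.numel()):
        (grad,) = torch.autograd.grad(out[c], x, retain_graph=True)
        infl += grad.abs().sum(dim=-1)
    return infl / infl.sum()


def influence_difference(model_a, model_b, x, edge_index, target_node):
    """Influence Difference (ID) for one node, here taken as the L1 distance
    between the two models' influence distributions (an assumed instantiation)."""
    p = influence_distribution(model_a, x, edge_index, target_node)
    q = influence_distribution(model_b, x, edge_index, target_node)
    return (p - q).abs().sum().item()


def drop_edges(edge_index, p=0.2):
    """Randomly delete a fraction p of edges (the perturbation DD matches on)."""
    keep = torch.rand(edge_index.size(1), device=edge_index.device) >= p
    return edge_index[:, keep]


def drop_distillation_loss(student, teacher, x, edge_index, y, train_mask,
                           p=0.2, alpha=0.5):
    """DropDistillation-style objective: supervised loss on labelled nodes plus a
    term matching student and teacher outputs on the same edge-dropped graph."""
    pert = drop_edges(edge_index, p)
    with torch.no_grad():
        t_logits = teacher(x, pert)
    s_logits = student(x, pert)
    ce = F.cross_entropy(s_logits[train_mask], y[train_mask])
    kd = F.kl_div(F.log_softmax(s_logits, -1), F.softmax(t_logits, -1),
                  reduction="batchmean")
    return (1 - alpha) * ce + alpha * kd
```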
Related papers
- Sub-graph Based Diffusion Model for Link Prediction [43.15741675617231]
Denoising Diffusion Probabilistic Models (DDPMs) represent a contemporary class of generative models with exceptional qualities.
We build a novel generative model for link prediction using a dedicated design to decompose the likelihood estimation process via the Bayesian formula.
Our proposed method presents numerous advantages: (1) transferability across datasets without retraining, (2) promising generalization on limited training data, and (3) robustness against graph adversarial attacks.
arXiv Detail & Related papers (2024-09-13T02:23:55Z)
- Generation is better than Modification: Combating High Class Homophily Variance in Graph Anomaly Detection [51.11833609431406]
In graph anomaly detection datasets, the homophily distribution differences between classes are significantly greater than those in homophilic and heterophilic graphs.
We introduce a new metric called Class Homophily Variance, which quantitatively describes this phenomenon.
To mitigate its impact, we propose a novel GNN model named Homophily Edge Generation Graph Neural Network (HedGe).
arXiv Detail & Related papers (2024-03-15T14:26:53Z)
- Confidence and Dispersity Speak: Characterising Prediction Matrix for Unsupervised Accuracy Estimation [51.809741427975105]
This work aims to assess how well a model performs under distribution shifts without using labels.
We use the nuclear norm of the prediction matrix, which has been shown to be effective in characterizing both prediction confidence and dispersity.
We show that the nuclear norm is more accurate and robust for accuracy estimation than existing methods.
arXiv Detail & Related papers (2023-02-02T13:30:48Z)
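A minimal sketch of the nuclear-norm score summarized above, assuming it is computed on the softmax prediction matrix of an unlabelled test set; the normalisation shown is an illustrative choice, not necessarily the paper's exact formulation.

```python
# Hedged sketch of a nuclear-norm accuracy proxy: larger values indicate more
# confident and more dispersed predictions, which the paper links to higher
# accuracy under distribution shift.
import torch
import torch.nn.functional as F


def nuclear_norm_score(logits: torch.Tensor) -> float:
    """Nuclear norm of the N x C softmax prediction matrix, normalised by an
    upper bound sqrt(min(N, C) * N) so scores are comparable across test sets."""
    probs = F.softmax(logits, dim=-1)                   # [N, C]
    n, c = probs.shape
    nuc = torch.linalg.matrix_norm(probs, ord="nuc")    # sum of singular values
    return (nuc / (min(n, c) * n) ** 0.5).item()
```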
- Do Bayesian Variational Autoencoders Know What They Don't Know? [0.6091702876917279]
The problem of detecting Out-of-Distribution (OoD) inputs is of paramount importance for Deep Neural Networks.
It has been previously shown that even Deep Generative Models that allow estimating the density of the inputs may not be reliable.
This paper investigates three approaches to inference: Markov chain Monte Carlo, Bayes by Backpropagation, and Stochastic Weight Averaging-Gaussian (SWAG).
arXiv Detail & Related papers (2022-12-29T11:48:01Z)
- On the Strong Correlation Between Model Invariance and Generalization [54.812786542023325]
Generalization captures a model's ability to classify unseen data.
Invariance measures consistency of model predictions on transformations of the data.
From a dataset-centric view, we find that a model's accuracy and invariance are linearly correlated across different test sets.
arXiv Detail & Related papers (2022-07-14T17:08:25Z)
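A minimal sketch of measuring invariance as prediction agreement under a transformation, which can then be correlated with accuracy across test sets; the horizontal-flip transform and the agreement measure are illustrative assumptions rather than the paper's exact invariance score.

```python
# Hedged sketch: invariance as the fraction of samples whose predicted class is
# unchanged by a transformation (here a horizontal flip of image tensors).
import torch


def invariance_score(model, images: torch.Tensor) -> float:
    """Prediction agreement between original and flipped inputs."""
    with torch.no_grad():
        pred = model(images).argmax(-1)
        pred_t = model(torch.flip(images, dims=[-1])).argmax(-1)  # flip width axis
    return (pred == pred_t).float().mean().item()
```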
- On the Prediction Instability of Graph Neural Networks [2.3605348648054463]
Instability of trained models can affect reliability and trust in machine learning systems.
We systematically assess the prediction instability of node classification with state-of-the-art Graph Neural Networks (GNNs).
We find that up to one third of the incorrectly classified nodes differ across algorithm runs.
arXiv Detail & Related papers (2022-05-20T10:32:59Z)
- Heterogeneous Peer Effects in the Linear Threshold Model [13.452510519858995]
The Linear Threshold Model describes how information diffuses through a social network.
We propose causal inference methods for estimating individual thresholds that can more accurately predict whether and when individuals will be affected by their peers.
Our experimental results on synthetic and real-world datasets show that our proposed models can better predict individual-level thresholds in the Linear Threshold Model.
arXiv Detail & Related papers (2022-01-27T00:23:26Z)
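The Linear Threshold Model referenced above is a standard diffusion process in which a node activates once the summed weight of its active neighbours exceeds its individual threshold. A minimal simulation sketch, assuming uniform edge weights:

```python
# Hedged sketch of Linear Threshold Model dynamics. Uniform weights 1/deg are
# an assumption for illustration; the paper estimates individual thresholds.
from typing import Dict, List, Set


def run_ltm(neighbors: Dict[int, List[int]], thresholds: Dict[int, float],
            seeds: Set[int]) -> Set[int]:
    """Iterate until no further node activates; returns the final active set."""
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, in_nbrs in neighbors.items():
            if node in active or not in_nbrs:
                continue
            weight = 1.0 / len(in_nbrs)  # uniform influence weight per neighbour
            if sum(weight for n in in_nbrs if n in active) >= thresholds[node]:
                active.add(node)
                changed = True
    return active
```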
- Deconfounding to Explanation Evaluation in Graph Neural Networks [136.73451468551656]
We argue that a distribution shift exists between the full graph and the subgraph, causing the out-of-distribution problem.
We propose Deconfounded Subgraph Evaluation (DSE) which assesses the causal effect of an explanatory subgraph on the model prediction.
arXiv Detail & Related papers (2022-01-21T18:05:00Z)
- Predicting with Confidence on Unseen Distributions [90.68414180153897]
We connect domain adaptation and predictive uncertainty literature to predict model accuracy on challenging unseen distributions.
We find that the difference of confidences (DoC) of a classifier's predictions successfully estimates the classifier's performance change over a variety of shifts.
We specifically investigate the distinction between synthetic and natural distribution shifts and observe that despite its simplicity DoC consistently outperforms other quantifications of distributional difference.
arXiv Detail & Related papers (2021-07-07T15:50:18Z)
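A minimal sketch of the difference-of-confidences (DoC) idea summarized above: compare the average maximum softmax confidence on a source validation set with that on a shifted target set and use the drop to estimate the change in accuracy; any regression or calibration step the paper applies on top of DoC is omitted here.

```python
# Hedged sketch of difference of confidences (DoC): the drop in average
# max-softmax confidence from source to target approximates the drop in accuracy.
import torch
import torch.nn.functional as F


def avg_confidence(logits: torch.Tensor) -> float:
    """Mean maximum softmax probability over a set of predictions."""
    return F.softmax(logits, dim=-1).max(dim=-1).values.mean().item()


def doc_accuracy_estimate(source_logits, target_logits, source_accuracy):
    """Estimate target accuracy as source accuracy minus the confidence drop."""
    doc = avg_confidence(source_logits) - avg_confidence(target_logits)
    return source_accuracy - doc
```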
- Toward Scalable and Unified Example-based Explanation and Outlier Detection [128.23117182137418]
We argue for a broader adoption of prototype-based student networks capable of providing an example-based explanation for their prediction.
We show that our prototype-based networks, which go beyond similarity kernels, deliver meaningful explanations and promising outlier detection results without compromising classification accuracy.
arXiv Detail & Related papers (2020-11-11T05:58:17Z)