Going beyond accuracy: estimating homophily in social networks using predictions
- URL: http://arxiv.org/abs/2001.11171v1
- Date: Thu, 30 Jan 2020 04:37:12 GMT
- Title: Going beyond accuracy: estimating homophily in social networks using predictions
- Authors: George Berry, Antonio Sirianni, Ingmar Weber, Jisun An, Michael Macy
- Abstract summary: In online social networks, it is common to use predictions of node categories to estimate measures of homophily.
We show that estimating homophily in a network can be viewed as a dyadic prediction problem.
We propose a novel "ego-alter" modeling approach that outperforms standard node and dyad classification strategies.
- Score: 5.135290600093722
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In online social networks, it is common to use predictions of node categories
to estimate measures of homophily and other relational properties. However,
online social network data often lacks basic demographic information about the
nodes. Researchers must rely on predicted node attributes to estimate measures
of homophily, but little is known about the validity of these measures. We show
that estimating homophily in a network can be viewed as a dyadic prediction
problem, and that homophily estimates are unbiased when dyad-level residuals
sum to zero in the network. Node-level prediction models, such as the use of
names to classify ethnicity or gender, do not generally have this property and
can introduce large biases into homophily estimates. Bias occurs due to error
autocorrelation along dyads. Importantly, node-level classification performance
is not a reliable indicator of estimation accuracy for homophily. We compare
estimation strategies that make predictions at the node and dyad levels,
evaluating performance in different settings. We propose a novel "ego-alter"
modeling approach that outperforms standard node and dyad classification
strategies. While this paper focuses on homophily, results generalize to other
relational measures which aggregate predictions along the dyads in a network.
We conclude with suggestions for research designs to study homophily in online
networks. Code for this paper is available at
https://github.com/georgeberry/autocorr.
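The core pitfall the abstract describes, an accurate node classifier still producing a biased homophily estimate, is easy to reproduce. Below is a minimal illustrative sketch (not the authors' code; their repository is linked above) in which independent 15% label errors attenuate the dyad-level homophily estimate even though node accuracy is 85%:

```python
import random

random.seed(0)

# Toy network: nodes carry a binary attribute; same-attribute pairs are
# more likely to be linked, so the network is homophilous by construction.
n = 2000
true_label = [random.random() < 0.5 for _ in range(n)]
edges = []
for _ in range(10000):
    i, j = random.randrange(n), random.randrange(n)
    if i == j:
        continue
    p = 0.9 if true_label[i] == true_label[j] else 0.4
    if random.random() < p:
        edges.append((i, j))

def homophily(labels, edges):
    """Fraction of edges whose endpoints share the same label."""
    same = sum(labels[i] == labels[j] for i, j in edges)
    return same / len(edges)

true_h = homophily(true_label, edges)

# Node-level classifier with 85% accuracy and independent errors.
flip = [random.random() < 0.15 for _ in range(n)]
pred = [l ^ f for l, f in zip(true_label, flip)]
est_h = homophily(pred, edges)

print(f"true homophily      : {true_h:.3f}")
print(f"estimate (predicted): {est_h:.3f}")  # attenuated despite 85% accuracy
```

In the paper's framing, the estimate is unbiased only when dyad-level residuals sum to zero; independent node-level errors violate this, and errors that are autocorrelated along dyads can bias the estimate further in either direction.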
Related papers
- RoCP-GNN: Robust Conformal Prediction for Graph Neural Networks in Node-Classification [0.0]
Graph Neural Networks (GNNs) have emerged as powerful tools for predicting outcomes in graph-structured data.
One way to quantify prediction uncertainty is to provide prediction sets that contain the true label with a predefined probability margin.
We propose a novel approach termed Robust Conformal Prediction for GNNs (RoCP-GNN).
Our approach robustly predicts outcomes with any predictive GNN model while quantifying the uncertainty in predictions within the realm of graph-based semi-supervised learning (SSL).
arXiv Detail & Related papers (2024-08-25T12:51:19Z)
- Enhancing octree-based context models for point cloud geometry compression with attention-based child node number prediction [12.074555015414886]
In point cloud geometry compression, most octree-based context models use the cross-entropy between the one-hot encoding of node occupancy and the probability distribution predicted by the context model as the loss.
We first analyze why the cross-entropy loss function fails to accurately measure the difference between the one-hot encoding and the predicted probability distribution.
We propose an attention-based child node number prediction (ACNP) module to enhance the context models.
arXiv Detail & Related papers (2024-07-11T14:16:41Z)
- Generation is better than Modification: Combating High Class Homophily Variance in Graph Anomaly Detection [51.11833609431406]
In graph anomaly detection, homophily distribution differences between classes are significantly greater than in ordinary homophilic and heterophilic graphs.
We introduce a new metric called Class Homophily Variance, which quantitatively describes this phenomenon.
To mitigate its impact, we propose a novel GNN model named Homophily Edge Generation Graph Neural Network (HedGe).
arXiv Detail & Related papers (2024-03-15T14:26:53Z)
- Graph Out-of-Distribution Generalization via Causal Intervention [69.70137479660113]
We introduce a conceptually simple yet principled approach for training robust graph neural networks (GNNs) under node-level distribution shifts.
Our method resorts to a new learning objective derived from causal inference that coordinates an environment estimator and a mixture-of-expert GNN predictor.
Our model can effectively enhance generalization under various types of distribution shifts and yields up to a 27.4% accuracy improvement over state-of-the-art methods on graph OOD generalization benchmarks.
arXiv Detail & Related papers (2024-02-18T07:49:22Z)
- Uncertainty Quantification over Graph with Conformalized Graph Neural Networks [52.20904874696597]
Graph Neural Networks (GNNs) are powerful machine learning prediction models on graph-structured data.
However, GNNs lack rigorous uncertainty estimates, limiting their reliable deployment in settings where the cost of errors is significant.
We propose conformalized GNN (CF-GNN), extending conformal prediction (CP) to graph-based models for guaranteed uncertainty estimates.
arXiv Detail & Related papers (2023-05-23T21:38:23Z)
- Distribution Free Prediction Sets for Node Classification [0.0]
We leverage recent advances in conformal prediction to construct prediction sets for node classification in inductive learning scenarios.
We show through experiments on standard benchmark datasets using popular GNN models that our approach provides tighter and better prediction sets than a naive application of conformal prediction.
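The split-conformal recipe underlying prediction sets like these can be sketched in a few lines. The simulated softmax outputs below are a hypothetical stand-in for a trained GNN's class probabilities:

```python
import random

random.seed(1)

# Simulated "softmax" outputs for a 3-class problem: the true class gets
# extra probability mass, mimicking a reasonably accurate classifier.
def fake_softmax(true_cls, k=3, boost=2.0):
    w = [random.random() for _ in range(k)]
    w[true_cls] += boost
    s = sum(w)
    return [x / s for x in w]

# Calibration set: conformity score = 1 - probability of the true class.
n_cal, alpha = 500, 0.1
cal_scores = []
for _ in range(n_cal):
    y = random.randrange(3)
    probs = fake_softmax(y)
    cal_scores.append(1 - probs[y])

# Split-conformal quantile with the finite-sample (n+1) correction.
cal_scores.sort()
q_idx = min(n_cal - 1, int((n_cal + 1) * (1 - alpha)))
qhat = cal_scores[q_idx]

# Prediction set for a new example: all classes whose score is below qhat.
def prediction_set(probs, qhat):
    return [c for c, p in enumerate(probs) if 1 - p <= qhat]

# Empirical coverage on fresh test points should be close to 1 - alpha.
hits, n_test = 0, 1000
for _ in range(n_test):
    y = random.randrange(3)
    probs = fake_softmax(y)
    if y in prediction_set(probs, qhat):
        hits += 1
print(f"empirical coverage: {hits / n_test:.3f}")
```

This marginal coverage guarantee requires only exchangeability between calibration and test points; graph-specific methods such as those above work to preserve it, and to tighten the sets, under graph dependence.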
arXiv Detail & Related papers (2022-11-26T12:54:45Z)
- Dense Uncertainty Estimation [62.23555922631451]
In this paper, we investigate neural networks and uncertainty estimation techniques to achieve both accurate deterministic prediction and reliable uncertainty estimation.
We work on two types of uncertainty estimation solutions, namely ensemble-based methods and generative model-based methods, and explain their pros and cons when using them in fully-, semi-, and weakly-supervised frameworks.
arXiv Detail & Related papers (2021-10-13T01:23:48Z)
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
However, they are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
- Estimation with Uncertainty via Conditional Generative Adversarial Networks [3.829070379776576]
We propose a predictive probabilistic neural network model, which corresponds to a different manner of using the generator in a conditional Generative Adversarial Network (cGAN).
By reversing the input and output of an ordinary cGAN, the model can be successfully used as a predictive model.
In addition, to measure the uncertainty of predictions, we introduce the entropy and relative entropy for regression problems and classification problems.
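As a generic illustration of entropy and relative entropy as classification uncertainty scores (a sketch of the standard quantities, not the paper's cGAN model):

```python
import math

# Shannon entropy of a predictive distribution: a standard uncertainty
# score for classification, where higher entropy means a less confident
# prediction (maximum is log(k) for k classes).
def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0)

# Relative entropy (KL divergence) of the prediction from a reference
# distribution, e.g. the uniform prior over classes.
def relative_entropy(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

confident = [0.90, 0.05, 0.05]
uncertain = [0.34, 0.33, 0.33]
uniform = [1 / 3] * 3

print(entropy(confident))                    # low: model is sure
print(entropy(uncertain))                    # near log(3): model is unsure
print(relative_entropy(confident, uniform))  # far from the uniform prior
```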
arXiv Detail & Related papers (2020-07-01T08:54:17Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
- Sampling Prediction-Matching Examples in Neural Networks: A Probabilistic Programming Approach [9.978961706999833]
We consider the problem of exploring the prediction level sets of a classifier using probabilistic programming.
We define a prediction level set to be the set of examples for which the predictor has the same specified prediction confidence.
We demonstrate this technique with experiments on a synthetic dataset and MNIST.
arXiv Detail & Related papers (2020-01-09T15:57:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.