Handling Missing Data with Graph Representation Learning
- URL: http://arxiv.org/abs/2010.16418v1
- Date: Fri, 30 Oct 2020 17:59:13 GMT
- Title: Handling Missing Data with Graph Representation Learning
- Authors: Jiaxuan You, Xiaobai Ma, Daisy Yi Ding, Mykel Kochenderfer, Jure
Leskovec
- Abstract summary: We propose GRAPE, a graph-based framework for feature imputation as well as label prediction.
Under GRAPE, the feature imputation is formulated as an edge-level prediction task and the label prediction as a node-level prediction task.
Experimental results on nine benchmark datasets show that GRAPE yields 20% lower mean absolute error for imputation tasks and 10% lower for label prediction tasks.
- Score: 62.59831675688714
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning with missing data has been approached in two
different ways: feature imputation, where missing feature values are estimated
based on observed values, and label prediction, where downstream labels are
learned directly from incomplete data. However, existing imputation models tend to have
strong prior assumptions and cannot learn from downstream tasks, while models
targeting label prediction often involve heuristics and can encounter
scalability issues. Here we propose GRAPE, a graph-based framework for feature
imputation as well as label prediction. GRAPE tackles the missing data problem
using a graph representation, where the observations and features are viewed as
two types of nodes in a bipartite graph, and the observed feature values as
edges. Under the GRAPE framework, the feature imputation is formulated as an
edge-level prediction task and the label prediction as a node-level prediction
task. These tasks are then solved with Graph Neural Networks. Experimental
results on nine benchmark datasets show that GRAPE yields 20% lower mean
absolute error for imputation tasks and 10% lower for label prediction tasks,
compared with existing state-of-the-art methods.
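The abstract's core construction is concrete enough to sketch: observations and features become the two node sets of a bipartite graph, and each observed entry of the data matrix becomes an edge carrying its value, so imputing a missing value amounts to predicting a missing edge. The following is a minimal sketch of that graph construction in plain NumPy (the function name and node-indexing scheme are illustrative assumptions, not the paper's code; the actual GRAPE model then runs a GNN over this graph):

```python
import numpy as np

def build_bipartite_graph(X):
    """Build a GRAPE-style bipartite graph from a data matrix X,
    where np.nan marks missing entries.

    Observation nodes are indexed 0..n-1; feature nodes n..n+d-1.
    Each observed entry X[i, j] becomes an edge (i, n + j) whose
    attribute is the observed value itself.
    """
    n, d = X.shape
    edges, edge_vals = [], []
    for i in range(n):
        for j in range(d):
            if not np.isnan(X[i, j]):
                edges.append((i, n + j))   # observation i -- feature j
                edge_vals.append(X[i, j])  # edge attribute = observed value
    return edges, edge_vals

# Toy matrix: 3 observations, 2 features, two missing entries.
X = np.array([[1.0, np.nan],
              [2.0, 3.0],
              [np.nan, 4.0]])
edges, vals = build_bipartite_graph(X)
# Imputing X[0, 1] is then edge-level prediction for the absent edge (0, 4);
# predicting a label for observation 0 is node-level prediction on node 0.
print(edges)  # [(0, 3), (1, 3), (1, 4), (2, 4)]
```

This is why the paper can treat imputation and label prediction in one framework: both are standard GNN prediction tasks (edge-level and node-level) on the same graph.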
Related papers
- Mitigating Label Noise on Graph via Topological Sample Selection [72.86862597508077]
We propose a Topological Sample Selection (TSS) method that boosts the informative sample selection process in a graph by utilising topological information.
We theoretically prove that our procedure minimizes an upper bound of the expected risk under target clean distribution, and experimentally show the superiority of our method compared with state-of-the-art baselines.
arXiv Detail & Related papers (2024-03-04T11:24:51Z)
- Towards Self-Interpretable Graph-Level Anomaly Detection [73.1152604947837]
Graph-level anomaly detection (GLAD) aims to identify graphs that exhibit notable dissimilarity compared to the majority in a collection.
We propose a Self-Interpretable Graph aNomaly dETection model (SIGNET) that detects anomalous graphs as well as generates informative explanations simultaneously.
arXiv Detail & Related papers (2023-10-25T10:10:07Z)
- Towards Semi-supervised Universal Graph Classification [6.339931887475018]
We study the problem of semi-supervised universal graph classification.
This problem is challenging due to a severe lack of labels and potential class shifts.
We propose a novel graph neural network framework named UGNN, which makes the best of unlabeled data from the subgraph perspective.
arXiv Detail & Related papers (2023-05-31T06:58:34Z)
- Semi-Supervised Graph Imbalanced Regression [17.733488328772943]
We propose a semi-supervised framework to progressively balance training data and reduce model bias via self-training.
Results demonstrate that the proposed framework significantly reduces the error of predicted graph properties.
arXiv Detail & Related papers (2023-05-20T04:11:00Z)
- Data-Centric Learning from Unlabeled Graphs with Diffusion Model [21.417410006246147]
We propose to extract the knowledge underlying the large set of unlabeled graphs as a specific set of useful data points.
We use a diffusion model to fully utilize the unlabeled graphs and design two new objectives to guide the model's denoising process.
Experiments demonstrate that our data-centric approach performs significantly better than fifteen existing methods on fifteen tasks.
arXiv Detail & Related papers (2023-03-17T16:39:21Z)
- Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have shown powerful capacity at modeling structured data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z)
- Graph Self-supervised Learning with Accurate Discrepancy Learning [64.69095775258164]
We propose a framework that aims to learn the exact discrepancy between the original and the perturbed graphs, coined Discrepancy-based Self-supervised LeArning (D-SLA).
We validate our method on various graph-related downstream tasks, including molecular property prediction, protein function prediction, and link prediction tasks, on which our model largely outperforms relevant baselines.
arXiv Detail & Related papers (2022-02-07T08:04:59Z)
- Node Classification Meets Link Prediction on Knowledge Graphs [16.37145148171519]
We study the problems of transductive node classification over incomplete graphs and link prediction over graphs with node features.
Our model performs very strongly when compared to the respective state-of-the-art models for node classification and link prediction.
arXiv Detail & Related papers (2021-06-14T10:52:52Z)
- Line Graph Neural Networks for Link Prediction [71.00689542259052]
We consider the graph link prediction task, which is a classic graph analytical problem with many real-world applications.
In this formalism, a link prediction problem is converted to a graph classification task.
We propose to seek a radically different and novel path by making use of the line graphs in graph theory.
In particular, each node in a line graph corresponds to a unique edge in the original graph. Therefore, link prediction problems in the original graph can be equivalently solved as a node classification problem in its corresponding line graph, instead of a graph classification task.
arXiv Detail & Related papers (2020-10-20T05:54:31Z)
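The line-graph reduction described in the entry above is simple enough to sketch directly: each edge of the original graph becomes a node of the line graph, and two line-graph nodes are adjacent exactly when their edges share an endpoint, so link prediction becomes node classification. A minimal pure-Python illustration (the `line_graph` helper is a hypothetical name written for this sketch, not code from the paper):

```python
from itertools import combinations

def line_graph(edges):
    """Build the line graph of an undirected graph given as an edge list.
    Nodes of the line graph are the original edges; two such nodes are
    adjacent iff the corresponding edges share an endpoint."""
    nodes = [tuple(sorted(e)) for e in edges]
    adj = [(a, b) for a, b in combinations(nodes, 2)
           if set(a) & set(b)]  # shared endpoint -> adjacent in L(G)
    return nodes, adj

# Path graph 0-1-2-3 has three edges, hence three line-graph nodes.
nodes, adj = line_graph([(0, 1), (1, 2), (2, 3)])
print(nodes)  # [(0, 1), (1, 2), (2, 3)]
print(adj)    # [((0, 1), (1, 2)), ((1, 2), (2, 3))]
# Deciding whether the candidate link (0, 3) should exist in the original
# graph is then a classification question about the node (0, 3) in L(G).
```

The payoff claimed in the abstract is that this turns a graph-classification-style formulation of link prediction into an ordinary node-classification task on the corresponding line graph.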
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.