TabGraphs: A Benchmark and Strong Baselines for Learning on Graphs with Tabular Node Features
- URL: http://arxiv.org/abs/2409.14500v2
- Date: Thu, 26 Sep 2024 15:26:43 GMT
- Title: TabGraphs: A Benchmark and Strong Baselines for Learning on Graphs with Tabular Node Features
- Authors: Gleb Bazhenov, Oleg Platonov, Liudmila Prokhorenkova
- Abstract summary: Tabular machine learning may benefit from graph machine learning methods.
Graph neural networks (GNNs) can indeed often bring gains in predictive performance.
Simple feature preprocessing enables standard tabular models to compete with and even outperform GNNs.
- Score: 17.277932238538302
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tabular machine learning is an important field for industry and science. In this field, table rows are usually treated as independent data samples, but additional information about relations between them is sometimes available and can be used to improve predictive performance. Such information can be naturally modeled with a graph, thus tabular machine learning may benefit from graph machine learning methods. However, graph machine learning models are typically evaluated on datasets with homogeneous node features, which have little in common with heterogeneous mixtures of numerical and categorical features present in tabular datasets. Thus, there is a critical difference between the data used in tabular and graph machine learning studies, which does not allow one to understand how successfully graph models can be transferred to tabular data. To bridge this gap, we propose a new benchmark of diverse graphs with heterogeneous tabular node features and realistic prediction tasks. We use this benchmark to evaluate a vast set of models, including simple methods previously overlooked in the literature. Our experiments show that graph neural networks (GNNs) can indeed often bring gains in predictive performance for tabular data, but standard tabular models can also be adapted to work with graph data by using simple feature preprocessing, which sometimes enables them to compete with and even outperform GNNs. Based on our empirical study, we provide insights for researchers and practitioners in both tabular and graph machine learning fields.
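The abstract does not spell out the "simple feature preprocessing" it refers to; a common recipe of this kind, and a plausible reading, is to augment each table row with aggregated features of its graph neighbors before feeding the rows to a standard tabular model (e.g. gradient-boosted trees). A minimal sketch, assuming mean aggregation over one-hop neighbors; the function name and details are illustrative, not taken from the paper:

```python
import numpy as np

def augment_with_neighbor_means(X, edges):
    """Append each node's mean neighbor feature vector to its own features.

    X     : (n_nodes, n_feats) array of tabular node features
    edges : iterable of undirected (u, v) index pairs
    """
    n, d = X.shape
    sums = np.zeros((n, d))
    deg = np.zeros(n)
    for u, v in edges:
        sums[u] += X[v]
        sums[v] += X[u]
        deg[u] += 1
        deg[v] += 1
    means = sums / np.maximum(deg, 1)[:, None]  # isolated nodes keep zeros
    return np.hstack([X, means])  # shape (n_nodes, 2 * n_feats)
```

The augmented matrix can then be passed to any off-the-shelf tabular learner, which sees graph information only through the extra columns.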
Related papers
- GraphGLOW: Universal and Generalizable Structure Learning for Graph Neural Networks [72.01829954658889]
This paper formally defines the novel problem of learning graph structures that generalize across graphs.
We devise a general framework that coordinates a single graph-shared structure learner and multiple graph-specific GNNs.
The well-trained structure learner can directly produce adaptive structures for unseen target graphs without any fine-tuning.
arXiv Detail & Related papers (2023-06-20T03:33:22Z) - Between-Sample Relationship in Learning Tabular Data Using Graph and Attention Networks [0.0]
This paper relaxes the i.i.d. assumption to learn tabular data representations by incorporating between-sample relationships.
We investigate our hypothesis using several GNNs and state-of-the-art (SOTA) deep attention models.
Our results reveal that attention-based GNN methods outperform traditional machine learning on five data sets and SOTA deep tabular learning methods on three data sets.
arXiv Detail & Related papers (2023-06-11T20:56:21Z) - Bures-Wasserstein Means of Graphs [60.42414991820453]
We propose a novel framework for defining a graph mean via embeddings in the space of smooth graph signal distributions.
By finding a mean in this embedding space, we can recover a mean graph that preserves structural information.
We establish the existence and uniqueness of the novel graph mean, and provide an iterative algorithm for computing it.
arXiv Detail & Related papers (2023-05-31T11:04:53Z) - TabGSL: Graph Structure Learning for Tabular Data Prediction [10.66048003460524]
We present a novel solution, Tabular Graph Structure Learning (TabGSL), to enhance tabular data prediction.
Experiments conducted on 30 benchmark datasets demonstrate that TabGSL markedly outperforms both tree-based models and recent deep learning-based models.
arXiv Detail & Related papers (2023-05-25T08:33:48Z) - Self-supervised Learning for Heterogeneous Graph via Structure Information based on Metapath [9.757299837675204]
Self-supervised representation learning is a potential approach to tackle this problem.
In this paper, we propose SESIM, a SElf-supervised learning method for heterogeneous graphs via Structure Information based on Metapaths.
To predict the jump number, SESIM uses the data itself to generate labels, avoiding time-consuming manual labeling.
arXiv Detail & Related papers (2022-09-09T10:06:18Z) - Data augmentation on graphs for table type classification [1.1859913430860336]
We address the classification of tables using a Graph Neural Network, exploiting the table structure for the message passing algorithm in use.
We propose a data augmentation method suitable for graph-based table representations and achieve promising preliminary results.
arXiv Detail & Related papers (2022-08-23T21:54:46Z) - Similarity-aware Positive Instance Sampling for Graph Contrastive Pre-training [82.68805025636165]
We propose to select positive graph instances directly from existing graphs in the training set.
Our selection is based on certain domain-specific pair-wise similarity measurements.
In addition, we develop an adaptive node-level pre-training method that dynamically masks nodes to distribute them evenly in the graph.
arXiv Detail & Related papers (2022-06-23T20:12:51Z) - A Robust Stacking Framework for Training Deep Graph Models with Multifaceted Node Features [61.92791503017341]
Graph Neural Networks (GNNs) with numerical node features and graph structure as inputs have demonstrated superior performance on various supervised learning tasks with graph data.
The best models for such data types in most standard supervised learning settings with IID (non-graph) data are not easily incorporated into a GNN.
Here we propose a robust stacking framework that fuses graph-aware propagation with arbitrary models intended for IID data.
arXiv Detail & Related papers (2022-06-16T22:46:33Z) - Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have shown powerful capacity for modeling structural data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z) - Graph Kernel Neural Networks [53.91024360329517]
We propose to use graph kernels, i.e. kernel functions that compute an inner product on graphs, to extend the standard convolution operator to the graph domain.
This allows us to define an entirely structural model that does not require computing the embedding of the input graph.
Our architecture allows plugging in any type of graph kernel and has the added benefit of providing some interpretability.
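As a minimal, hypothetical illustration of a kernel function that computes an inner product on graphs (far simpler than the kernels such architectures would typically plug in), consider a vertex-label histogram kernel, which embeds each graph as its vector of node-label counts:

```python
from collections import Counter

def vertex_histogram_kernel(labels_g1, labels_g2):
    """k(G1, G2) = <phi(G1), phi(G2)>, where phi counts node-label
    occurrences; labels_g1 and labels_g2 are lists of discrete labels."""
    h1, h2 = Counter(labels_g1), Counter(labels_g2)
    return sum(h1[l] * h2[l] for l in h1.keys() & h2.keys())
```

Any such kernel yields a similarity matrix between graphs without ever computing an explicit graph embedding inside the model.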
arXiv Detail & Related papers (2021-12-14T14:48:08Z) - A Tunable Model for Graph Generation Using LSTM and Conditional VAE [1.399948157377307]
We propose a generative model that can tune specific features, while learning structural features of a graph from data.
With a dataset of graphs with various features generated by a model, we confirm that our model can generate a graph with specific features.
arXiv Detail & Related papers (2021-04-15T06:47:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.