Revisiting the Necessity of Graph Learning and Common Graph Benchmarks
- URL: http://arxiv.org/abs/2412.06173v1
- Date: Mon, 09 Dec 2024 03:09:04 GMT
- Title: Revisiting the Necessity of Graph Learning and Common Graph Benchmarks
- Authors: Isay Katsman, Ethan Lou, Anna Gilbert,
- Abstract summary: Graph machine learning has enjoyed a meteoric rise in popularity since the introduction of deep learning in graph contexts.
The driving belief is that node features are insufficient for these tasks, so benchmark performance accurately reflects improvements in graph learning.
We show that surprisingly, node features are oftentimes more-than-sufficient for these tasks.
- Score: 2.1125997983972207
- License:
- Abstract: Graph machine learning has enjoyed a meteoric rise in popularity since the introduction of deep learning in graph contexts. This is no surprise due to the ubiquity of graph data in large scale industrial settings. Tacitly assumed in all graph learning tasks is the separation of the graph structure and node features: node features strictly encode individual data while the graph structure consists only of pairwise interactions. The driving belief is that node features are (by themselves) insufficient for these tasks, so benchmark performance accurately reflects improvements in graph learning. In our paper, we challenge this orthodoxy by showing that, surprisingly, node features are oftentimes more-than-sufficient for many common graph benchmarks, breaking this critical assumption. When comparing against a well-tuned feature-only MLP baseline on seven of the most commonly used graph learning datasets, one gains little benefit from using graph structure on five datasets. We posit that these datasets do not benefit considerably from graph learning because the features themselves already contain enough graph information to obviate or substantially reduce the need for the graph. To illustrate this point, we perform a feature study on these datasets and show how the features are responsible for closing the gap between MLP and graph-method performance. Further, in service of introducing better empirical measures of progress for graph neural networks, we present a challenging parametric family of principled synthetic datasets that necessitate graph information for nontrivial performance. Lastly, we section out a subset of real-world datasets that are not trivially solved by an MLP and hence serve as reasonable benchmarks for graph neural networks.
Related papers
- The Graph Lottery Ticket Hypothesis: Finding Sparse, Informative Graph
Structure [18.00833762891405]
Graph Lottery Ticket (GLT) Hypothesis: There is an extremely sparse backbone for every graph.
We study 8 key metrics of interest that directly influence the performance of graph learning algorithms.
We propose a straightforward and efficient algorithm for finding these GLTs in arbitrary graphs.
arXiv Detail & Related papers (2023-12-08T00:24:44Z) - Structure-free Graph Condensation: From Large-scale Graphs to Condensed
Graph-free Data [91.27527985415007]
Existing graph condensation methods rely on the joint optimization of nodes and structures in the condensed graph.
We advocate a new Structure-Free Graph Condensation paradigm, named SFGC, to distill a large-scale graph into a small-scale graph node set.
arXiv Detail & Related papers (2023-06-05T07:53:52Z) - There is more to graphs than meets the eye: Learning universal features with self-supervision [2.882036130110936]
We study the problem of learning features through self-supervision that are generalisable to multiple graphs.
Our approach results in (1) better performance on downstream node classification, (2) learning features that can be re-used for unseen graphs of the same family, (3) more efficient training and (4) compact yet generalisable models.
arXiv Detail & Related papers (2023-05-31T14:08:48Z) - State of the Art and Potentialities of Graph-level Learning [54.68482109186052]
Graph-level learning has been applied to many tasks including comparison, regression, classification, and more.
Traditional approaches to learning a set of graphs rely on hand-crafted features, such as substructures.
Deep learning has helped graph-level learning adapt to the growing scale of graphs by extracting features automatically and encoding graphs into low-dimensional representations.
arXiv Detail & Related papers (2023-01-14T09:15:49Z) - Synthetic Graph Generation to Benchmark Graph Learning [7.914804101579097]
Graph learning algorithms have attained state-of-the-art performance on many graph analysis tasks.
One reason is due to the very small number of datasets used in practice to benchmark the performance of graph learning algorithms.
We propose to generate synthetic graphs, and study the behaviour of graph learning algorithms in a controlled scenario.
arXiv Detail & Related papers (2022-04-04T10:48:32Z) - Node Feature Extraction by Self-Supervised Multi-scale Neighborhood
Prediction [123.20238648121445]
We propose a new self-supervised learning framework, Graph Information Aided Node feature exTraction (GIANT)
GIANT makes use of the eXtreme Multi-label Classification (XMC) formalism, which is crucial for fine-tuning the language model based on graph information.
We demonstrate the superior performance of GIANT over the standard GNN pipeline on Open Graph Benchmark datasets.
arXiv Detail & Related papers (2021-10-29T19:55:12Z) - Edge but not Least: Cross-View Graph Pooling [76.71497833616024]
This paper presents a cross-view graph pooling (Co-Pooling) method to better exploit crucial graph structure information.
Through cross-view interaction, edge-view pooling and node-view pooling seamlessly reinforce each other to learn more informative graph-level representations.
arXiv Detail & Related papers (2021-09-24T08:01:23Z) - Co-embedding of Nodes and Edges with Graph Neural Networks [13.020745622327894]
Graph embedding is a way to transform and encode the data structure in high dimensional and non-Euclidean feature space.
CensNet is a general graph embedding framework, which embeds both nodes and edges to a latent feature space.
Our approach achieves or matches the state-of-the-art performance in four graph learning tasks.
arXiv Detail & Related papers (2020-10-25T22:39:31Z) - Multilevel Graph Matching Networks for Deep Graph Similarity Learning [79.3213351477689]
We propose a multi-level graph matching network (MGMN) framework for computing the graph similarity between any pair of graph-structured objects.
To compensate for the lack of standard benchmark datasets, we have created and collected a set of datasets for both the graph-graph classification and graph-graph regression tasks.
Comprehensive experiments demonstrate that MGMN consistently outperforms state-of-the-art baseline models on both the graph-graph classification and graph-graph regression tasks.
arXiv Detail & Related papers (2020-07-08T19:48:19Z) - Graph Pooling with Node Proximity for Hierarchical Representation
Learning [80.62181998314547]
We propose a novel graph pooling strategy that leverages node proximity to improve the hierarchical representation learning of graph data with their multi-hop topology.
Results show that the proposed graph pooling strategy is able to achieve state-of-the-art performance on a collection of public graph classification benchmark datasets.
arXiv Detail & Related papers (2020-06-19T13:09:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.