GiGL: Large-Scale Graph Neural Networks at Snapchat
- URL: http://arxiv.org/abs/2502.15054v1
- Date: Thu, 20 Feb 2025 21:29:17 GMT
- Title: GiGL: Large-Scale Graph Neural Networks at Snapchat
- Authors: Tong Zhao, Yozen Liu, Matthew Kolodner, Kyle Montemayor, Elham Ghazizadeh, Ankit Batra, Zihao Fan, Xiaobin Gao, Xuan Guo, Jiwen Ren, Serim Park, Peicheng Yu, Jun Yu, Shubham Vij, Neil Shah
- Abstract summary: We present GiGL (Gigantic Graph Learning), an open-source library to enable large-scale distributed graph ML. We use GiGL internally at Snapchat to manage the heavy lifting of GNN workflows, including graph data preprocessing from relational DBs. GiGL is used in multiple production settings, and has powered over 35 launches across multiple business domains in the last 2 years.
- Score: 32.1186726452899
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in graph machine learning (ML) with the introduction of Graph Neural Networks (GNNs) have led to widespread interest in applying these approaches to business applications at scale. GNNs enable differentiable end-to-end (E2E) learning of model parameters given graph structure, which allows optimization toward popular node-, edge- (link-), and graph-level tasks. While research innovation in new GNN layers and training strategies has been rapid, industrial adoption and utility of GNNs have lagged considerably due to the unique scale challenges that large-scale graph ML problems create. In this work, we share our approach to training, inference, and utilization of GNNs at Snapchat. To this end, we present GiGL (Gigantic Graph Learning), an open-source library that enables large-scale distributed graph ML for the benefit of researchers, ML engineers, and practitioners. We use GiGL internally at Snapchat to manage the heavy lifting of GNN workflows, including graph data preprocessing from relational DBs, subgraph sampling, distributed training, inference, and orchestration. GiGL is designed to interface cleanly with open-source GNN modeling libraries prominent in academia, like PyTorch Geometric (PyG), while handling scaling and productionization challenges, making it easier for internal practitioners to focus on modeling. GiGL is used in multiple production settings and has powered over 35 launches across multiple business domains in the last 2 years, in the contexts of friend recommendation, content recommendation, and advertising. This work details the high-level design and tools the library provides, its scaling properties, case studies in diverse business settings with industry-scale graphs, and several key lessons learned in employing graph ML at scale on large social data. GiGL is open-sourced at https://github.com/snap-research/GiGL.
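To make the workflow concrete, here is a minimal sketch of the kind of subgraph-sampling training loop that GiGL automates at scale, written against plain PyTorch Geometric. The dataset, fan-outs, and model below are illustrative assumptions, not GiGL's actual API; GiGL's value lies in the distributed preprocessing, sampling, and orchestration wrapped around a loop like this.

```python
# A minimal sketch of the kind of subgraph-sampling training loop that
# GiGL automates, written with plain PyTorch Geometric. Dataset, fan-outs,
# and model are illustrative placeholders, not GiGL's actual API.
import torch
import torch.nn.functional as F
from torch_geometric.datasets import Reddit
from torch_geometric.loader import NeighborLoader
from torch_geometric.nn import GraphSAGE

dataset = Reddit(root="/tmp/reddit")  # stand-in for a preprocessed production graph
data = dataset[0]

# Subgraph sampling: 10 neighbors at each of 2 hops, so every minibatch
# is a small neighborhood rather than the full graph.
loader = NeighborLoader(
    data, num_neighbors=[10, 10], batch_size=1024, input_nodes=data.train_mask
)

model = GraphSAGE(
    dataset.num_features, hidden_channels=128, num_layers=2,
    out_channels=dataset.num_classes,
)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

model.train()
for batch in loader:
    optimizer.zero_grad()
    out = model(batch.x, batch.edge_index)
    # Only the seed nodes at the front of each batch carry supervision.
    loss = F.cross_entropy(out[:batch.batch_size], batch.y[:batch.batch_size])
    loss.backward()
    optimizer.step()
```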
Related papers
- Large Language Models Meet Graph Neural Networks: A Perspective of Graph Mining [2.8843412675137343]
We present a novel taxonomy for research in this interdisciplinary field, which involves three main categories: GNN-driving-LLM, LLM-driving-GNN, and GNN-LLM-co-driving. Although Large Language Models have demonstrated great potential in handling graph-structured data, their high computational requirements and complexity remain challenges.
arXiv Detail & Related papers (2024-12-26T13:21:09Z)
- All Against Some: Efficient Integration of Large Language Models for Message Passing in Graph Neural Networks [51.19110891434727]
Large Language Models (LLMs), with pretrained knowledge and powerful semantic comprehension abilities, have recently shown a remarkable ability to benefit applications using vision and text data.
E-LLaGNN is a framework with an on-demand LLM service that enriches the message-passing procedure of graph learning by enhancing a limited fraction of nodes from the graph; a rough sketch follows this entry.
arXiv Detail & Related papers (2024-07-20T22:09:42Z)
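E-LLaGNN's central idea, as summarized above, is budget-limited enrichment: only a small fraction of nodes pay the LLM cost before standard message passing runs. Below is a rough sketch of that selection-and-enrichment step; `llm_embed` and the degree heuristic are hypothetical stand-ins, not the paper's actual implementation.

```python
# A rough sketch of budget-limited LLM enrichment in the spirit of
# E-LLaGNN, not the paper's code. `llm_embed` is a hypothetical stand-in
# for an on-demand LLM embedding service.
import torch

def llm_embed(texts):
    # Placeholder: a real system would call an LLM service here and
    # return one embedding per input text.
    return torch.randn(len(texts), 128)

def enrich_high_degree_nodes(x, edge_index, node_texts, budget=0.05):
    """Replace the features of the top-`budget` fraction of nodes (by
    degree, a simple heuristic) with LLM-derived text embeddings; all
    other nodes keep their cheap original features. Assumes the LLM
    embedding width matches x.size(1)."""
    num_nodes = x.size(0)
    degree = torch.bincount(edge_index[0], minlength=num_nodes)
    k = max(1, int(budget * num_nodes))
    chosen = degree.topk(k).indices
    x = x.clone()
    x[chosen] = llm_embed([node_texts[i] for i in chosen.tolist()])
    # Message passing (e.g., any PyG convolution) then proceeds as usual.
    return x
```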
- GraphStorm: all-in-one graph machine learning framework for industry applications [75.23076561638348]
GraphStorm is an end-to-end solution for scalable graph construction, graph model training and inference.
Every component in GraphStorm can operate on graphs with billions of nodes and can scale model training and inference to different hardware without changing any code.
GraphStorm has been used and deployed for over a dozen billion-scale industry applications after its release in May 2023.
arXiv Detail & Related papers (2024-06-10T04:56:16Z)
- GLISP: A Scalable GNN Learning System by Exploiting Inherent Structural Properties of Graphs [5.410321469222541]
We propose GLISP, a sampling-based GNN learning system for industrial-scale graphs.
GLISP consists of three core components: graph partitioner, graph sampling service and graph inference engine.
Experiments show that GLISP achieves up to $6.53\times$ and $70.77\times$ speedups over existing GNN systems for training and inference tasks, respectively.
arXiv Detail & Related papers (2024-01-06T02:59:24Z)
- LasTGL: An Industrial Framework for Large-Scale Temporal Graph Learning [61.4707298969173]
We introduce LasTGL, an industrial framework that integrates unified and extensible implementations of common temporal graph learning algorithms.
LasTGL provides comprehensive temporal graph datasets, TGNN models, and utilities, along with well-documented tutorials.
arXiv Detail & Related papers (2023-11-28T08:45:37Z)
- SimTeG: A Frustratingly Simple Approach Improves Textual Graph Learning [131.04781590452308]
We present SimTeG, a frustratingly Simple approach for Textual Graph learning.
We first perform supervised parameter-efficient fine-tuning (PEFT) of a pre-trained LM on the downstream task.
We then generate node embeddings from the last hidden states of the fine-tuned LM; a condensed sketch follows this entry.
arXiv Detail & Related papers (2023-08-03T07:00:04Z)
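SimTeG's recipe is two decoupled stages: PEFT-tune a language model on the downstream labels, then read node embeddings off its last hidden states and hand them to any GNN. Below is a condensed sketch with Hugging Face transformers and peft; the base model, LoRA targets, and mean pooling are assumptions for illustration, not necessarily the paper's exact configuration.

```python
# A condensed sketch of the SimTeG recipe using Hugging Face transformers
# and peft. The base model, LoRA targets, and mean pooling are assumptions
# for illustration, not necessarily the paper's exact configuration.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModel, AutoTokenizer

name = "sentence-transformers/all-MiniLM-L6-v2"
tokenizer = AutoTokenizer.from_pretrained(name)
backbone = AutoModel.from_pretrained(name)

# Stage 1: parameter-efficient fine-tuning (here LoRA) on the downstream
# task, e.g. node classification over the node texts. Training loop omitted.
model = get_peft_model(backbone, LoraConfig(r=8, lora_alpha=16,
                                            target_modules=["query", "value"]))
# ... supervised fine-tuning on (node_text, node_label) pairs goes here ...

# Stage 2: node embeddings from the fine-tuned LM's last hidden states,
# mean-pooled over real tokens; feed the result to any downstream GNN.
@torch.no_grad()
def node_embeddings(texts):
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = model(**enc).last_hidden_state          # [batch, seq_len, dim]
    mask = enc["attention_mask"].unsqueeze(-1)       # [batch, seq_len, 1]
    return (hidden * mask).sum(1) / mask.sum(1)
```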
- Graph Ladling: Shockingly Simple Parallel GNN Training without Intermediate Communication [100.51884192970499]
GNNs are a powerful family of neural networks for learning over graphs.
Scaling GNNs by deepening or widening suffers from prevalent issues of unhealthy gradients, over-smoothing, and information squashing.
We propose not to deepen or widen current GNNs, but instead present a data-centric perspective of model soups tailored for GNNs; a minimal weight-averaging sketch follows this entry.
arXiv Detail & Related papers (2023-06-18T03:33:46Z)
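The model-soup idea above amounts to weight averaging: train several GNNs of identical architecture independently, with no communication between runs, then average their parameters into one model. A minimal PyTorch sketch, assuming all candidates share the same architecture:

```python
# A minimal PyTorch sketch of a GNN "model soup" in the spirit of Graph
# Ladling: candidates are trained independently (no communication), then
# their weights are averaged into a single model of the same architecture.
import copy
import torch

def make_soup(models):
    """Average the state dicts of identically-architected models."""
    soup = copy.deepcopy(models[0])
    state = soup.state_dict()
    for key in state:
        state[key] = torch.stack(
            [m.state_dict()[key].float() for m in models]
        ).mean(dim=0)
    soup.load_state_dict(state)  # load_state_dict casts back to buffer dtypes
    return soup

# Usage: train candidates in parallel with different seeds or data shards,
# then serve the averaged model, e.g.:
#   soup = make_soup([train_gnn(seed=s) for s in range(4)])
```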
- CogDL: A Comprehensive Library for Graph Deep Learning [55.694091294633054]
We present CogDL, a library for graph deep learning that allows researchers and practitioners to conduct experiments, compare methods, and build applications with ease and efficiency.
In CogDL, we propose a unified design for the training and evaluation of GNN models for various graph tasks, making it unique among existing graph learning libraries.
We develop efficient sparse operators for CogDL, making it one of the most efficient graph learning libraries available.
arXiv Detail & Related papers (2021-03-01T12:35:16Z)
- Analyzing the Performance of Graph Neural Networks with Pipe Parallelism [2.269587850533721]
We focus on Graph Neural Networks (GNNs) that have found great success in tasks such as node or edge classification and link prediction.
New approaches for processing larger networks are needed to advance graph techniques.
We study how GNNs can be parallelized using existing tools and frameworks that are known to be successful in the deep learning community; a schematic pipeline sketch follows this entry.
arXiv Detail & Related papers (2020-12-20T04:20:38Z)
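Pipe parallelism splits a model's layers into stages on different devices and streams micro-batches through them so the stages can work concurrently. The schematic sketch below uses generic feed-forward stages for clarity; micro-batching real GNN layers is harder because neighborhood structure couples examples together.

```python
# A schematic sketch of pipeline parallelism: the model is split into two
# stages on different devices, and micro-batches stream through them.
# Because each device executes on its own stream, stage 0 can start the
# next micro-batch while stage 1 finishes the current one. Real pipelines
# also need careful scheduling for the backward pass.
import torch
import torch.nn as nn

multi_gpu = torch.cuda.device_count() > 1
dev0 = torch.device("cuda:0" if multi_gpu else "cpu")
dev1 = torch.device("cuda:1" if multi_gpu else "cpu")

# Generic feed-forward stages stand in for the halves of a layer-split GNN.
stage0 = nn.Sequential(nn.Linear(64, 128), nn.ReLU()).to(dev0)
stage1 = nn.Sequential(nn.Linear(128, 16)).to(dev1)

def pipelined_forward(x, num_microbatches=4):
    outputs = []
    for mb in x.chunk(num_microbatches):
        h = stage0(mb.to(dev0))             # enqueued on device 0
        outputs.append(stage1(h.to(dev1)))  # enqueued on device 1
    return torch.cat(outputs)

out = pipelined_forward(torch.randn(1024, 64))
print(out.shape)  # torch.Size([1024, 16])
```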