Graph is a Natural Regularization: Revisiting Vector Quantization for Graph Representation Learning
- URL: http://arxiv.org/abs/2508.06588v2
- Date: Fri, 26 Sep 2025 10:31:45 GMT
- Title: Graph is a Natural Regularization: Revisiting Vector Quantization for Graph Representation Learning
- Authors: Zian Zhai, Fan Li, Xingyu Tan, Xiaoyang Wang, Wenjie Zhang
- Abstract summary: We present the first empirical study showing that codebook collapse consistently occurs when applying Vector Quantization to graph data. We propose RGVQ, a novel framework that integrates graph topology and feature similarity as explicit regularization signals to enhance codebook utilization and promote token diversity.
- Score: 12.232364178523822
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Vector Quantization (VQ) has recently emerged as a promising approach for learning discrete representations of graph-structured data. However, a fundamental challenge, i.e., codebook collapse, remains underexplored in the graph domain, significantly limiting the expressiveness and generalization of graph tokens. In this paper, we present the first empirical study showing that codebook collapse consistently occurs when applying VQ to graph data, even with mitigation strategies proposed in vision or language domains. To understand why graph VQ is particularly vulnerable to collapse, we provide a theoretical analysis and identify two key factors: early assignment imbalances caused by redundancy in graph features and structural patterns, and self-reinforcing optimization loops in deterministic VQ. To address these issues, we propose RGVQ, a novel framework that integrates graph topology and feature similarity as explicit regularization signals to enhance codebook utilization and promote token diversity. RGVQ introduces soft assignments via Gumbel-Softmax reparameterization, ensuring that all codewords receive gradient updates. In addition, RGVQ incorporates a structure-aware contrastive regularization to penalize the token co-assignments among dissimilar node pairs. Extensive experiments demonstrate that RGVQ substantially improves codebook utilization and consistently boosts the performance of state-of-the-art graph VQ backbones across multiple downstream tasks, enabling more expressive and transferable graph token representations.
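The two mechanisms named in the abstract, Gumbel-Softmax soft assignment and a penalty on co-assignment of dissimilar node pairs, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the use of negative squared distance as logits, the temperature `tau`, and the hand-picked list of dissimilar pairs are all assumptions made for illustration, and the exact loss used by RGVQ is not given on this page.

```python
import numpy as np

def gumbel_softmax_assign(z, codebook, tau=1.0, rng=None):
    """Soft codeword assignment for embeddings z (n, d) over codebook (K, d).

    Hypothetical helper: logits are negative squared distances (an assumption).
    """
    rng = np.random.default_rng() if rng is None else rng
    # Squared Euclidean distance from every embedding to every codeword: (n, K).
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    logits = -d2
    # Gumbel(0, 1) noise via inverse transform sampling.
    u = rng.uniform(low=1e-10, high=1.0, size=logits.shape)
    g = -np.log(-np.log(u))
    # Temperature-scaled softmax over the perturbed logits.
    y = (logits + g) / tau
    y = y - y.max(axis=1, keepdims=True)  # subtract row max for stability
    p = np.exp(y)
    return p / p.sum(axis=1, keepdims=True)

def coassignment_penalty(p, dissimilar_pairs):
    """Mean of sum_k p[i, k] * p[j, k] over the given (i, j) pairs.

    Illustrative penalty only; how pairs are selected is left to the caller.
    """
    i, j = np.asarray(dissimilar_pairs).T
    return float((p[i] * p[j]).sum(axis=1).mean())

rng = np.random.default_rng(0)
z = rng.normal(size=(6, 4))          # 6 toy node embeddings
codebook = rng.normal(size=(8, 4))   # 8 toy codewords
p = gumbel_softmax_assign(z, codebook, tau=0.5, rng=rng)
quantized = p @ codebook             # soft-quantized embeddings, shape (6, 4)
penalty = coassignment_penalty(p, [(0, 3), (1, 5)])
```

Because each row of `p` spreads probability mass over the whole codebook rather than selecting a single nearest codeword, every codeword contributes to `quantized`, which is the property the abstract relies on when it says all codewords receive gradient updates. The penalty is small when the two nodes of a pair place their mass on different codewords.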
Related papers
- ProGraph-R1: Progress-aware Reinforcement Learning for Graph Retrieval Augmented Generation [37.11787010202267]
We propose ProGraph-R1, a progress-aware agentic framework for graph-based retrieval and multi-step reasoning. ProGraph-R1 introduces a structure-aware hypergraph retrieval mechanism that jointly considers semantic relevance and graph connectivity. Experiments on multi-hop question answering benchmarks demonstrate that ProGraph-R1 consistently improves reasoning accuracy and generation quality over existing GraphRAG methods.
arXiv Detail & Related papers (2026-01-25T08:58:44Z)
- UniGTE: Unified Graph-Text Encoding for Zero-Shot Generalization across Graph Tasks and Domains [12.05107789697386]
We introduce UniGTE, an instruction-tuned encoder-decoder framework that unifies structural and semantic reasoning. UniGTE is instruction-tuned on five datasets spanning node-level, edge-level, and graph-level tasks across diverse domains. It achieves new state-of-the-art zero-shot results on node classification, link prediction, graph classification, and graph regression under cross-task and cross-domain settings.
arXiv Detail & Related papers (2025-10-19T15:36:45Z)
- Youtu-GraphRAG: Vertically Unified Agents for Graph Retrieval-Augmented Complex Reasoning [32.78218766121055]
Graph retrieval-augmented generation (GraphRAG) has effectively enhanced large language models in complex reasoning. We propose a vertically unified agentic paradigm, Youtu-GraphRAG, to jointly connect the entire framework as an intricate integration.
arXiv Detail & Related papers (2025-08-27T13:13:20Z)
- GLANCE: Graph Logic Attention Network with Cluster Enhancement for Heterophilous Graph Representation Learning [54.60090631330295]
Graph Neural Networks (GNNs) have demonstrated significant success in learning from graph-structured data but often struggle on heterophilous graphs. We propose GLANCE, a novel framework that integrates logic-guided reasoning, dynamic graph refinement, and adaptive clustering to enhance graph representation learning.
arXiv Detail & Related papers (2025-07-24T15:45:26Z)
- Align-GRAG: Reasoning-Guided Dual Alignment for Graph Retrieval-Augmented Generation [75.9865035064794]
Large language models (LLMs) have demonstrated remarkable capabilities, but still struggle with issues like hallucinations and outdated information. Retrieval-augmented generation (RAG) addresses these issues by grounding LLM outputs in external knowledge with an Information Retrieval (IR) system. We propose Align-GRAG, a novel reasoning-guided dual alignment framework in the post-retrieval phase.
arXiv Detail & Related papers (2025-05-22T05:15:27Z)
- Hierarchical Vector Quantized Graph Autoencoder with Annealing-Based Code Selection [13.731120424653705]
The Vector Quantized Variational Autoencoder (VQ-VAE) is a powerful autoencoder extensively used in fields such as computer vision. In this paper, we provide an empirical analysis of vector quantization in the context of graph autoencoders. We identify two key challenges associated with vector quantization when applied to graph data: codebook underutilization and codebook space sparsity.
arXiv Detail & Related papers (2025-04-17T07:43:52Z)
- Beyond Message Passing: Neural Graph Pattern Machine [50.78679002846741]
We introduce the Neural Graph Pattern Machine (GPM), a novel framework that bypasses message passing by learning directly from graph substructures. GPM efficiently extracts, encodes, and prioritizes task-relevant graph patterns, offering greater expressivity and improved ability to capture long-range dependencies.
arXiv Detail & Related papers (2025-01-30T20:37:47Z)
- GraphCroc: Cross-Correlation Autoencoder for Graph Structural Reconstruction [6.817416560637197]
Graph autoencoders (GAEs) reconstruct graph structures from node embeddings.
We introduce a cross-correlation mechanism that significantly enhances the GAE representational capabilities.
We also propose GraphCroc, a new GAE that supports flexible encoder architectures tailored for various downstream tasks.
arXiv Detail & Related papers (2024-10-04T12:59:45Z)
- A Pure Transformer Pretraining Framework on Text-attributed Graphs [50.833130854272774]
We introduce a feature-centric pretraining perspective by treating graph structure as a prior.
Our framework, Graph Sequence Pretraining with Transformer (GSPT), samples node contexts through random walks.
GSPT can be easily adapted to both node classification and link prediction, demonstrating promising empirical success on various datasets.
arXiv Detail & Related papers (2024-06-19T22:30:08Z)
- What Improves the Generalization of Graph Transformers? A Theoretical Dive into the Self-attention and Positional Encoding [67.59552859593985]
Graph Transformers, which incorporate self-attention and positional encoding, have emerged as a powerful architecture for various graph learning tasks.
This paper introduces the first theoretical investigation of a shallow Graph Transformer for semi-supervised classification.
arXiv Detail & Related papers (2024-06-04T05:30:16Z)
- Isomorphic-Consistent Variational Graph Auto-Encoders for Multi-Level Graph Representation Learning [9.039193854524763]
We propose the Isomorphic-Consistent VGAE (IsoC-VGAE) for task-agnostic graph representation learning.
We first devise a decoding scheme to provide a theoretical guarantee of keeping the isomorphic consistency.
We then propose the Inverse Graph Neural Network (Inv-GNN) decoder as its intuitive realization.
arXiv Detail & Related papers (2023-12-09T10:16:53Z)
- Let There Be Order: Rethinking Ordering in Autoregressive Graph Generation [6.422073551199993]
Conditional graph generation tasks involve training a model to generate a graph given a set of input conditions.
Many previous studies employ autoregressive models to incrementally generate graph components such as nodes and edges.
As graphs typically lack a natural ordering among their components, converting a graph into a sequence of tokens is not straightforward.
arXiv Detail & Related papers (2023-05-24T20:52:34Z)
- Spectral Augmentations for Graph Contrastive Learning [50.149996923976836]
Contrastive learning has emerged as a premier method for learning representations with or without supervision.
Recent studies have shown its utility in graph representation learning for pre-training.
We propose a set of well-motivated graph transformation operations to provide a bank of candidates when constructing augmentations for a graph contrastive objective.
arXiv Detail & Related papers (2023-02-06T16:26:29Z)
- Let Invariant Rationale Discovery Inspire Graph Contrastive Learning [98.10268114789775]
We argue that a high-performing augmentation should preserve the salient semantics of anchor graphs regarding instance-discrimination.
We propose a new framework, Rationale-aware Graph Contrastive Learning (RGCL).
RGCL uses a rationale generator to reveal salient features about graph instance-discrimination as the rationale, and then creates rationale-aware views for contrastive learning.
arXiv Detail & Related papers (2022-06-16T01:28:40Z)
- Learning Graph Structure from Convolutional Mixtures [119.45320143101381]
We propose a graph convolutional relationship between the observed and latent graphs, and formulate the graph learning task as a network inverse (deconvolution) problem.
In lieu of eigendecomposition-based spectral methods, we unroll and truncate proximal gradient iterations to arrive at a parameterized neural network architecture that we call a Graph Deconvolution Network (GDN).
GDNs can learn a distribution of graphs in a supervised fashion, perform link prediction or edge-weight regression tasks by adapting the loss function, and they are inherently inductive.
arXiv Detail & Related papers (2022-05-19T14:08:15Z)
- Self-supervised Consensus Representation Learning for Attributed Graph [15.729417511103602]
We introduce a self-supervised learning mechanism into graph representation learning.
We propose a novel Self-supervised Consensus Representation Learning (SCRL) framework.
Our proposed SCRL method treats a graph from two perspectives: the topology graph and the feature graph.
arXiv Detail & Related papers (2021-08-10T07:53:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.