LION: A Clifford Neural Paradigm for Multimodal-Attributed Graph Learning
- URL: http://arxiv.org/abs/2601.21453v1
- Date: Thu, 29 Jan 2026 09:30:36 GMT
- Title: LION: A Clifford Neural Paradigm for Multimodal-Attributed Graph Learning
- Authors: Xunkai Li, Zhengyu Wu, Zekai Chen, Henan Sun, Daohan Su, Guang Zeng, Hongchao Qin, Rong-Hua Li, Guoren Wang
- Abstract summary: We propose LION to implement alignment-then-fusion in multimodal-attributed graphs. We first construct a modality-aware geometric manifold grounded in Clifford algebra. This geometric-induced high-order graph propagation efficiently achieves modality interaction, facilitating modality alignment.
- Score: 36.90213853456115
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recently, the rapid advancement of multimodal domains has driven a data-centric paradigm shift in graph ML, transitioning from text-attributed to multimodal-attributed graphs. This advancement significantly enhances data representation and expands the scope of graph downstream tasks, such as modality-oriented tasks, thereby improving the practical utility of graph ML. Despite its promise, limitations exist in the current neural paradigms: (1) Neglect Context in Modality Alignment: Most existing methods adopt topology-constrained or modality-specific operators as tokenizers. These aligners inevitably neglect graph context and inhibit modality interaction, resulting in suboptimal alignment. (2) Lack of Adaptation in Modality Fusion: Most existing methods are simple adaptations for 2-modality graphs and fail to adequately exploit aligned tokens equipped with topology priors during fusion, leading to poor generalizability and performance degradation. To address the above issues, we propose LION (cLIffOrd Neural paradigm) based on the Clifford algebra and decoupled graph neural paradigm (i.e., propagation-then-aggregation) to implement alignment-then-fusion in multimodal-attributed graphs. Specifically, we first construct a modality-aware geometric manifold grounded in Clifford algebra. This geometric-induced high-order graph propagation efficiently achieves modality interaction, facilitating modality alignment. Then, based on the geometric grade properties of aligned tokens, we propose adaptive holographic aggregation. This module integrates the energy and scale of geometric grades with learnable parameters to improve modality fusion. Extensive experiments on 9 datasets demonstrate that LION significantly outperforms SOTA baselines across 3 graph and 3 modality downstream tasks.
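The abstract mentions a decoupled graph neural paradigm (propagation-then-aggregation). As a rough illustration of that general pattern only, and not of LION's Clifford-algebra propagation or its adaptive holographic aggregation, the sketch below precomputes multi-hop features with a normalized adjacency matrix and then combines them with per-hop weights. All function names, the toy graph, and the fixed weights are assumptions made for this example; in a trained model the weights would be learned.

```python
from typing import List

Matrix = List[List[float]]


def normalize_adjacency(adj: Matrix) -> Matrix:
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} with self-loops."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / (deg[i] ** 0.5 * deg[j] ** 0.5) for j in range(n)]
            for i in range(n)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain dense matrix product, enough for a toy graph."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def propagate(adj: Matrix, feats: Matrix, hops: int) -> List[Matrix]:
    """Parameter-free step: return [X, A_norm X, ..., A_norm^hops X]."""
    a_norm = normalize_adjacency(adj)
    out = [feats]
    for _ in range(hops):
        out.append(matmul(a_norm, out[-1]))
    return out


def aggregate(hop_feats: List[Matrix], weights: List[float]) -> Matrix:
    """Aggregation step: weighted sum of per-hop features.

    The weights are fixed here; they stand in for learnable parameters.
    """
    n, d = len(hop_feats[0]), len(hop_feats[0][0])
    return [[sum(w * h[i][j] for w, h in zip(weights, hop_feats))
             for j in range(d)]
            for i in range(n)]


# Toy 3-node path graph with a one-dimensional feature per node.
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
feats = [[1.0], [0.0], [0.0]]

hop_feats = propagate(adj, feats, hops=2)
fused = aggregate(hop_feats, weights=[0.5, 0.3, 0.2])
print(fused)
```

Because propagation has no trainable parameters, it can be computed once before training; only the aggregation step would need gradient updates in this kind of design.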
Related papers
- VecFormer: Towards Efficient and Generalizable Graph Transformer with Graph Token Attention [61.96837866507746]
VecFormer is an efficient and highly generalizable model for node classification. VecFormer outperforms the existing Graph Transformer in both performance and speed.
arXiv Detail & Related papers (2026-02-23T09:10:39Z) - Decoupling and Damping: Structurally-Regularized Gradient Matching for Multimodal Graph Condensation [3.2987327415317895]
We propose Structurally-Regularized Gradient Matching (SR-GM), a novel condensation framework tailored for multimodal graphs. SR-GM significantly improves accuracy and accelerates convergence compared to baseline methods. This research provides a scalable methodology for multimodal graph-based learning in resource-constrained environments.
arXiv Detail & Related papers (2025-11-25T11:50:34Z) - GILT: An LLM-Free, Tuning-Free Graph Foundational Model for In-Context Learning [50.40400074353263]
Graph Neural Networks (GNNs) are powerful tools for processing relational data but often struggle to generalize to unseen graphs. We introduce Graph In-context Learning Transformer (GILT), a framework built on an LLM-free and tuning-free architecture.
arXiv Detail & Related papers (2025-10-06T08:09:15Z) - Aggregation-aware MLP: An Unsupervised Approach for Graph Message-passing [10.93155007218297]
"AMLP" is an unsupervised framework that shifts the paradigm from directly crafting aggregation functions to making adaptive aggregation.<n>Our approach consists of two key steps: First, we utilize a graph reconstruction that facilitates high-order grouping effects, and second, we employ a single-layer network to encode varying degrees of heterophily.
arXiv Detail & Related papers (2025-07-27T04:52:55Z) - Scalable Graph Generative Modeling via Substructure Sequences [50.32639806800683]
We introduce Generative Graph Pattern Machine (G$^2$PM), a generative Transformer pre-training framework for graphs. G$^2$PM represents graph instances (nodes, edges, or entire graphs) as sequences of substructures. It employs generative pre-training over the sequences to learn generalizable and transferable representations.
arXiv Detail & Related papers (2025-05-22T02:16:34Z) - Boosting Graph Neural Network Expressivity with Learnable Lanczos Constraints [7.605749412696919]
Graph Neural Networks (GNNs) excel in handling graph-structured data but often underperform in link prediction tasks. We present a novel method to enhance the expressivity of GNNs by embedding induced subgraphs into the graph Laplacian matrix's eigenbasis. We demonstrate the ability to distinguish graphs that are indistinguishable by 2-WL, while maintaining efficient time complexity.
arXiv Detail & Related papers (2024-08-22T12:22:00Z) - A Pure Transformer Pretraining Framework on Text-attributed Graphs [50.833130854272774]
We introduce a feature-centric pretraining perspective by treating graph structure as a prior.
Our framework, Graph Sequence Pretraining with Transformer (GSPT), samples node contexts through random walks.
GSPT can be easily adapted to both node classification and link prediction, demonstrating promising empirical success on various datasets.
arXiv Detail & Related papers (2024-06-19T22:30:08Z) - Enhancing Node Representations for Real-World Complex Networks with Topological Augmentation [35.42514739566419]
TopoAug is a novel graph augmentation method that builds a complex from the original graph by constructing virtual hyperedges directly from raw data.
We provide 23 novel real-world graph datasets across various domains including social media, biology, and e-commerce.
Our empirical study shows that TopoAug consistently and significantly outperforms GNN baselines and other graph augmentation methods.
arXiv Detail & Related papers (2024-02-20T14:18:43Z) - SimTeG: A Frustratingly Simple Approach Improves Textual Graph Learning [131.04781590452308]
We present SimTeG, a frustratingly Simple approach for Textual Graph learning.
We first perform supervised parameter-efficient fine-tuning (PEFT) on a pre-trained LM on the downstream task.
We then generate node embeddings using the last hidden states of finetuned LM.
arXiv Detail & Related papers (2023-08-03T07:00:04Z) - Counterfactual Intervention Feature Transfer for Visible-Infrared Person Re-identification [69.45543438974963]
We find graph-based methods in the visible-infrared person re-identification task (VI-ReID) suffer from bad generalization because of two issues.
The well-trained input features weaken the learning of graph topology, making it not generalized enough during the inference process.
We propose a Counterfactual Intervention Feature Transfer (CIFT) method to tackle these problems.
arXiv Detail & Related papers (2022-08-01T16:15:31Z) - FeatureNorm: L2 Feature Normalization for Dynamic Graph Embedding [39.527059564775094]
Graph convolutional network (GCN) has been widely explored and used in non-Euclidean application domains.
In this paper, we analyze the shrinking properties in the node embedding space at first, and then design a simple yet versatile method.
Experiments on four real-world dynamic graph datasets compared with competitive baseline models demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2021-02-27T09:13:47Z) - Analyzing Unaligned Multimodal Sequence via Graph Convolution and Graph Pooling Fusion [28.077474663199062]
We propose a novel model, termed Multimodal Graph, to investigate the effectiveness of graph neural networks (GNN) on modeling multimodal sequential data.
Our graph-based model reaches state-of-the-art performance on two benchmark datasets.
arXiv Detail & Related papers (2020-11-27T06:12:14Z) - Revisiting Graph based Collaborative Filtering: A Linear Residual Graph Convolutional Network Approach [55.44107800525776]
Graph Convolutional Networks (GCNs) are state-of-the-art graph based representation learning models.
In this paper, we revisit GCN based Collaborative Filtering (CF) based Recommender Systems (RS).
We show that removing non-linearities would enhance recommendation performance, consistent with the theories in simple graph convolutional networks.
We propose a residual network structure that is specifically designed for CF with user-item interaction modeling.
arXiv Detail & Related papers (2020-01-28T04:41:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.