GraphHash: Graph Clustering Enables Parameter Efficiency in Recommender Systems
- URL: http://arxiv.org/abs/2412.17245v2
- Date: Sat, 08 Feb 2025 05:54:48 GMT
- Title: GraphHash: Graph Clustering Enables Parameter Efficiency in Recommender Systems
- Authors: Xinyi Wu, Donald Loveland, Runjin Chen, Yozen Liu, Xin Chen, Leonardo Neves, Ali Jadbabaie, Clark Mingxuan Ju, Neil Shah, Tong Zhao,
- Abstract summary: This paper introduces GraphHash, the first graph-based approach that leverages modularity-based bipartite graph clustering to reduce embedding table sizes.
By employing fast clustering algorithms, GraphHash serves as a computationally efficient proxy for message-passing during preprocessing.
In experiments, GraphHash substantially outperforms diverse hashing baselines on both retrieval and click-through-rate prediction tasks.
- Score: 51.64666652517944
- License:
- Abstract: Deep recommender systems rely heavily on large embedding tables to handle high-cardinality categorical features such as user/item identifiers, and face significant memory constraints at scale. To tackle this challenge, hashing techniques are often employed to map multiple entities to the same embedding and thus reduce the size of the embedding tables. Concurrently, graph-based collaborative signals have emerged as powerful tools in recommender systems, yet their potential for optimizing embedding table reduction remains unexplored. This paper introduces GraphHash, the first graph-based approach that leverages modularity-based bipartite graph clustering on user-item interaction graphs to reduce embedding table sizes. We demonstrate that the modularity objective has a theoretical connection to message-passing, which provides a foundation for our method. By employing fast clustering algorithms, GraphHash serves as a computationally efficient proxy for message-passing during preprocessing and a plug-and-play graph-based alternative to traditional ID hashing. Extensive experiments show that GraphHash substantially outperforms diverse hashing baselines on both retrieval and click-through-rate prediction tasks. In particular, GraphHash achieves on average a 101.52% improvement in recall when reducing the embedding table size by more than 75%, highlighting the value of graph-based collaborative information for model reduction. Our code is available at https://github.com/snap-research/GraphHash.
Related papers
- An Automatic Graph Construction Framework based on Large Language Models for Recommendation [49.51799417575638]
We introduce AutoGraph, an automatic graph construction framework based on large language models for recommendation.
LLMs infer the user preference and item knowledge, which is encoded as semantic vectors.
Latent factors are incorporated as extra nodes to link the user/item nodes, resulting in a graph with in-depth global-view semantics.
arXiv Detail & Related papers (2024-12-24T07:51:29Z) - Careful Selection and Thoughtful Discarding: Graph Explicit Pooling
Utilizing Discarded Nodes [53.08068729187698]
We introduce a novel Graph Explicit Pooling (GrePool) method, which selects nodes by explicitly leveraging the relationships between the nodes and final representation vectors crucial for classification.
We conduct comprehensive experiments across 12 widely used datasets to validate our proposed method's effectiveness.
arXiv Detail & Related papers (2023-11-21T14:44:51Z) - Learning Optimal Graph Filters for Clustering of Attributed Graphs [20.810096547938166]
Many real-world systems can be represented as graphs where the different entities in the system are presented by nodes and their interactions by edges.
An important task in studying large datasets with graphical structure is graph clustering.
We introduce a graph signal processing based approach, where we learn the parameters of Finite Impulse Response (FIR) and Autoregressive Moving Average (ARMA) graph filters optimized for clustering.
arXiv Detail & Related papers (2022-11-09T01:49:23Z) - Dynamic Graph Message Passing Networks for Visual Recognition [112.49513303433606]
Modelling long-range dependencies is critical for scene understanding tasks in computer vision.
A fully-connected graph is beneficial for such modelling, but its computational overhead is prohibitive.
We propose a dynamic graph message passing network, that significantly reduces the computational complexity.
arXiv Detail & Related papers (2022-09-20T14:41:37Z) - Graph Pooling with Maximum-Weight $k$-Independent Sets [12.251091325930837]
We introduce a graph coarsening mechanism based on the graph-theoretic concept of maximum-weight $k$-independent sets.
We prove theoretical guarantees for distortion bounds on path lengths, as well as the ability to preserve key topological properties in the coarsened graphs.
arXiv Detail & Related papers (2022-08-06T14:12:47Z) - DOTIN: Dropping Task-Irrelevant Nodes for GNNs [119.17997089267124]
Recent graph learning approaches have introduced the pooling strategy to reduce the size of graphs for learning.
We design a new approach called DOTIN (underlineDrunderlineopping underlineTask-underlineIrrelevant underlineNodes) to reduce the size of graphs.
Our method speeds up GAT by about 50% on graph-level tasks including graph classification and graph edit distance.
arXiv Detail & Related papers (2022-04-28T12:00:39Z) - Edge but not Least: Cross-View Graph Pooling [76.71497833616024]
This paper presents a cross-view graph pooling (Co-Pooling) method to better exploit crucial graph structure information.
Through cross-view interaction, edge-view pooling and node-view pooling seamlessly reinforce each other to learn more informative graph-level representations.
arXiv Detail & Related papers (2021-09-24T08:01:23Z) - Accurate Learning of Graph Representations with Graph Multiset Pooling [45.72542969364438]
We propose a Graph Multiset Transformer (GMT) that captures the interaction between nodes according to their structural dependencies.
Our experimental results show that GMT significantly outperforms state-of-the-art graph pooling methods on graph classification benchmarks.
arXiv Detail & Related papers (2021-02-23T07:45:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.