BehaveGPT: A Foundation Model for Large-scale User Behavior Modeling
- URL: http://arxiv.org/abs/2505.17631v1
- Date: Fri, 23 May 2025 08:43:46 GMT
- Title: BehaveGPT: A Foundation Model for Large-scale User Behavior Modeling
- Authors: Jiahui Gong, Jingtao Ding, Fanjin Meng, Chen Yang, Hong Chen, Zuojian Wang, Haisheng Lu, Yong Li
- Abstract summary: We propose BehaveGPT, a foundational model designed specifically for large-scale user behavior prediction. BehaveGPT is trained on vast user behavior datasets, allowing it to learn complex behavior patterns. Our approach introduces the DRO-based pretraining paradigm tailored for user behavior data, which improves model generalization and transferability.
- Score: 14.342911841456663
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, foundational models have revolutionized the fields of language and vision, demonstrating remarkable abilities in understanding and generating complex data; however, similar advances in user behavior modeling have been limited, largely due to the complexity of behavioral data and the challenges involved in capturing intricate temporal and contextual relationships in user activities. To address this, we propose BehaveGPT, a foundational model designed specifically for large-scale user behavior prediction. Leveraging transformer-based architecture and a novel pretraining paradigm, BehaveGPT is trained on vast user behavior datasets, allowing it to learn complex behavior patterns and support a range of downstream tasks, including next behavior prediction, long-term generation, and cross-domain adaptation. Our approach introduces the DRO-based pretraining paradigm tailored for user behavior data, which improves model generalization and transferability by equitably modeling both head and tail behaviors. Extensive experiments on real-world datasets demonstrate that BehaveGPT outperforms state-of-the-art baselines, achieving more than a 10% improvement in macro and weighted recall, showcasing its ability to effectively capture and predict user behavior. Furthermore, we measure the scaling law in the user behavior domain for the first time on the Honor dataset, providing insights into how model performance scales with increased data and parameter sizes.
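The abstract reports gains in macro and weighted recall. As a reminder of how the two averages differ on long-tailed behavior labels, here is a minimal sketch in plain Python. The function names and the toy labels ("browse", "pay") are illustrative assumptions, not taken from the paper or its code.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label that occurs in y_true."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / support[label] for label in support}


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall: head and tail classes count equally."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def weighted_recall(y_true, y_pred):
    """Mean of per-class recall weighted by how often each class occurs."""
    support = Counter(y_true)
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls[label] * support[label] for label in support) / len(y_true)


# Hypothetical toy data: one frequent ("head") and one rare ("tail") behavior.
y_true = ["browse"] * 8 + ["pay"] * 2
y_pred = ["browse"] * 8 + ["browse", "pay"]

print(macro_recall(y_true, y_pred))     # per-class recalls are 1.0 and 0.5
print(weighted_recall(y_true, y_pred))  # dominated by the head class
```

In this toy case the missed tail prediction costs macro recall far more than weighted recall, which is why the two metrics are usually reported together when head and tail behaviors both matter.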
Related papers
- Large language model as user daily behavior data generator: balancing population diversity and individual personality [12.464365435176099]
We introduce BehaviorGen, a framework that uses large language models to generate high-quality synthetic behavior data. By simulating user behavior based on profiles and real events, BehaviorGen supports data augmentation and replacement in behavior prediction models. We evaluate its performance in scenarios such as augmentation, fine-tuning replacement, and fine-tuning augmentation, achieving significant improvements in human mobility and smartphone usage predictions.
arXiv Detail & Related papers (2025-05-23T08:22:09Z)
- Exploring Training and Inference Scaling Laws in Generative Retrieval [50.82554729023865]
Generative retrieval reformulates retrieval as an autoregressive generation task, where large language models generate target documents directly from a query. We systematically investigate training and inference scaling laws in generative retrieval, exploring how model size, training data scale, and inference-time compute jointly influence performance.
arXiv Detail & Related papers (2025-03-24T17:59:03Z)
- Optimizing Sequential Recommendation Models with Scaling Laws and Approximate Entropy [104.48511402784763]
Performance Law for SR models aims to theoretically investigate and model the relationship between model performance and data quality. We propose Approximate Entropy (ApEn) to assess data quality, presenting a more nuanced approach compared to traditional data quantity metrics.
arXiv Detail & Related papers (2024-11-30T10:56:30Z)
- Incorporating Group Prior into Variational Inference for Tail-User Behavior Modeling in CTR Prediction [8.213386595519928]
We propose a novel variational inference approach, namely Group Prior Sampler Variational Inference (GPSVI).
GPSVI introduces group preferences as priors to refine latent user interests for tail users.
Rigorous analysis and extensive experiments demonstrate that GPSVI consistently improves the performance of tail users.
arXiv Detail & Related papers (2024-10-19T13:15:36Z)
- Explanatory Model Monitoring to Understand the Effects of Feature Shifts on Performance [61.06245197347139]
We propose a novel approach to explain the behavior of a black-box model under feature shifts.
We refer to our method that combines concepts from Optimal Transport and Shapley Values as Explanatory Performance Estimation.
arXiv Detail & Related papers (2024-08-24T18:28:19Z)
- USE: Dynamic User Modeling with Stateful Sequence Models [26.74966828348815]
User Stateful Embedding (USE) generates user embeddings without the need for exhaustive reprocessing.
We introduce a novel training objective named future W-behavior prediction to transcend the limitations of next-token prediction.
We conduct experiments on 8 downstream tasks using Snapchat users' behavioral logs in both static (i.e., fixed user behavior sequences) and dynamic (i.e., periodically updated user behavior sequences) settings.
arXiv Detail & Related papers (2024-03-20T07:05:19Z)
- Cumulative Distribution Function based General Temporal Point Processes [49.758080415846884]
The CuFun model represents a novel approach to TPPs that revolves around the Cumulative Distribution Function (CDF).
Our approach addresses several critical issues inherent in traditional TPP modeling.
Our contributions encompass the introduction of a pioneering CDF-based TPP model and the development of a methodology for incorporating past event information into future event prediction.
arXiv Detail & Related papers (2024-02-01T07:21:30Z)
- Incorporating Heterogeneous User Behaviors and Social Influences for Predictive Analysis [32.31161268928372]
We aim to incorporate heterogeneous user behaviors and social influences for behavior predictions.
This paper proposes a variant of Long Short-Term Memory (LSTM) that can take context into account while modeling a behavior sequence.
A residual learning-based decoder is designed to automatically construct multiple high-order cross features based on social behavior representation.
arXiv Detail & Related papers (2022-07-24T17:05:37Z)
- Preference Enhanced Social Influence Modeling for Network-Aware Cascade Prediction [59.221668173521884]
We propose a novel framework to promote cascade size prediction by enhancing the user preference modeling.
Our end-to-end method makes the user activating process of information diffusion more adaptive and accurate.
arXiv Detail & Related papers (2022-04-18T09:25:06Z)
- STAR: Sparse Transformer-based Action Recognition [61.490243467748314]
This work proposes a novel skeleton-based human action recognition model with sparse attention on the spatial dimension and segmented linear attention on the temporal dimension of data.
Experiments show that our model achieves comparable performance while using far fewer trainable parameters, and trains and runs inference at high speed.
arXiv Detail & Related papers (2021-07-15T02:53:11Z)
- Learning Transferrable Parameters for Long-tailed Sequential User Behavior Modeling [70.64257515361972]
We argue that focusing on tail users could bring more benefits and address the long-tail issue.
Specifically, we propose a gradient alignment and adopt an adversarial training scheme to facilitate knowledge transfer from the head to the tail.
arXiv Detail & Related papers (2020-10-22T03:12:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.