Beyond Leakage and Complexity: Towards Realistic and Efficient Information Cascade Prediction
- URL: http://arxiv.org/abs/2510.25348v1
- Date: Wed, 29 Oct 2025 10:06:08 GMT
- Title: Beyond Leakage and Complexity: Towards Realistic and Efficient Information Cascade Prediction
- Authors: Jie Peng, Rui Wang, Qiang Wang, Zhewei Wei, Bin Tong, Guan Wang,
- Abstract summary: Information cascade popularity prediction is a key problem in analyzing content diffusion in social networks. We propose a time-ordered splitting strategy that chronologically partitions data into consecutive windows. Second, we introduce Taoke, a large-scale e-commerce cascade dataset featuring rich promoter/product attributes. Third, we develop CasTemp, a lightweight framework that efficiently models cascade dynamics through temporal walks.
- Score: 37.50536404287287
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Information cascade popularity prediction is a key problem in analyzing content diffusion in social networks. However, existing work suffers from three critical limitations: (1) temporal leakage in current evaluation, where random cascade-based splits allow models to access future information and yield unrealistic results; (2) feature-poor datasets that lack downstream conversion signals (e.g., likes, comments, or purchases), which limits practical applications; and (3) computational inefficiency of complex graph-based methods that require days of training for marginal gains. We systematically address these challenges from three perspectives: task setup, dataset construction, and model design. First, we propose a time-ordered splitting strategy that chronologically partitions data into consecutive windows, ensuring models are evaluated on genuine forecasting tasks without future information leakage. Second, we introduce Taoke, a large-scale e-commerce cascade dataset featuring rich promoter/product attributes and ground-truth purchase conversions, capturing the complete diffusion lifecycle from promotion to monetization. Third, we develop CasTemp, a lightweight framework that efficiently models cascade dynamics through temporal walks, Jaccard-based neighbor selection for inter-cascade dependencies, and GRU-based encoding with time-aware attention. Under leak-free evaluation, CasTemp achieves state-of-the-art performance across four datasets with orders-of-magnitude speedup. Notably, it excels at predicting second-stage popularity conversions, a practical task critical for real-world applications.
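The two ideas in the abstract that are easiest to show in code are the time-ordered split and the Jaccard-based neighbor selection. The Python sketch below is illustrative only and is not the authors' implementation. The `Cascade` fields, the split fractions, the use of participant sets as the Jaccard input, and the function names are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence, Tuple


@dataclass(frozen=True)
class Cascade:
    # All fields are hypothetical; the paper's data schema is not given here.
    cascade_id: str
    publish_time: float          # when the cascade started
    users: FrozenSet[str]        # participants seen in the observation window


def time_ordered_split(
    cascades: Sequence[Cascade], train_frac: float = 0.7, val_frac: float = 0.15
) -> Tuple[List[Cascade], List[Cascade], List[Cascade]]:
    """Sort cascades by start time and cut them into consecutive windows.

    Every training cascade starts no later than every validation cascade,
    which in turn starts no later than every test cascade, so a model never
    trains on cascades that begin after the ones it is evaluated on.
    """
    ordered = sorted(cascades, key=lambda c: c.publish_time)
    n = len(ordered)
    train_end = int(n * train_frac)
    val_end = train_end + int(n * val_frac)
    return ordered[:train_end], ordered[train_end:val_end], ordered[val_end:]


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard similarity |a & b| / |a | b|, defined as 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def top_k_neighbors(
    target: Cascade, history: Sequence[Cascade], k: int = 2
) -> List[Cascade]:
    """Pick the k earlier cascades whose participant sets overlap most.

    Only cascades that started strictly before the target are candidates,
    which keeps the neighbor lookup itself free of future information.
    """
    earlier = [c for c in history if c.publish_time < target.publish_time]
    ranked = sorted(
        earlier, key=lambda c: jaccard(target.users, c.users), reverse=True
    )
    return ranked[:k]
```

Restricting neighbor candidates to earlier cascades is a design choice of this sketch: it applies the same leak-free principle to inter-cascade lookups that the split applies to evaluation.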
Related papers
- A Retrieval Augmented Spatio-Temporal Framework for Traffic Prediction [33.28893562327803]
RAST achieves superior performance while maintaining efficiency in large-scale datasets.
Our framework's key designs include: 1) Decoupled and Query Retriever to capture decoupled temporal features and construct residual fusion via Retrieval-Augmented Generation (RAG); and 2) Universal Backbone Predictor that accommodates pre-trained ST-GNNs or simple predictors.
arXiv Detail & Related papers (2025-08-14T10:11:39Z)
- Mitigating Trade-off: Stream and Query-guided Aggregation for Efficient and Effective 3D Occupancy Prediction [12.064509280163502]
3D occupancy prediction has emerged as a key perception task for autonomous driving.
Recent studies focus on integrating information obtained from past observations to improve prediction accuracy.
We propose StreamOcc, a framework that aggregates past-temporal information in a stream-based manner.
Experiments on the Occ3D-nus dataset show that StreamOcc achieves state-of-the-art performance in real-time settings, while reducing memory usage by more than 50% compared to previous methods.
arXiv Detail & Related papers (2025-03-28T02:05:53Z)
- SuperFlow++: Enhanced Spatiotemporal Consistency for Cross-Modal Data Pretraining [62.433137130087445]
SuperFlow++ is a novel framework that integrates pretraining and downstream tasks using consecutive camera pairs.
We show that SuperFlow++ outperforms state-of-the-art methods across diverse tasks and driving conditions.
With strong generalizability and computational efficiency, SuperFlow++ establishes a new benchmark for data-efficient LiDAR-based perception in autonomous driving.
arXiv Detail & Related papers (2025-03-25T17:59:57Z)
- On Your Mark, Get Set, Predict! Modeling Continuous-Time Dynamics of Cascades for Information Popularity Prediction [5.464598715181046]
The key to accurately predicting information popularity lies in subtly modeling the underlying temporal information diffusion process.
We propose ConCat, modeling the Continuous-time dynamics of Cascades for information popularity prediction.
We conduct extensive experiments to evaluate ConCat on three real-world datasets.
arXiv Detail & Related papers (2024-09-25T05:08:44Z)
- OPUS: Occupancy Prediction Using a Sparse Set [64.60854562502523]
We present a framework to simultaneously predict occupied locations and classes using a set of learnable queries.
OPUS incorporates a suite of non-trivial strategies to enhance model performance.
Our lightest model achieves superior RayIoU on the Occ3D-nuScenes dataset at near 2x FPS, while our heaviest model surpasses previous best results by 6.1 RayIoU.
arXiv Detail & Related papers (2024-09-14T07:44:22Z)
- TimeGraphs: Graph-based Temporal Reasoning [64.18083371645956]
TimeGraphs is a novel approach that characterizes dynamic interactions as a hierarchical temporal graph.
Our approach models the interactions using a compact graph-based representation, enabling adaptive reasoning across diverse time scales.
We evaluate TimeGraphs on multiple datasets with complex, dynamic agent interactions, including a football simulator, the Resistance game, and the MOMA human activity dataset.
arXiv Detail & Related papers (2024-01-06T06:26:49Z)
- Spatio-Temporal Contrastive Self-Supervised Learning for POI-level Crowd Flow Inference [23.8192952068949]
We present a novel Contrastive Self-learning framework for Spatio-Temporal data (CSST).
Our approach initiates with the construction of a spatial adjacency graph founded on the Points of Interest (POIs) and their respective distances.
We adopt a swapped prediction approach to anticipate the representation of the target subgraph from similar instances.
Our experiments, conducted on two real-world datasets, demonstrate that the CSST pre-trained on extensive noisy data consistently outperforms models trained from scratch.
arXiv Detail & Related papers (2023-09-06T02:51:24Z)
- FormerTime: Hierarchical Multi-Scale Representations for Multivariate Time Series Classification [53.55504611255664]
FormerTime is a hierarchical representation model that improves classification capacity for the multivariate time series classification task.
It offers three merits: (1) learning hierarchical multi-scale representations from time series data, (2) inheriting the strengths of both transformers and convolutional networks, and (3) tackling the efficiency challenges incurred by the self-attention mechanism.
arXiv Detail & Related papers (2023-02-20T07:46:14Z)
- Incorporating Reachability Knowledge into a Multi-Spatial Graph Convolution Based Seq2Seq Model for Traffic Forecasting [12.626657411944949]
Existing works do not perform well on multi-step traffic prediction that involves a long future time period.
Our model is evaluated on two real-world traffic datasets and achieves better performance than other competitors.
arXiv Detail & Related papers (2021-07-04T03:23:30Z)
- Predicting Temporal Sets with Deep Neural Networks [50.53727580527024]
We propose an integrated solution based on deep neural networks for temporal sets prediction.
A unique perspective is to learn element relationships by constructing a set-level co-occurrence graph.
We design an attention-based module to adaptively learn the temporal dependency of elements and sets.
arXiv Detail & Related papers (2020-06-20T03:29:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.