GAP-Net: Calibrating User Intent via Gated Adaptive Progressive Learning for CTR Prediction
- URL: http://arxiv.org/abs/2601.07613v2
- Date: Wed, 14 Jan 2026 02:43:24 GMT
- Title: GAP-Net: Calibrating User Intent via Gated Adaptive Progressive Learning for CTR Prediction
- Authors: Shenqiang Ke, Jianxiong Wei, Qingsong Hua
- Abstract summary: GAP-Net is a unified framework establishing a "Triple Gating" architecture to progressively refine information from micro-level features to macro-level views. It achieves substantial improvements over state-of-the-art baselines, exhibiting superior robustness against interaction noise and intent drift.
- Score: 0.6372261626436676
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sequential user behavior modeling is pivotal for Click-Through Rate (CTR) prediction yet is hindered by three intrinsic bottlenecks: (1) the "Attention Sink" phenomenon, where standard Softmax compels the model to allocate probability mass to noisy behaviors; (2) the Static Query Assumption, which overlooks dynamic shifts in user intent driven by real-time contexts; and (3) Rigid View Aggregation, which fails to adaptively weight heterogeneous temporal signals according to the decision context. To bridge these gaps, we propose GAP-Net (Gated Adaptive Progressive Network), a unified framework establishing a "Triple Gating" architecture to progressively refine information from micro-level features to macro-level views. GAP-Net operates through three integrated mechanisms: (1) Adaptive Sparse-Gated Attention (ASGA) employs micro-level gating to enforce sparsity, effectively suppressing massive noise activations; (2) Gated Cascading Query Calibration (GCQC) dynamically aligns user intent by bridging real-time triggers and long-term memories via a meso-level cascading channel; and (3) Context-Gated Denoising Fusion (CGDF) performs macro-level modulation to orchestrate the aggregation of multi-view sequences. Extensive experiments on industrial datasets demonstrate that GAP-Net achieves substantial improvements over state-of-the-art baselines, exhibiting superior robustness against interaction noise and intent drift.
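The first bottleneck named in the abstract can be illustrated with a small sketch. Softmax weights always sum to 1, so even a history made up entirely of irrelevant behaviors receives full attention mass. The code below contrasts plain softmax with softmax weights scaled by per-behavior sigmoid gates. This gating form, the function names, and the numbers are assumptions chosen for illustration; the abstract does not specify how ASGA is actually computed.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_weights(scores, gate_logits):
    """Softmax weights scaled elementwise by sigmoid gates.

    Hypothetical gating form for illustration only, not the paper's ASGA.
    """
    return [w * sigmoid(g) for w, g in zip(softmax(scores), gate_logits)]

# Every behavior is irrelevant to the target: uniformly low relevance scores.
scores = [-4.0, -4.2, -3.9, -4.1]
# Hypothetical gate logits, strongly negative so the gates are nearly closed.
gate_logits = [-6.0, -6.0, -6.0, -6.0]

plain = softmax(scores)
gated = gated_weights(scores, gate_logits)

print(sum(plain))  # total softmax mass stays at 1 even on pure noise
print(sum(gated))  # gated total mass can shrink toward 0
```

In this sketch the gate decouples "how attention is distributed" from "how much attention is paid at all", which is the property a softmax-only layer lacks.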
Related papers
- Beyond the Flat Sequence: Hierarchical and Preference-Aware Generative Recommendations [35.58864660038236]
We propose a novel framework named HPGR (Hierarchical and Preference-aware Generative Recommender). First, a structure-aware pre-training stage employs a session-based Masked Item Modeling objective to learn a hierarchically-informed and semantically rich item representation space. Second, a preference-aware fine-tuning stage leverages these powerful representations to implement a Preference-Guided Sparse Attention mechanism.
arXiv Detail & Related papers (2026-03-01T08:15:34Z)
- Cascading multi-agent anomaly detection in surveillance systems via vision-language models and embedding-based classification [0.0]
This work introduces a cascading multi-agent framework that unifies complementary paradigms into a coherent and interpretable architecture. Early modules perform reconstruction-gated filtering and object-level assessment, while higher-level reasoning agents are selectively invoked to interpret semantically ambiguous events. The framework advances beyond conventional detection pipelines by combining early-exit efficiency, adaptive multi-agent reasoning, and explainable anomaly attribution, establishing a reproducible and energy-efficient foundation for scalable intelligent visual monitoring.
arXiv Detail & Related papers (2026-01-08T11:31:47Z)
- Adaptive Causal Coordination Detection for Social Media: A Memory-Guided Framework with Semi-Supervised Learning [1.5683405037750644]
ACCD adopts a three-stage, progressive architecture that leverages a memory-guided adaptive mechanism to learn and retain optimal detection configurations. We conduct a comprehensive evaluation using real-world datasets, including the Twitter IRA dataset, Reddit coordination traces, and several widely-adopted bot detection benchmarks. ACCD achieves an F1-score of 87.3% in coordinated attack detection, representing a 15.2% improvement over the strongest existing baseline.
arXiv Detail & Related papers (2026-01-01T17:27:52Z)
- IAR2: Improving Autoregressive Visual Generation with Semantic-Detail Associated Token Prediction [77.06211178777939]
IAR2 is an advanced autoregressive framework that enables a hierarchical semantic-detail synthesis process. We show that IAR2 sets a new state-of-the-art for autoregressive image generation, achieving a FID of 1.50 on ImageNet.
arXiv Detail & Related papers (2025-10-08T12:08:21Z)
- Towards Efficient General Feature Prediction in Masked Skeleton Modeling [59.46799426434277]
We propose a novel General Feature Prediction framework (GFP) for efficient mask skeleton modeling. Our key innovation is replacing conventional low-level reconstruction with high-level feature prediction that spans from local motion patterns to global semantic representations.
arXiv Detail & Related papers (2025-09-03T18:05:02Z)
- Contrastive Matrix Completion with Denoising and Augmented Graph Views for Robust Recommendation [1.0128808054306186]
Matrix completion is a widely adopted framework in recommender systems. We propose a novel method called Matrix Completion using Contrastive Learning (MCCL). Our approach not only improves the numerical accuracy of the predicted scores but also produces superior rankings, with improvements of up to 36% in ranking metrics.
arXiv Detail & Related papers (2025-06-12T12:47:35Z)
- Analysis of Anonymous User Interaction Relationships and Prediction of Advertising Feedback Based on Graph Neural Network [5.250286096386298]
We propose Decoupled Temporal-Hierarchical Graph Neural Network (DTH-GNN), which achieves three main contributions. Firstly, we introduce temporal edge decomposition, which divides each interaction into three types of channels: short-term burst, diurnal cycle and long-range memory, and conducts feature extraction using the convolution kernel of parallel dilated residuals. Thirdly, the contrast of feedback perception is formulated, the consistency of various time slices is maximized, the entropy of control exposure information with dual-view target is maximized, and the global prototype of dual-momentum queue distillation is presented.
arXiv Detail & Related papers (2025-06-11T04:40:24Z)
- Divide-and-Conquer: Confluent Triple-Flow Network for RGB-T Salient Object Detection [70.84835546732738]
RGB-Thermal Salient Object Detection aims to pinpoint prominent objects within aligned pairs of visible and thermal infrared images. Traditional encoder-decoder architectures may not have adequately considered the robustness against noise originating from defective modalities. We propose the ConTriNet, a robust Confluent Triple-Flow Network employing a Divide-and-Conquer strategy.
arXiv Detail & Related papers (2024-12-02T14:44:39Z)
- S^2Former-OR: Single-Stage Bi-Modal Transformer for Scene Graph Generation in OR [50.435592120607815]
Scene graph generation (SGG) of surgical procedures is crucial in enhancing holistically cognitive intelligence in the operating room (OR).
Previous works have primarily relied on multi-stage learning, where the generated semantic scene graphs depend on intermediate processes with pose estimation and object detection.
In this study, we introduce a novel single-stage bi-modal transformer framework for SGG in the OR, termed S2Former-OR.
arXiv Detail & Related papers (2024-02-22T11:40:49Z)
- Frequency Perception Network for Camouflaged Object Detection [51.26386921922031]
We propose a novel learnable and separable frequency perception mechanism driven by the semantic hierarchy in the frequency domain. Our entire network adopts a two-stage model, including a frequency-guided coarse localization stage and a detail-preserving fine localization stage. Compared with the currently existing models, our proposed method achieves competitive performance in three popular benchmark datasets.
arXiv Detail & Related papers (2023-08-17T11:30:46Z)
- CMGAN: Conformer-Based Metric-GAN for Monaural Speech Enhancement [5.766499647507758]
We further develop the conformer-based metric generative adversarial network (CMGAN) model for speech enhancement (SE) in the time-frequency (TF) domain.
Our findings show that CMGAN outperforms existing state-of-the-art methods in the three major speech enhancement tasks: denoising, dereverberation, and super-resolution.
arXiv Detail & Related papers (2022-09-22T15:50:21Z)
- GCoNet+: A Stronger Group Collaborative Co-Salient Object Detector [156.43671738038657]
We present a novel end-to-end group collaborative learning network, termed GCoNet+.
GCoNet+ can effectively and efficiently identify co-salient objects in natural scenes.
arXiv Detail & Related papers (2022-05-30T23:49:19Z)