Related papers: GAP-Net: Calibrating User Intent via Gated Adaptive Progressive Learning for CTR Prediction

GAP-Net: Calibrating User Intent via Gated Adaptive Progressive Learning for CTR Prediction

URL: http://arxiv.org/abs/2601.07613v2
Date: Wed, 14 Jan 2026 02:43:24 GMT
Title: GAP-Net: Calibrating User Intent via Gated Adaptive Progressive Learning for CTR Prediction
Authors: Shenqiang Ke, Jianxiong Wei, Qingsong Hua,
Abstract summary: GAP-Net is a unified framework establishing a "Triple Gating" architecture to progressively refine information from micro-level features to macro-level views.<n>It achieves substantial improvements over state-of-the-art baselines, exhibiting superior robustness against interaction noise and intent drift.
Score: 0.6372261626436676
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Sequential user behavior modeling is pivotal for Click-Through Rate (CTR) prediction yet is hindered by three intrinsic bottlenecks: (1) the "Attention Sink" phenomenon, where standard Softmax compels the model to allocate probability mass to noisy behaviors; (2) the Static Query Assumption, which overlooks dynamic shifts in user intent driven by real-time contexts; and (3) Rigid View Aggregation, which fails to adaptively weight heterogeneous temporal signals according to the decision context. To bridge these gaps, we propose GAP-Net (Gated Adaptive Progressive Network), a unified framework establishing a "Triple Gating" architecture to progressively refine information from micro-level features to macro-level views. GAP-Net operates through three integrated mechanisms: (1) Adaptive Sparse-Gated Attention (ASGA) employs micro-level gating to enforce sparsity, effectively suppressing massive noise activations; (2) Gated Cascading Query Calibration (GCQC) dynamically aligns user intent by bridging real-time triggers and long-term memories via a meso-level cascading channel; and (3) Context-Gated Denoising Fusion (CGDF) performs macro-level modulation to orchestrate the aggregation of multi-view sequences. Extensive experiments on industrial datasets demonstrate that GAP-Net achieves substantial improvements over state-of-the-art baselines, exhibiting superior robustness against interaction noise and intent drift.

Related papers

Beyond the Flat Sequence: Hierarchical and Preference-Aware Generative Recommendations [35.58864660038236]
We propose a novel framework named HPGR (Hierarchical and Preference-aware Generative Recommender)<n>First, a structure-aware pre-training stage employs a session-based Masked Item Modeling objective to learn a hierarchically-informed and semantically rich item representation space.<n>Second, a preference-aware fine-tuning stage leverages these powerful representations to implement a Preference-Guided Sparse Attention mechanism.
arXiv Detail & Related papers (2026-03-01T08:15:34Z)
Cascading multi-agent anomaly detection in surveillance systems via vision-language models and embedding-based classification [0.0]
This work introduces a cascading multi-agent framework that unifies complementary paradigms into a coherent and interpretable architecture.<n>Early modules perform reconstruction-gated filtering and object-level assessment, while higher-level reasoning agents are selectively invoked to interpret semantically ambiguous events.<n>The framework advances beyond conventional detection pipelines by combining early-exit efficiency, adaptive multi-agent reasoning, and explainable anomaly attribution, establishing a reproducible and energy-efficient foundation for scalable intelligent visual monitoring.
arXiv Detail & Related papers (2026-01-08T11:31:47Z)
Adaptive Causal Coordination Detection for Social Media: A Memory-Guided Framework with Semi-Supervised Learning [1.5683405037750644]
ACCD adopts a three-stage, progressive architecture that leverages a memory-guided adaptive mechanism to learn and retain optimal detection configurations.<n>We conduct a comprehensive evaluation using real-world datasets, including the Twitter IRA dataset, Reddit coordination traces, and several widely-adopted bot detection benchmarks.<n>ACCD achieves an F1-score of 87.3% in coordinated attack detection, representing a 15.2% improvement over the strongest existing baseline.
arXiv Detail & Related papers (2026-01-01T17:27:52Z)
IAR2: Improving Autoregressive Visual Generation with Semantic-Detail Associated Token Prediction [77.06211178777939]
IAR2 is an advanced autoregressive framework that enables a hierarchical semantic-detail synthesis process.<n>We show that IAR2 sets a new state-of-the-art for autoregressive image generation, achieving a FID of 1.50 on ImageNet.
arXiv Detail & Related papers (2025-10-08T12:08:21Z)
Towards Efficient General Feature Prediction in Masked Skeleton Modeling [59.46799426434277]
We propose a novel General Feature Prediction framework (GFP) for efficient mask skeleton modeling.<n>Our key innovation is replacing conventional low-level reconstruction with high-level feature prediction that spans from local motion patterns to global semantic representations.
arXiv Detail & Related papers (2025-09-03T18:05:02Z)
Contrastive Matrix Completion with Denoising and Augmented Graph Views for Robust Recommendation [1.0128808054306186]
Matrix completion is a widely adopted framework in recommender systems.<n>We propose a novel method called Matrix Completion using Contrastive Learning (MCCL)<n>Our approach not only improves the numerical accuracy of the predicted scores--but also produces superior rankings with improvements of up to 36% in ranking metrics.
arXiv Detail & Related papers (2025-06-12T12:47:35Z)
Analysis of Anonymous User Interaction Relationships and Prediction of Advertising Feedback Based on Graph Neural Network [5.250286096386298]
We propose Decoupled Temporal-Hierarchical Graph Neural Network (DTH-GNN), which achieves three main contributions.<n> Firstly, we introduce temporal edge decomposition, which divides each interaction into three types of channels: short-term burst, diurnal cycle and long-range memory, and conducts feature extraction using the convolution kernel of parallel dilated residuals.<n>Thirdly, the contrast of feedback perception is formulated, the consistency of various time slices is maximized, the entropy of control exposure information with dual-view target is maximized, and the global prototype of dual-momentum queue distillation is presented.
arXiv Detail & Related papers (2025-06-11T04:40:24Z)
Divide-and-Conquer: Confluent Triple-Flow Network for RGB-T Salient Object Detection [70.84835546732738]
RGB-Thermal Salient Object Detection aims to pinpoint prominent objects within aligned pairs of visible and thermal infrared images.<n>Traditional encoder-decoder architectures may not have adequately considered the robustness against noise originating from defective modalities.<n>We propose the ConTriNet, a robust Confluent Triple-Flow Network employing a Divide-and-Conquer strategy.
arXiv Detail & Related papers (2024-12-02T14:44:39Z)
S^2Former-OR: Single-Stage Bi-Modal Transformer for Scene Graph Generation in OR [50.435592120607815]
Scene graph generation (SGG) of surgical procedures is crucial in enhancing holistically cognitive intelligence in the operating room (OR) Previous works have primarily relied on multi-stage learning, where the generated semantic scene graphs depend on intermediate processes with pose estimation and object detection. In this study, we introduce a novel single-stage bi-modal transformer framework for SGG in the OR, termed S2Former-OR.
arXiv Detail & Related papers (2024-02-22T11:40:49Z)
Frequency Perception Network for Camouflaged Object Detection [51.26386921922031]
We propose a novel learnable and separable frequency perception mechanism driven by the semantic hierarchy in the frequency domain.<n>Our entire network adopts a two-stage model, including a frequency-guided coarse localization stage and a detail-preserving fine localization stage.<n>Compared with the currently existing models, our proposed method achieves competitive performance in three popular benchmark datasets.
arXiv Detail & Related papers (2023-08-17T11:30:46Z)
CMGAN: Conformer-Based Metric-GAN for Monaural Speech Enhancement [5.766499647507758]
We further develop the conformer-based metric generative adversarial network (CMGAN) model for speech enhancement (SE) in the time-frequency (TF) domain. Our findings show that CMGAN outperforms existing state-of-the-art methods in the three major speech enhancement tasks: denoising, dereverberation, and super-resolution.
arXiv Detail & Related papers (2022-09-22T15:50:21Z)
GCoNet+: A Stronger Group Collaborative Co-Salient Object Detector [156.43671738038657]
We present a novel end-to-end group collaborative learning network, termed GCoNet+. GCoNet+ can effectively and efficiently identify co-salient objects in natural scenes.
arXiv Detail & Related papers (2022-05-30T23:49:19Z)

This list is automatically generated from the titles and abstracts of the papers in this site.