DELTA: Dynamic Embedding Learning with Truncated Conscious Attention for
CTR Prediction
- URL: http://arxiv.org/abs/2305.04891v3
- Date: Tue, 5 Sep 2023 07:24:00 GMT
- Title: DELTA: Dynamic Embedding Learning with Truncated Conscious Attention for
CTR Prediction
- Authors: Chen Zhu, Liang Du, Hong Chen, Shuang Zhao, Zixun Sun, Xin Wang, Wenwu
Zhu
- Abstract summary: Click-Through Rate (CTR) prediction is a pivotal task in product and content recommendation.
We propose a model that enables Dynamic Embedding Learning with Truncated Conscious Attention for CTR prediction.
- Score: 61.68415731896613
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Click-Through Rate (CTR) prediction is a pivotal task in product and content
recommendation, where learning effective feature embeddings is of great
significance. However, traditional methods typically learn fixed feature
representations without dynamically refining feature representations according
to the context information, leading to suboptimal performance. Some recent
approaches attempt to address this issue by learning bit-wise weights or
augmented embeddings for feature representations, but suffer from uninformative
or redundant features in the context. To tackle this problem, inspired by the
Global Workspace Theory in conscious processing, which posits that only a
specific subset of the product features is pertinent while the rest can be
noisy and even detrimental to human-click behaviors, we propose a CTR model
that enables Dynamic Embedding Learning with Truncated Conscious Attention for
CTR prediction, termed DELTA. DELTA contains two key components: (I) conscious
truncation module (CTM), which utilizes curriculum learning to apply adaptive
truncation on attention weights to select the most critical feature in the
context; (II) explicit embedding optimization (EEO), which applies an auxiliary
task during training that directly and independently propagates the gradient
from the loss layer to the embedding layer, thereby optimizing the embedding
explicitly via linear feature crossing. Extensive experiments on five
challenging CTR datasets demonstrate that DELTA achieves new state-of-the-art
performance among current CTR methods.
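To make the two components concrete, here is a minimal PyTorch-style sketch of how a truncated conscious attention module and an explicit-embedding auxiliary head could be wired together. All names and hyperparameters (ConsciousTruncationAttention, DELTASketch, top_k, the auxiliary loss weight) are illustrative assumptions; the paper's curriculum schedule for the truncation and its exact linear feature-crossing form are not reproduced here.

```python
# Minimal, illustrative sketch of the two components described in the abstract;
# names, shapes, and the fixed top-k truncation rule are assumptions, not the
# authors' exact formulation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConsciousTruncationAttention(nn.Module):
    """Attention over field embeddings that keeps only the top-k weights per
    query field, a stand-in for the CTM's adaptive truncation (which the
    paper schedules with curriculum learning)."""
    def __init__(self, emb_dim, top_k):
        super().__init__()
        self.query = nn.Linear(emb_dim, emb_dim)
        self.key = nn.Linear(emb_dim, emb_dim)
        self.value = nn.Linear(emb_dim, emb_dim)
        self.top_k = top_k

    def forward(self, fields):                        # fields: (B, F, D)
        q, k, v = self.query(fields), self.key(fields), self.value(fields)
        scores = q @ k.transpose(-2, -1) / fields.size(-1) ** 0.5
        # Keep only the k largest attention logits per row; mask the rest.
        topk = scores.topk(self.top_k, dim=-1).indices
        mask = torch.full_like(scores, float("-inf")).scatter(-1, topk, 0.0)
        attn = F.softmax(scores + mask, dim=-1)
        return attn @ v                               # context-refined embeddings

class DELTASketch(nn.Module):
    def __init__(self, num_features, num_fields, emb_dim=16, top_k=4):
        super().__init__()
        self.embedding = nn.Embedding(num_features, emb_dim)
        self.ctm = ConsciousTruncationAttention(emb_dim, top_k)
        self.mlp = nn.Sequential(nn.Linear(num_fields * emb_dim, 64),
                                 nn.ReLU(), nn.Linear(64, 1))
        # EEO-style auxiliary head: a linear crossing of the raw embeddings,
        # so its gradient reaches the embedding table directly.
        self.eeo = nn.Linear(num_fields * emb_dim, 1)

    def forward(self, x):                             # x: (B, F) feature ids
        emb = self.embedding(x)                       # (B, F, D)
        main_logit = self.mlp(self.ctm(emb).flatten(1))
        aux_logit = self.eeo(emb.flatten(1))          # used only during training
        return main_logit, aux_logit

# Training would combine both heads, e.g.
#   loss = bce(main_logit, y) + alpha * bce(aux_logit, y)
```

The point the sketch tries to capture is that the auxiliary head receives the raw embeddings directly, so its gradient reaches the embedding table without passing through the deep tower.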
Related papers
- Denoising Pre-Training and Customized Prompt Learning for Efficient Multi-Behavior Sequential Recommendation [69.60321475454843]
We propose DPCPL, the first pre-training and prompt-tuning paradigm tailored for Multi-Behavior Sequential Recommendation.
In the pre-training stage, we propose a novel Efficient Behavior Miner (EBM) to filter out the noise at multiple time scales.
Subsequently, we propose to tune the pre-trained model in a highly efficient manner with the proposed Customized Prompt Learning (CPL) module.
arXiv Detail & Related papers (2024-08-21T06:48:38Z)
- Enhancing Robustness of Vision-Language Models through Orthogonality Learning and Self-Regularization [77.62516752323207]
We introduce an orthogonal fine-tuning method for efficiently fine-tuning pretrained weights and enabling enhanced robustness and generalization.
A self-regularization strategy is further exploited to maintain the stability of VLMs in terms of zero-shot generalization; the overall method is dubbed OrthSR.
For the first time, we revisit CLIP and CoOp with our method to effectively improve the model in the few-shot image classification scenario.
arXiv Detail & Related papers (2024-07-11T10:35:53Z)
- Learning to Detour: Shortcut Mitigating Augmentation for Weakly Supervised Semantic Segmentation [7.5856806269316825]
Weakly supervised semantic segmentation (WSSS) employing weak forms of labels has been actively studied to alleviate the annotation cost of acquiring pixel-level labels.
We propose shortcut mitigating augmentation (SMA) for WSSS, which generates synthetic representations of object-background combinations not seen in the training data to reduce the use of shortcut features.
arXiv Detail & Related papers (2024-05-28T13:07:35Z)
- Adaptive Retention & Correction for Continual Learning [114.5656325514408]
A common problem in continual learning is the classification layer's bias towards the most recent task.
We name our approach Adaptive Retention & Correction (ARC).
ARC achieves average performance increases of 2.7% and 2.6% on the CIFAR-100 and ImageNet-R datasets, respectively.
arXiv Detail & Related papers (2024-05-23T08:43:09Z)
- Sequential Action-Induced Invariant Representation for Reinforcement Learning [1.2046159151610263]
How to accurately learn task-relevant state representations from high-dimensional observations with visual distractions is a challenging problem in visual reinforcement learning.
We propose a Sequential Action-induced invariant Representation (SAR) method, in which the encoder is optimized by an auxiliary learner to only preserve the components that follow the control signals of sequential actions.
arXiv Detail & Related papers (2023-09-22T05:31:55Z)
- MAP: A Model-agnostic Pretraining Framework for Click-through Rate Prediction [39.48740397029264]
We propose a Model-agnostic pretraining (MAP) framework that applies feature corruption and recovery on multi-field categorical data.
We derive two practical algorithms: masked feature prediction (MFP) and replaced feature detection (RFD).
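As an illustration of the feature corruption and recovery idea, here is a minimal sketch of masked feature prediction over multi-field categorical inputs; the masking ratio, the extra [MASK] id, and the tiny Transformer encoder are assumptions, not MAP's actual configuration.

```python
# Illustrative sketch of masked feature prediction (MFP) over multi-field
# categorical data; the masking ratio, the extra [MASK] id, and the tiny
# Transformer encoder are placeholders rather than MAP's actual design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedFeaturePrediction(nn.Module):
    def __init__(self, num_features, emb_dim=16, mask_ratio=0.15):
        super().__init__()
        self.embedding = nn.Embedding(num_features + 1, emb_dim)  # +1 for [MASK]
        self.mask_id = num_features
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=emb_dim, nhead=2, batch_first=True),
            num_layers=1)
        self.head = nn.Linear(emb_dim, num_features)  # recover the original id
        self.mask_ratio = mask_ratio

    def forward(self, x):                             # x: (B, F) feature ids
        mask = torch.rand(x.shape, device=x.device) < self.mask_ratio
        corrupted = x.masked_fill(mask, self.mask_id)
        hidden = self.encoder(self.embedding(corrupted))
        logits = self.head(hidden)                    # (B, F, num_features)
        if not mask.any():                            # nothing was corrupted
            return logits.sum() * 0.0
        # Cross-entropy only on the corrupted positions (feature recovery).
        return F.cross_entropy(logits[mask], x[mask])
```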
arXiv Detail & Related papers (2023-08-03T12:55:55Z)
- ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
We introduce Action-Aware Embodied Learning for Perception (ALP).
ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective.
We show that ALP outperforms existing baselines in several downstream perception tasks.
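The inverse dynamics prediction objective mentioned above can be illustrated with a short sketch: an encoder is trained, alongside the reinforcement learning policy, to predict the action taken between two consecutive observations. The encoder, feature dimension, and discrete action space below are placeholders, not ALP's actual architecture.

```python
# Minimal sketch of an inverse-dynamics auxiliary objective: predict the
# action taken between consecutive observations from their encodings. The
# encoder and the discrete action space here are assumed placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class InverseDynamicsHead(nn.Module):
    def __init__(self, encoder, feat_dim, num_actions):
        super().__init__()
        self.encoder = encoder                        # shared visual encoder
        self.predictor = nn.Sequential(
            nn.Linear(2 * feat_dim, 128), nn.ReLU(),
            nn.Linear(128, num_actions))

    def forward(self, obs_t, obs_t1, action):
        z_t, z_t1 = self.encoder(obs_t), self.encoder(obs_t1)
        logits = self.predictor(torch.cat([z_t, z_t1], dim=-1))
        # Auxiliary loss added to the reinforcement learning objective.
        return F.cross_entropy(logits, action)

# Example usage with a toy encoder (shapes are assumptions):
#   encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 256), nn.ReLU())
#   head = InverseDynamicsHead(encoder, feat_dim=256, num_actions=6)
#   aux_loss = head(obs_t, obs_t1, action)
```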
arXiv Detail & Related papers (2023-06-16T21:51:04Z)
- Knowledge Diffusion for Distillation [53.908314960324915]
The representation gap between teacher and student is an emerging topic in knowledge distillation (KD).
We state that the essence of these methods is to discard the noisy information and distill the valuable information in the feature.
We propose a novel KD method dubbed DiffKD, to explicitly denoise and match features using diffusion models.
arXiv Detail & Related papers (2023-05-25T04:49:34Z)
- CL4CTR: A Contrastive Learning Framework for CTR Prediction [14.968714571151509]
We introduce self-supervised learning to produce high-quality feature representations directly.
We propose a model-agnostic Contrastive Learning for CTR (CL4CTR) framework consisting of three self-supervised learning signals.
CL4CTR achieves the best performance on four datasets.
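For intuition, here is a minimal sketch of one possible contrastive signal over CTR feature embeddings: two randomly perturbed views of the same input are pushed to produce similar representations. The field-dropout perturbation, projection encoder, and loss are assumptions for illustration and cover only one of CL4CTR's three signals.

```python
# Sketch of a single contrastive signal over feature embeddings: two randomly
# field-dropped views of one input should yield similar representations. The
# perturbation, encoder, and loss choices are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

def random_field_dropout(emb, drop_prob=0.1):
    """Randomly zero out whole field embeddings to create a perturbed view."""
    keep = (torch.rand(emb.shape[:2], device=emb.device) > drop_prob).float()
    return emb * keep.unsqueeze(-1)

def contrastive_loss(encoder, emb):
    """emb: (B, F, D) field embeddings; encoder maps flattened fields to a
    representation vector."""
    z1 = encoder(random_field_dropout(emb).flatten(1))
    z2 = encoder(random_field_dropout(emb).flatten(1))
    # Pull the two views of each example together (simple alignment loss on
    # normalized representations; the paper's exact loss may differ).
    return F.mse_loss(F.normalize(z1, dim=-1), F.normalize(z2, dim=-1))

# Example: encoder = nn.Sequential(nn.Linear(num_fields * emb_dim, 64),
#                                  nn.ReLU(), nn.Linear(64, 32));
# this loss would be added to the supervised CTR loss during training.
```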
arXiv Detail & Related papers (2022-12-01T14:18:02Z)
- Learning Deep Representations via Contrastive Learning for Instance Retrieval [11.736450745549792]
This paper makes the first attempt to tackle the problem using instance-discrimination-based contrastive learning (CL).
In this work, we approach this problem by exploring the capability of deriving discriminative representations from pre-trained and fine-tuned CL models.
arXiv Detail & Related papers (2022-09-28T04:36:34Z)