Related papers: FLIP: Towards Fine-grained Alignment between ID-based Models and Pretrained Language Models for CTR Prediction

FLIP: Towards Fine-grained Alignment between ID-based Models and Pretrained Language Models for CTR Prediction

URL: http://arxiv.org/abs/2310.19453v3
Date: Tue, 7 May 2024 06:44:46 GMT
Title: FLIP: Towards Fine-grained Alignment between ID-based Models and Pretrained Language Models for CTR Prediction
Authors: Hangyu Wang, Jianghao Lin, Xiangyang Li, Bo Chen, Chenxu Zhu, Ruiming Tang, Weinan Zhang, Yong Yu,
Abstract summary: We propose to conduct Fine-grained feature-level ALignment between ID-based Models and Pretrained Language Models (FLIP) for click-through rate (CTR) prediction. Specifically, the masked data of one modality (i.e., tokens or features) has to be recovered with the help of the other modality, which establishes the feature-level interaction and alignment. Experiments on three real-world datasets demonstrate that FLIP outperforms SOTA baselines, and is highly compatible for various ID-based models and PLMs.
Score: 49.510163437116645
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Click-through rate (CTR) prediction plays as a core function module in various personalized online services. The traditional ID-based models for CTR prediction take as inputs the one-hot encoded ID features of tabular modality, which capture the collaborative signals via feature interaction modeling. But the one-hot encoding discards the semantic information conceived in the original feature texts. Recently, the emergence of Pretrained Language Models (PLMs) has given rise to another paradigm, which takes as inputs the sentences of textual modality obtained by hard prompt templates and adopts PLMs to extract the semantic knowledge. However, PLMs generally tokenize the input text data into subword tokens and ignore field-wise collaborative signals. Therefore, these two lines of research focus on different characteristics of the same input data (i.e., textual and tabular modalities), forming a distinct complementary relationship with each other. In this paper, we propose to conduct Fine-grained feature-level ALignment between ID-based Models and Pretrained Language Models (FLIP) for CTR prediction. We design a novel joint reconstruction pretraining task for both masked language and tabular modeling. Specifically, the masked data of one modality (i.e., tokens or features) has to be recovered with the help of the other modality, which establishes the feature-level interaction and alignment via sufficient mutual information extraction between dual modalities. Moreover, we propose to jointly finetune the ID-based model and PLM for downstream CTR prediction tasks, thus achieving superior performance by combining the advantages of both models. Extensive experiments on three real-world datasets demonstrate that FLIP outperforms SOTA baselines, and is highly compatible for various ID-based models and PLMs. The code is at \url{https://github.com/justarter/FLIP}.

Related papers

FineCLIPER: Multi-modal Fine-grained CLIP for Dynamic Facial Expression Recognition with AdaptERs [5.35588281968644]
We propose a novel framework, named Multi-modal Fine-grained CLIP for Dynamic Facial Expression Recognition with AdaptERs (Fine CLIPER) Our Fine CLIPER achieves tunable SOTA performance on the DFEW, FERV39k, and MAFW datasets with few parameters.
arXiv Detail & Related papers (2024-07-02T10:55:43Z)
CELA: Cost-Efficient Language Model Alignment for CTR Prediction [71.85120354973073]
Click-Through Rate (CTR) prediction holds a paramount position in recommender systems. Recent efforts have sought to mitigate these challenges by integrating Pre-trained Language Models (PLMs) We propose textbfCost-textbfEfficient textbfLanguage Model textbfAlignment (textbfCELA) for CTR prediction.
arXiv Detail & Related papers (2024-05-17T07:43:25Z)
ClickPrompt: CTR Models are Strong Prompt Generators for Adapting Language Models to CTR Prediction [45.15127775876369]
Click-through rate (CTR) prediction has become increasingly indispensable for various Internet applications. Traditional CTR models convert the multi-field categorical data into ID features via one-hot encoding, and extract the collaborative signals among features. We propose a novel model-agnostic framework (i.e., ClickPrompt) where we incorporate CTR models to generate interaction-aware soft prompts.
arXiv Detail & Related papers (2023-10-13T16:37:53Z)
Co-Driven Recognition of Semantic Consistency via the Fusion of Transformer and HowNet Sememes Knowledge [6.184249194474601]
This paper proposes a co-driven semantic consistency recognition method based on the fusion of Transformer and HowNet sememes knowledge. BiLSTM is exploited to encode the conceptual semantic information and infer the semantic consistency.
arXiv Detail & Related papers (2023-02-21T09:53:19Z)
Guiding the PLMs with Semantic Anchors as Intermediate Supervision: Towards Interpretable Semantic Parsing [57.11806632758607]
We propose to incorporate the current pretrained language models with a hierarchical decoder network. By taking the first-principle structures as the semantic anchors, we propose two novel intermediate supervision tasks. We conduct intensive experiments on several semantic parsing benchmarks and demonstrate that our approach can consistently outperform the baselines.
arXiv Detail & Related papers (2022-10-04T07:27:29Z)
Incorporating Linguistic Knowledge for Abstractive Multi-document Summarization [20.572283625521784]
We develop a neural network based abstractive multi-document summarization (MDS) model. We process the dependency information into the linguistic-guided attention mechanism. With the help of linguistic signals, sentence-level relations can be correctly captured.
arXiv Detail & Related papers (2021-09-23T08:13:35Z)
Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting. Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking. We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training [152.63467944568094]
We propose to pre-train a unified language model for both autoencoding and partially autoregressive language modeling tasks. Our experiments show that the unified language models pre-trained using PMLM achieve new state-of-the-art results on a wide range of natural language understanding and generation tasks.
arXiv Detail & Related papers (2020-02-28T15:28:49Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.