UniTRec: A Unified Text-to-Text Transformer and Joint Contrastive
Learning Framework for Text-based Recommendation
- URL: http://arxiv.org/abs/2305.15756v1
- Date: Thu, 25 May 2023 06:11:31 GMT
- Title: UniTRec: A Unified Text-to-Text Transformer and Joint Contrastive
Learning Framework for Text-based Recommendation
- Authors: Zhiming Mao, Huimin Wang, Yiming Du and Kam-fai Wong
- Abstract summary: Prior study has shown that pretrained language models (PLM) can boost the performance of text-based recommendation.
We propose a unified local- and global-attention Transformer encoder to better model two-level contexts of user history.
Our framework, UniTRec, unifies the contrastive objectives of discriminative matching scores and candidate text perplexity to jointly enhance text-based recommendation.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prior study has shown that pretrained language models (PLM) can boost the
performance of text-based recommendation. In contrast to previous works that
either use PLM to encode user history as a whole input text, or impose an
additional aggregation network to fuse multi-turn history representations, we
propose a unified local- and global-attention Transformer encoder to better
model two-level contexts of user history. Moreover, conditioned on user history
encoded by Transformer encoders, our framework leverages Transformer decoders
to estimate the language perplexity of candidate text items, which can serve as
a straightforward yet significant contrastive signal for user-item text
matching. Based on this, our framework, UniTRec, unifies the contrastive
objectives of discriminative matching scores and candidate text perplexity to
jointly enhance text-based recommendation. Extensive evaluation shows that
UniTRec delivers SOTA performance on three text-based recommendation tasks.
Code is available at https://github.com/Veason-silverbullet/UniTRec.
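The joint objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository above for that). It assumes an InfoNCE-style softmax contrastive loss, applied once to discriminative matching scores and once to perplexity-based scores, with the two terms summed. The function names, the `temperature` and `weight` parameters, and the use of per-token log-probabilities as input are all hypothetical choices made for illustration.

```python
import math


def perplexity(token_log_probs):
    """Perplexity of one candidate text from its per-token log-probabilities."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def contrastive_loss(scores, positive_index, temperature=1.0):
    """Negative log-softmax of the positive candidate's score (InfoNCE-style).

    Assumption: higher score means a better user-item match.
    """
    scaled = [s / temperature for s in scores]
    m = max(scaled)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scaled))
    return log_z - scaled[positive_index]


def joint_loss(matching_scores, candidate_token_log_probs, positive_index, weight=1.0):
    """Sum of a matching-score loss and a perplexity-based loss (illustrative only)."""
    # Lower perplexity should indicate a better match, so the negative
    # log-perplexity (the mean token log-probability) is used as the score.
    ppl_scores = [-math.log(perplexity(lp)) for lp in candidate_token_log_probs]
    return (contrastive_loss(matching_scores, positive_index)
            + weight * contrastive_loss(ppl_scores, positive_index))


# Hypothetical example: candidate 0 is the clicked item, candidate 1 a negative.
matching_scores = [2.0, 0.0]
token_log_probs = [[-0.1, -0.2], [-2.0, -3.0]]
loss = joint_loss(matching_scores, token_log_probs, positive_index=0)
```

In this sketch the loss is small when the positive candidate has both the highest matching score and the lowest perplexity, which is the intuition behind combining the two signals.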
Related papers
- mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval [67.50604814528553]
We first introduce a text encoder enhanced with RoPE and unpadding, pre-trained in a native 8192-token context.
Then we construct a hybrid TRM and a cross-encoder reranker by contrastive learning.
arXiv Detail & Related papers (2024-07-29T03:12:28Z)
- ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer [88.61312640540902]
We introduce the Explicit Synergy-based Text Spotting Transformer framework (ESTextSpotter).
Our model achieves explicit synergy by modeling discriminative and interactive features for text detection and recognition within a single decoder.
Experimental results demonstrate that our model significantly outperforms previous state-of-the-art methods.
arXiv Detail & Related papers (2023-08-20T03:22:23Z)
- TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision [61.186488081379]
We propose TextFormer, a query-based end-to-end text spotter with Transformer architecture.
TextFormer builds upon an image encoder and a text decoder to learn a joint semantic understanding for multi-task modeling.
It allows for mutual training and optimization of classification, segmentation, and recognition branches, resulting in deeper feature sharing.
arXiv Detail & Related papers (2023-06-06T03:37:41Z)
- Learning Vector-Quantized Item Representation for Transferable Sequential Recommenders [33.406897794088515]
VQ-Rec is a novel approach to learning Vector-Quantized item representations for transferable sequential Recommenders.
We propose an enhanced contrastive pre-training approach, using semi-synthetic and mixed-domain code representations as hard negatives.
arXiv Detail & Related papers (2022-10-22T00:43:14Z)
- JOIST: A Joint Speech and Text Streaming Model For ASR [63.15848310748753]
We present JOIST, an algorithm to train a streaming, cascaded, encoder end-to-end (E2E) model with both speech-text paired inputs, and text-only unpaired inputs.
We find that the best text representation for JOIST improves WER across a variety of search and rare-word test sets by 4-14% relative, compared to a model not trained with text.
arXiv Detail & Related papers (2022-10-13T20:59:22Z)
- M-Adapter: Modality Adaptation for End-to-End Speech-to-Text Translation [66.92823764664206]
We propose M-Adapter, a novel Transformer-based module, to adapt speech representations to text.
While shrinking the speech sequence, M-Adapter produces features desired for speech-to-text translation.
Our experimental results show that our model outperforms a strong baseline by up to 1 BLEU.
arXiv Detail & Related papers (2022-07-03T04:26:53Z)
- Text Compression-aided Transformer Encoding [77.16960983003271]
We propose explicit and implicit text compression approaches to enhance the Transformer encoding.
Backbone information, meaning the gist of the input text, is not specifically focused on in standard Transformer encoding.
Our evaluation on benchmark datasets shows that the proposed explicit and implicit text compression approaches improve results in comparison to strong baselines.
arXiv Detail & Related papers (2021-02-11T11:28:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.