Rethinking Contrastive Learning in Session-based Recommendation
- URL: http://arxiv.org/abs/2506.05044v1
- Date: Thu, 05 Jun 2025 13:52:57 GMT
- Title: Rethinking Contrastive Learning in Session-based Recommendation
- Authors: Xiaokun Zhang, Bo Xu, Fenglong Ma, Zhizheng Wang, Liang Yang, Hongfei Lin,
- Abstract summary: Session-based recommendation aims to predict intents of anonymous users based on limited behaviors.<n>We propose a novel multi-modal adaptive contrastive learning framework called MACL for session-based recommendation.
- Score: 31.888392523713435
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Session-based recommendation aims to predict intents of anonymous users based on limited behaviors. With the ability in alleviating data sparsity, contrastive learning is prevailing in the task. However, we spot that existing contrastive learning based methods still suffer from three obstacles: (1) they overlook item-level sparsity and primarily focus on session-level sparsity; (2) they typically augment sessions using item IDs like crop, mask and reorder, failing to ensure the semantic consistency of augmented views; (3) they treat all positive-negative signals equally, without considering their varying utility. To this end, we propose a novel multi-modal adaptive contrastive learning framework called MACL for session-based recommendation. In MACL, a multi-modal augmentation is devised to generate semantically consistent views at both item and session levels by leveraging item multi-modal features. Besides, we present an adaptive contrastive loss that distinguishes varying contributions of positive-negative signals to improve self-supervised learning. Extensive experiments on three real-world datasets demonstrate the superiority of MACL over state-of-the-art methods.
Related papers
- Stochastic Encodings for Active Feature Acquisition [100.47043816019888]
Active Feature Acquisition is an instance-wise, sequential decision making problem.<n>The aim is to dynamically select which feature to measure based on current observations, independently for each test instance.<n>Common approaches either use Reinforcement Learning, which experiences training difficulties, or greedily maximize the conditional mutual information of the label and unobserved features, which makes myopic.<n>We introduce a latent variable model, trained in a supervised manner. Acquisitions are made by reasoning about the features across many possible unobserved realizations in a latent space.
arXiv Detail & Related papers (2025-08-03T23:48:46Z) - MoCa: Modality-aware Continual Pre-training Makes Better Bidirectional Multimodal Embeddings [75.0617088717528]
MoCa is a framework for transforming pre-trained VLM backbones into effective bidirectional embedding models.<n>MoCa consistently improves performance across MMEB and ViDoRe-v2 benchmarks, achieving new state-of-the-art results.
arXiv Detail & Related papers (2025-06-29T06:41:00Z) - Semantic-Aligned Learning with Collaborative Refinement for Unsupervised VI-ReID [82.12123628480371]
Unsupervised person re-identification (USL-VI-ReID) seeks to match pedestrian images of the same individual across different modalities without human annotations for model learning.<n>Previous methods unify pseudo-labels of cross-modality images through label association algorithms and then design contrastive learning framework for global feature learning.<n>We propose a Semantic-Aligned Learning with Collaborative Refinement (SALCR) framework, which builds up objective for specific fine-grained patterns emphasized by each modality.
arXiv Detail & Related papers (2025-04-27T13:58:12Z) - Semantic Retrieval Augmented Contrastive Learning for Sequential Recommendation [17.18176550968383]
We propose a novel approach named Semantic Retrieval Augmented Contrastive Learning (SRA-CL), which leverages semantic information to improve the reliability of contrastive samples.<n>SRA-CL comprises two main components: (1) Cross-Sequence Contrastive Learning via User Semantic Retrieval, which utilizes large language models (LLMs) to understand diverse user preferences and retrieve semantically similar users to form reliable positive samples through a learnable sample method; and (2) Intra-Sequence Contrastive Learning via Item Semantic Retrieval, which employs LLMs to comprehend items and retrieve similar items to perform semantic-based item substitution
arXiv Detail & Related papers (2025-03-06T07:25:19Z) - Learning the Unlearned: Mitigating Feature Suppression in Contrastive Learning [45.25602203155762]
Self-Supervised Contrastive Learning has proven effective in deriving high-quality representations from unlabeled data.
A major challenge that hinders both unimodal and multimodal contrastive learning is feature suppression.
We propose a novel model-agnostic Multistage Contrastive Learning framework.
arXiv Detail & Related papers (2024-02-19T04:13:33Z) - Improving One-class Recommendation with Multi-tasking on Various
Preference Intensities [1.8416014644193064]
In one-class recommendation, it's required to make recommendations based on users' implicit feedback.
We propose a multi-tasking framework taking various preference intensities of each signal from implicit feedback into consideration.
Our method performs better than state-of-the-art methods by a large margin on three large-scale real-world benchmark datasets.
arXiv Detail & Related papers (2024-01-18T18:59:55Z) - Adaptive In-Context Learning with Large Language Models for Bundle Generation [31.667010709144773]
This paper explores two interrelated tasks, i.e., personalized bundle generation and the underlying intent inference, based on different user sessions.<n>Inspired by the reasoning capabilities of large language models (LLMs), we propose an adaptive in-context learning paradigm.<n> Experiments on three real-world datasets demonstrate the effectiveness of our proposed method.
arXiv Detail & Related papers (2023-12-26T08:24:24Z) - DRDT: Dynamic Reflection with Divergent Thinking for LLM-based
Sequential Recommendation [53.62727171363384]
We introduce a novel reasoning principle: Dynamic Reflection with Divergent Thinking.
Our methodology is dynamic reflection, a process that emulates human learning through probing, critiquing, and reflecting.
We evaluate our approach on three datasets using six pre-trained LLMs.
arXiv Detail & Related papers (2023-12-18T16:41:22Z) - DUET: Cross-modal Semantic Grounding for Contrastive Zero-shot Learning [37.48292304239107]
We present a transformer-based end-to-end ZSL method named DUET.
We develop a cross-modal semantic grounding network to investigate the model's capability of disentangling semantic attributes from the images.
We find that DUET can often achieve state-of-the-art performance, its components are effective and its predictions are interpretable.
arXiv Detail & Related papers (2022-07-04T11:12:12Z) - Intent Contrastive Learning for Sequential Recommendation [86.54439927038968]
We introduce a latent variable to represent users' intents and learn the distribution function of the latent variable via clustering.
We propose to leverage the learned intents into SR models via contrastive SSL, which maximizes the agreement between a view of sequence and its corresponding intent.
Experiments conducted on four real-world datasets demonstrate the superiority of the proposed learning paradigm.
arXiv Detail & Related papers (2022-02-05T09:24:13Z) - Dense Contrastive Visual-Linguistic Pretraining [53.61233531733243]
Several multimodal representation learning approaches have been proposed that jointly represent image and text.
These approaches achieve superior performance by capturing high-level semantic information from large-scale multimodal pretraining.
We propose unbiased Dense Contrastive Visual-Linguistic Pretraining to replace the region regression and classification with cross-modality region contrastive learning.
arXiv Detail & Related papers (2021-09-24T07:20:13Z) - Few-Shot Fine-Grained Action Recognition via Bidirectional Attention and
Contrastive Meta-Learning [51.03781020616402]
Fine-grained action recognition is attracting increasing attention due to the emerging demand of specific action understanding in real-world applications.
We propose a few-shot fine-grained action recognition problem, aiming to recognize novel fine-grained actions with only few samples given for each class.
Although progress has been made in coarse-grained actions, existing few-shot recognition methods encounter two issues handling fine-grained actions.
arXiv Detail & Related papers (2021-08-15T02:21:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.