Multi-Modal Multi-Behavior Sequential Recommendation with Conditional Diffusion-Based Feature Denoising
- URL: http://arxiv.org/abs/2508.05352v1
- Date: Thu, 07 Aug 2025 12:58:34 GMT
- Title: Multi-Modal Multi-Behavior Sequential Recommendation with Conditional Diffusion-Based Feature Denoising
- Authors: Xiaoxi Cui, Weihai Lu, Yu Tong, Yiheng Li, Zhejun Zhao
- Abstract summary: This paper focuses on the problem of multi-modal multi-behavior sequential recommendation. We propose a novel Multi-Modal Multi-Behavior Sequential Recommendation model (M$^3$BSR). Experimental results indicate that M$^3$BSR significantly outperforms existing state-of-the-art methods on benchmark datasets.
- Score: 1.4207530018625354
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The sequential recommendation system utilizes historical user interactions to predict preferences. Effectively integrating diverse user behavior patterns with rich multimodal information of items to enhance the accuracy of sequential recommendations is an emerging and challenging research direction. This paper focuses on the problem of multi-modal multi-behavior sequential recommendation, aiming to address the following challenges: (1) the lack of effective characterization of modal preferences across different behaviors, as user attention to different item modalities varies depending on the behavior; (2) the difficulty of effectively mitigating implicit noise in user behavior, such as unintended actions like accidental clicks; (3) the inability to handle modality noise in multi-modal representations, which further impacts the accurate modeling of user preferences. To tackle these issues, we propose a novel Multi-Modal Multi-Behavior Sequential Recommendation model (M$^3$BSR). This model first removes noise in multi-modal representations using a Conditional Diffusion Modality Denoising Layer. Subsequently, it utilizes deep behavioral information to guide the denoising of shallow behavioral data, thereby alleviating the impact of noise in implicit feedback through Conditional Diffusion Behavior Denoising. Finally, by introducing a Multi-Expert Interest Extraction Layer, M$^3$BSR explicitly models the common and specific interests across behaviors and modalities to enhance recommendation performance. Experimental results indicate that M$^3$BSR significantly outperforms existing state-of-the-art methods on benchmark datasets.
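The abstract does not come with code, but the conditional denoising it describes builds on the standard DDPM forward/reverse steps with an extra conditioning signal. Below is a minimal NumPy sketch of that mechanism; `eps_model` is a hypothetical stand-in for M$^3$BSR's behavior-conditioned noise predictor, and the linear beta schedule is an assumption for illustration, not the authors' implementation.

```python
import numpy as np

def make_schedule(T=50, beta_start=1e-4, beta_end=0.02):
    # Linear variance schedule (an assumption; the paper's schedule may differ).
    betas = np.linspace(beta_start, beta_end, T)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars

def forward_noise(x0, t, alpha_bars, rng):
    # Forward diffusion: corrupt a clean feature vector x0 at step t.
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return xt, eps

def denoise_step(xt, t, cond, betas, alphas, alpha_bars, eps_model, rng):
    # One reverse step: predict the noise conditioned on `cond`
    # (e.g. a behavior embedding), then take the DDPM posterior mean.
    eps_hat = eps_model(xt, t, cond)
    mean = (xt - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
    if t > 0:
        mean = mean + np.sqrt(betas[t]) * rng.standard_normal(xt.shape)
    return mean
```

With an oracle noise predictor, a single reverse step at t = 0 recovers the clean vector exactly, which is a useful sanity check before plugging in a learned conditional model.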
Related papers
- From Agnostic to Specific: Latent Preference Diffusion for Multi-Behavior Sequential Recommendation [28.437926520491445]
Multi-behavior sequential recommendation (MBSR) aims to learn the dynamic and heterogeneous interactions of users' multi-behavior sequences. Recent concerns are shifting from behavior-fixed to behavior-specific recommendation. We propose FatsMB, a diffusion-based framework that guides preference generation.
arXiv Detail & Related papers (2026-02-26T15:48:09Z) - Test-time Adaptive Hierarchical Co-enhanced Denoising Network for Reliable Multimodal Classification [55.56234913868664]
We propose Test-time Adaptive Hierarchical Co-enhanced Denoising Network (TAHCD) for reliable learning on multimodal data. The proposed method achieves superior classification performance, robustness, and generalization compared with state-of-the-art reliable multimodal learning approaches.
arXiv Detail & Related papers (2026-01-12T03:14:12Z) - BLADE: A Behavior-Level Data Augmentation Framework with Dual Fusion Modeling for Multi-Behavior Sequential Recommendation [15.457239237638985]
BLADE is a framework that enhances multi-behavior modeling while mitigating data sparsity. We introduce a dual item-behavior fusion architecture that incorporates behavior information at both the input and intermediate levels. Three behavior-level data augmentation methods operate directly on behavior sequences rather than core item sequences.
arXiv Detail & Related papers (2025-12-15T04:02:53Z) - Empowering Denoising Sequential Recommendation with Large Language Model Embeddings [18.84444501128626]
Sequential recommendation aims to capture user preferences by modeling sequential patterns in user-item interactions. To reduce the effect of noise, some works propose explicitly identifying and removing noisy items. We propose a novel framework, Interest Alignment for Denoising Sequential Recommendation (IADSR), which integrates both collaborative and semantic information.
arXiv Detail & Related papers (2025-10-05T15:10:51Z) - HiFIRec: Towards High-Frequency yet Low-Intention Behaviors for Multi-Behavior Recommendation [10.558247582357783]
HiFIRec is a novel multi-behavior recommendation method. It corrects the effect of high-frequency yet low-intention behaviors by differential behavior modeling. Experiments on two benchmarks show that HiFIRec relatively improves HR@10 by 4.21%-6.81% over several state-of-the-art methods.
arXiv Detail & Related papers (2025-09-30T04:20:45Z) - I$^3$-MRec: Invariant Learning with Information Bottleneck for Incomplete Modality Recommendation [56.55935146424585]
We introduce I$^3$-MRec, which learns with the Information bottleneck principle for Incomplete Modality Recommendation. By treating each modality as a distinct semantic environment, I$^3$-MRec employs invariant risk minimization (IRM) to learn preference-oriented representations. I$^3$-MRec consistently outperforms existing state-of-the-art MRS methods across various modality-missing scenarios.
arXiv Detail & Related papers (2025-08-06T09:29:50Z) - Multimodal Difference Learning for Sequential Recommendation [5.243083216855681]
We argue that user interests and item relationships vary across different modalities. We propose a novel Multimodal Difference Learning framework for Sequential Recommendation, MDSRec. Results on five real-world datasets demonstrate the superiority of MDSRec over state-of-the-art baselines.
arXiv Detail & Related papers (2024-12-11T05:08:19Z) - Denoising Pre-Training and Customized Prompt Learning for Efficient Multi-Behavior Sequential Recommendation [69.60321475454843]
We propose DPCPL, the first pre-training and prompt-tuning paradigm tailored for Multi-Behavior Sequential Recommendation.
In the pre-training stage, we propose a novel Efficient Behavior Miner (EBM) to filter out the noise at multiple time scales.
Subsequently, we propose to tune the pre-trained model in a highly efficient manner with the proposed Customized Prompt Learning (CPL) module.
arXiv Detail & Related papers (2024-08-21T06:48:38Z) - Behavior-Contextualized Item Preference Modeling for Multi-Behavior Recommendation [30.715182718492244]
This paper introduces a novel approach, Behavior-Contextualized Item Preference Modeling (BCIPM) for multi-behavior recommendation.
Our proposed Behavior-Contextualized Item Preference Network discerns and learns users' specific item preferences within each behavior.
It then considers only those preferences relevant to the target behavior for final recommendations, significantly reducing noise from auxiliary behaviors.
arXiv Detail & Related papers (2024-04-28T12:46:36Z) - TruthSR: Trustworthy Sequential Recommender Systems via User-generated Multimodal Content [21.90660366765994]
We propose a trustworthy sequential recommendation method via noisy user-generated multi-modal content.
Specifically, we capture the consistency and complementarity of user-generated multi-modal content to mitigate noise interference.
In addition, we design a trustworthy decision mechanism that integrates subjective user perspective and objective item perspective.
arXiv Detail & Related papers (2024-04-26T08:23:36Z) - LD4MRec: Simplifying and Powering Diffusion Model for Multimedia Recommendation [6.914898966090197]
We propose a Light Diffusion model for Multimedia Recommendation (LD4MRec). LD4MRec employs a forward-free inference strategy, which directly predicts future behaviors from observed noisy behaviors. Experiments conducted on three real-world datasets demonstrate the effectiveness of LD4MRec.
arXiv Detail & Related papers (2023-09-27T02:12:41Z) - Exploiting Modality-Specific Features For Multi-Modal Manipulation Detection And Grounding [54.49214267905562]
We construct a transformer-based framework for multi-modal manipulation detection and grounding tasks.
Our framework simultaneously explores modality-specific features while preserving the capability for multi-modal alignment.
We propose an implicit manipulation query (IMQ) that adaptively aggregates global contextual cues within each modality.
arXiv Detail & Related papers (2023-09-22T06:55:41Z) - Diffusion Recommender Model [85.9640416600725]
We propose a novel Diffusion Recommender Model (named DiffRec) to learn the generative process in a denoising manner. To retain personalized information in user interactions, DiffRec reduces the added noise and avoids corrupting users' interactions into pure noise as in image synthesis.
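DiffRec's key departure from image diffusion, per the summary above, is corrupting interactions only partially so the personalized signal survives. A minimal NumPy sketch of such a reduced-noise forward process follows; the function name and the `noise_scale` cap are illustrative assumptions, not DiffRec's actual code.

```python
import numpy as np

def partial_forward(x0, t, alpha_bars, rng, noise_scale=0.1):
    # Standard DDPM draws eps ~ N(0, I); here the injected noise is
    # scaled down so the interaction vector x0 is attenuated at step t
    # but never destroyed into pure noise.
    eps = noise_scale * rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
```

Even at the final step, the corrupted vector stays strongly correlated with the original interactions, which is the property the reverse process exploits for personalization.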
arXiv Detail & Related papers (2023-04-11T04:31:00Z) - Diffusion Action Segmentation [63.061058214427085]
We propose a novel framework via denoising diffusion models, which shares the same inherent spirit of such iterative refinement.
In this framework, action predictions are iteratively generated from random noise with input video features as conditions.
arXiv Detail & Related papers (2023-03-31T10:53:24Z) - Coarse-to-Fine Knowledge-Enhanced Multi-Interest Learning Framework for Multi-Behavior Recommendation [52.89816309759537]
Multi-types of behaviors (e.g., clicking, adding to cart, purchasing, etc.) widely exist in most real-world recommendation scenarios.
The state-of-the-art multi-behavior models learn behavior dependencies indistinguishably with all historical interactions as input.
We propose a novel Coarse-to-fine Knowledge-enhanced Multi-interest Learning framework to learn shared and behavior-specific interests for different behaviors.
arXiv Detail & Related papers (2022-08-03T05:28:14Z) - Sequential Recommendation with Self-Attentive Multi-Adversarial Network [101.25533520688654]
We present a Multi-Factor Generative Adversarial Network (MFGAN) for explicitly modeling the effect of context information on sequential recommendation.
Our framework is flexible to incorporate multiple kinds of factor information, and is able to trace how each factor contributes to the recommendation decision over time.
arXiv Detail & Related papers (2020-05-21T12:28:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.