Related papers: Prompt-to-Slate: Diffusion Models for Prompt-Conditioned Slate Generation

Related papers

Variable-Length Semantic IDs for Recommender Systems [0.0]
A key challenge in recommender systems is the extremely large cardinality of item spaces.<n>Existing approaches generate semantic identifiers of fixed length, assigning the same description length to all items.<n>This is inefficient, misaligned with natural language, and ignores the highly skewed frequency structure of real-world catalogs.<n>We propose a discrete variational autoencoder that learns item representations of adaptive length under a principled probabilistic framework.
arXiv Detail & Related papers (2026-02-18T11:29:05Z)
OFA-MAS: One-for-All Multi-Agent System Topology Design based on Mixture-of-Experts Graph Generative Models [57.94189874119267]
Multi-Agent Systems (MAS) offer a powerful paradigm for solving complex problems.<n>Current graph learning-based design methodologies often adhere to a "one-for-one" paradigm.<n>We propose OFA-TAD, a one-for-all framework that generates adaptive collaboration graphs for any task described in natural language.
arXiv Detail & Related papers (2026-01-19T12:23:44Z)
Test-Time Scaling Strategies for Generative Retrieval in Multimodal Conversational Recommendations [70.94563079082751]
E-commerce has exposed the limitations of traditional product retrieval systems in managing complex, multi-turn user interactions.<n>We propose a novel framework that introduces test-time scaling into conversational multimodal product retrieval.<n>Our approach builds on a generative retriever, further augmented with a test-time reranking mechanism that improves retrieval accuracy and better aligns results with evolving user intent throughout the dialogue.
arXiv Detail & Related papers (2025-08-25T15:38:56Z)
MMQ: Multimodal Mixture-of-Quantization Tokenization for Semantic ID Generation and User Behavioral Adaptation [16.81485354427923]
We propose Multimodal Mixture-of-Quantization (MMQ), a two-stage framework that trains a novel multimodal tokenizer.<n> MMQ unifies multimodal synergy, specificity, and behavioral adaptation, providing a scalable and versatile solution for both generative retrieval and discriminative ranking tasks.
arXiv Detail & Related papers (2025-08-21T06:15:49Z)
M^2VAE: Multi-Modal Multi-View Variational Autoencoder for Cold-start Item Recommendation [14.644213412218742]
Cold-start item recommendation is a significant challenge in recommendation systems.<n>Existing methods leverage multi-modal content to alleviate the cold-start issue.<n>We propose a generative model that addresses the challenges of modeling common and unique views in attribute and multi-modal features.
arXiv Detail & Related papers (2025-08-01T09:16:26Z)
Breaker: Removing Shortcut Cues with User Clustering for Single-slot Recommendation System [20.933282101797214]
In a single-slot recommendation system, users are only exposed to one item at a time, and the system cannot collect user feedback on multiple items simultaneously.<n>This paper introduces the Breaker model, which integrates an auxiliary task of user representation clustering with a multi-tower structure for cluster-specific preference modeling.<n>It has already been deployed and is actively serving tens of millions of users daily on Meituan, one of the most popular e-commerce platforms for services.
arXiv Detail & Related papers (2025-06-01T04:23:06Z)
IDEA: Inverted Text with Cooperative Deformable Aggregation for Multi-modal Object Re-Identification [60.38841251693781]
We propose a novel framework to generate robust multi-modal object ReIDs.<n>Our framework uses Modal Prefixes and InverseNet to integrate multi-modal information with semantic guidance from inverted text.<n>Experiments on three multi-modal object ReID benchmarks demonstrate the effectiveness of our proposed method.
arXiv Detail & Related papers (2025-03-13T13:00:31Z)
MixRec: Heterogeneous Graph Collaborative Filtering [21.96510707666373]
We present a graph collaborative filtering model MixRec to disentangling users' multi-behavior interaction patterns. Our model achieves this by incorporating intent disentanglement and multi-behavior modeling. We also introduce a novel contrastive learning paradigm that adaptively explores the advantages of self-supervised data augmentation.
arXiv Detail & Related papers (2024-12-18T13:12:36Z)
Facet-Aware Multi-Head Mixture-of-Experts Model for Sequential Recommendation [25.516648802281626]
We propose a novel structure called Facet-Aware Multi-Head Mixture-of-Experts Model for Sequential Recommendation (FAME) We leverage sub-embeddings from each head in the last multi-head attention layer to predict the next item separately. A gating mechanism integrates recommendations from each head and dynamically determines their importance.
arXiv Detail & Related papers (2024-11-03T06:47:45Z)
Retrieval Augmentation via User Interest Clustering [57.63883506013693]
Industrial recommender systems are sensitive to the patterns of user-item engagement. We propose a novel approach that efficiently constructs user interest and facilitates low computational cost inference. Our approach has been deployed in multiple products at Meta, facilitating short-form video related recommendation.
arXiv Detail & Related papers (2024-08-07T16:35:10Z)
LLM-ESR: Large Language Models Enhancement for Long-tailed Sequential Recommendation [58.04939553630209]
In real-world systems, most users interact with only a handful of items, while the majority of items are seldom consumed. These two issues, known as the long-tail user and long-tail item challenges, often pose difficulties for existing Sequential Recommendation systems. We propose the Large Language Models Enhancement framework for Sequential Recommendation (LLM-ESR) to address these challenges.
arXiv Detail & Related papers (2024-05-31T07:24:42Z)
MMGRec: Multimodal Generative Recommendation with Transformer Model [81.61896141495144]
MMGRec aims to introduce a generative paradigm into multimodal recommendation. We first devise a hierarchical quantization method Graph CF-RQVAE to assign Rec-ID for each item from its multimodal information. We then train a Transformer-based recommender to generate the Rec-IDs of user-preferred items based on historical interaction sequences.
arXiv Detail & Related papers (2024-04-25T12:11:27Z)
Generative Multi-modal Models are Good Class-Incremental Learners [51.5648732517187]
We propose a novel generative multi-modal model (GMM) framework for class-incremental learning. Our approach directly generates labels for images using an adapted generative model. Under the Few-shot CIL setting, we have improved by at least 14% accuracy over all the current state-of-the-art methods with significantly less forgetting.
arXiv Detail & Related papers (2024-03-27T09:21:07Z)
DiffusionGPT: LLM-Driven Text-to-Image Generation System [39.15054464137383]
DiffusionGPT offers a unified generation system capable of seamlessly accommodating various types of prompts and integrating domain-expert models. LLM parses the prompt and employs the Trees-of-Thought to guide the selection of an appropriate model, thereby relaxing input constraints. We introduce Advantage Databases, where the Tree-of-Thought is enriched with human feedback, aligning the model selection process with human preferences.
arXiv Detail & Related papers (2024-01-18T15:30:58Z)
Benchmarking Diverse-Modal Entity Linking with Generative Models [78.93737257356784]
We construct a benchmark for diverse-modal EL (DMEL) from existing EL datasets. To approach the DMEL task, we proposed a generative diverse-modal model (GDMM) following a multimodal-encoder-decoder paradigm. GDMM builds a stronger DMEL baseline, outperforming state-of-the-art task-specific EL models by 8.51 F1 score on average.
arXiv Detail & Related papers (2023-05-27T02:38:46Z)
Multi-Behavior Hypergraph-Enhanced Transformer for Sequential Recommendation [33.97708796846252]
We introduce a new Multi-Behavior Hypergraph-enhanced Transformer framework (MBHT) to capture both short-term and long-term cross-type behavior dependencies. Specifically, a multi-scale Transformer is equipped with low-rank self-attention to jointly encode behavior-aware sequential patterns from fine-grained and coarse-grained levels.
arXiv Detail & Related papers (2022-07-12T15:07:21Z)
Multi-Behavior Sequential Recommendation with Temporal Graph Transformer [66.10169268762014]
We tackle the dynamic user-item relation learning with the awareness of multi-behavior interactive patterns. We propose a new Temporal Graph Transformer (TGT) recommendation framework to jointly capture dynamic short-term and long-range user-item interactive patterns.
arXiv Detail & Related papers (2022-06-06T15:42:54Z)
Latent Structures Mining with Contrastive Modality Fusion for Multimedia Recommendation [22.701371886522494]
We argue that the latent semantic item-item structures underlying multimodal contents could be beneficial for learning better item representations. We devise a novel modality-aware structure learning module, which learns item-item relationships for each modality.
arXiv Detail & Related papers (2021-11-01T03:37:02Z)
Knowledge-Enhanced Hierarchical Graph Transformer Network for Multi-Behavior Recommendation [56.12499090935242]
This work proposes a Knowledge-Enhanced Hierarchical Graph Transformer Network (KHGT) to investigate multi-typed interactive patterns between users and items in recommender systems. KHGT is built upon a graph-structured neural architecture to capture type-specific behavior characteristics. We show that KHGT consistently outperforms many state-of-the-art recommendation methods across various evaluation settings.
arXiv Detail & Related papers (2021-10-08T09:44:00Z)
Mining Latent Structures for Multimedia Recommendation [46.70109406399858]
We propose a LATent sTructure mining method for multImodal reCommEndation, which we term LATTICE for brevity. We learn item-item structures for each modality and aggregates multiple modalities to obtain latent item graphs. Based on the learned latent graphs, we perform graph convolutions to explicitly inject high-order item affinities into item representations.
arXiv Detail & Related papers (2021-04-19T03:50:24Z)
Sparse-Interest Network for Sequential Recommendation [78.83064567614656]
We propose a novel textbfSparse textbfInterest textbfNEtwork (SINE) for sequential recommendation. Our sparse-interest module can adaptively infer a sparse set of concepts for each user from the large concept pool. SINE can achieve substantial improvement over state-of-the-art methods.
arXiv Detail & Related papers (2021-02-18T11:03:48Z)
Controllable Multi-Interest Framework for Recommendation [64.30030600415654]
We formalize the recommender system as a sequential recommendation problem. We propose a novel controllable multi-interest framework for the sequential recommendation, called ComiRec. Our framework has been successfully deployed on the offline Alibaba distributed cloud platform.
arXiv Detail & Related papers (2020-05-19T10:18:43Z)

This list is automatically generated from the titles and abstracts of the papers in this site.