Towards Agentic Recommender Systems in the Era of Multimodal Large Language Models
- URL: http://arxiv.org/abs/2503.16734v1
- Date: Thu, 20 Mar 2025 22:37:15 GMT
- Title: Towards Agentic Recommender Systems in the Era of Multimodal Large Language Models
- Authors: Chengkai Huang, Junda Wu, Yu Xia, Zixu Yu, Ruhan Wang, Tong Yu, Ruiyi Zhang, Ryan A. Rossi, Branislav Kveton, Dongruo Zhou, Julian McAuley, Lina Yao
- Abstract summary: Recent breakthroughs in Large Language Models (LLMs) have led to the emergence of agentic AI systems. LLM-based Agentic RS (LLM-ARS) can offer more interactive, context-aware, and proactive recommendations.
- Score: 75.4890331763196
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Recent breakthroughs in Large Language Models (LLMs) have led to the emergence of agentic AI systems that extend beyond the capabilities of standalone models. By empowering LLMs to perceive external environments, integrate multimodal information, and interact with various tools, these agentic systems exhibit greater autonomy and adaptability across complex tasks. This evolution brings new opportunities to recommender systems (RS): LLM-based Agentic RS (LLM-ARS) can offer more interactive, context-aware, and proactive recommendations, potentially reshaping the user experience and broadening the application scope of RS. Despite promising early results, fundamental challenges remain, including how to effectively incorporate external knowledge, balance autonomy with controllability, and evaluate performance in dynamic, multimodal settings. In this perspective paper, we first present a systematic analysis of LLM-ARS: (1) clarifying core concepts and architectures; (2) highlighting how agentic capabilities -- such as planning, memory, and multimodal reasoning -- can enhance recommendation quality; and (3) outlining key research questions in areas such as safety, efficiency, and lifelong personalization. We also discuss open problems and future directions, arguing that LLM-ARS will drive the next wave of RS innovation. Ultimately, we foresee a paradigm shift toward intelligent, autonomous, and collaborative recommendation experiences that more closely align with users' evolving needs and complex decision-making processes.
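To make the agentic capabilities highlighted in the abstract (planning, memory, tool use) concrete, the following is a minimal perceive-plan-act recommendation loop. It is an illustrative sketch only, not the paper's implementation; the `llm` callable, the `tools` dictionary, and the memory handling are hypothetical stand-ins.

```python
# Illustrative sketch of an LLM-based agentic recommender loop (LLM-ARS).
# All components here are hypothetical stand-ins, not the paper's system.

from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AgenticRecommender:
    llm: Callable[[str], str]                         # any text-in/text-out LLM
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    memory: List[str] = field(default_factory=list)   # lifelong user context

    def recommend(self, user_request: str) -> str:
        # 1. Perceive: combine the new request with remembered context.
        context = "\n".join(self.memory[-5:])
        # 2. Plan: ask the LLM which tool (if any) to call.
        plan = self.llm(
            f"Context:\n{context}\nRequest: {user_request}\n"
            f"Available tools: {list(self.tools)}\nPlan one tool call:"
        )
        # 3. Act: invoke the named tool to ground the recommendation.
        tool_name = plan.strip().split()[0] if plan.strip() else ""
        evidence = self.tools[tool_name](user_request) if tool_name in self.tools else ""
        # 4. Recommend, then update memory for the next interaction.
        answer = self.llm(f"Evidence: {evidence}\nRecommend items for: {user_request}")
        self.memory.append(f"user: {user_request} -> {answer}")
        return answer


# Usage with stubbed components (a real deployment would plug in an LLM API):
recsys = AgenticRecommender(
    llm=lambda prompt: "search_catalog" if "tool" in prompt else "Try the hiking boots X1.",
    tools={"search_catalog": lambda q: "3 boots matching 'waterproof hiking'"},
)
print(recsys.recommend("waterproof hiking boots under $150"))
```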
Related papers
- A Desideratum for Conversational Agents: Capabilities, Challenges, and Future Directions [51.96890647837277]
Large Language Models (LLMs) have propelled conversational AI from traditional dialogue systems into sophisticated agents capable of autonomous actions, contextual awareness, and multi-turn interactions with users.
This survey paper presents a desideratum for next-generation Conversational Agents - what has been achieved, what challenges persist, and what must be done for more scalable systems that approach human-level intelligence.
arXiv Detail & Related papers (2025-04-07T21:01:25Z) - LLMs Working in Harmony: A Survey on the Technological Aspects of Building Effective LLM-Based Multi Agent Systems [0.0]
This survey investigates foundational technologies essential for developing effective Large Language Model (LLM)-based multi-agent systems.
Aiming to answer how best to optimize these systems for collaborative, dynamic environments, we focus on four critical areas: Architecture, Memory, Planning, and Technologies/Frameworks.
arXiv Detail & Related papers (2025-03-13T06:17:50Z) - Improving Retrospective Language Agents via Joint Policy Gradient Optimization [57.35348425288859]
RetroAct is a framework that jointly optimizes both task-planning and self-reflective evolution capabilities in language agents.
We develop a two-stage joint optimization process that integrates imitation learning and reinforcement learning.
We conduct extensive experiments across various testing environments, demonstrating that RetroAct achieves substantial improvements in task performance and decision-making.
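The two-stage recipe described in this summary (imitation learning on expert traces, then reinforcement learning) can be sketched roughly as follows. This toy uses simple preference updates as a stand-in for actual gradient steps, and the reward shaping is an assumption for illustration, not the RetroAct method.

```python
# Toy two-stage loop in the spirit of imitation learning followed by RL.
# The Policy class and reward model below are hypothetical stand-ins.

import random


class Policy:
    """Toy stochastic policy over two actions: 'act' vs. 'reflect-then-act'."""

    def __init__(self):
        self.pref = {"act": 0.0, "reflect-then-act": 0.0}

    def sample(self):
        if random.random() < 0.1:            # small exploration rate
            return random.choice(list(self.pref))
        return max(self.pref, key=self.pref.get)

    def update(self, action, advantage, lr=0.5):
        self.pref[action] += lr * advantage   # preference update, not a real gradient


def train(policy, expert_traces, episodes=200):
    # Stage 1: imitation learning -- nudge the policy toward expert actions.
    for expert_action in expert_traces:
        policy.update(expert_action, advantage=1.0)

    # Stage 2: reinforcement learning -- reward trajectories where the agent's
    # self-reflection actually improves the outcome (toy reward model).
    for _ in range(episodes):
        action = policy.sample()
        reward = 1.0 if action == "reflect-then-act" else 0.2
        policy.update(action, advantage=reward - 0.5)
    return policy


trained = train(Policy(), expert_traces=["reflect-then-act"] * 10)
print(trained.pref)
```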
arXiv Detail & Related papers (2025-03-03T12:54:54Z) - Beyond Self-Talk: A Communication-Centric Survey of LLM-Based Multi-Agent Systems [11.522282769053817]
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in reasoning, planning, and decision-making.
Researchers have begun incorporating LLMs into multi-agent systems to tackle tasks beyond the scope of single-agent setups.
This survey serves as a catalyst for further innovation, fostering more robust, scalable, and intelligent multi-agent systems.
arXiv Detail & Related papers (2025-02-20T07:18:34Z) - Position: Towards a Responsible LLM-empowered Multi-Agent Systems [22.905804138387854]
The rise of Agent AI and Large Language Model-powered Multi-Agent Systems (LLM-MAS) has underscored the need for responsible and dependable system operation.
These advancements introduce critical challenges: LLM agents exhibit inherent unpredictability, and uncertainties in their outputs can compound, threatening system stability.
To address these risks, a human-centered design approach with active dynamic moderation is essential.
arXiv Detail & Related papers (2025-02-03T16:04:30Z) - RA-BLIP: Multimodal Adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training [55.54020926284334]
Multimodal Large Language Models (MLLMs) have recently received substantial interest, which shows their emerging potential as general-purpose models for various vision-language tasks.
Retrieval augmentation techniques have proven to be effective plugins for both LLMs and MLLMs.
In this study, we propose multimodal adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training (RA-BLIP), a novel retrieval-augmented framework for various MLLMs.
arXiv Detail & Related papers (2024-10-18T03:45:19Z) - IM-RAG: Multi-Round Retrieval-Augmented Generation Through Learning Inner Monologues [10.280113107290067]
The IM-RAG approach integrates Information Retrieval systems with Large Language Models (LLMs) to support multi-round RAG.
The entire IM process is optimized via Reinforcement Learning (RL) where a Progress Tracker is incorporated to provide mid-step rewards.
The results show that our approach achieves state-of-the-art (SOTA) performance while providing high flexibility in integrating IR modules.
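The multi-round loop with mid-step rewards described in this summary can be sketched as below. The retriever, reader, and progress scorer are hypothetical stubs; in the full method such rewards would supervise reinforcement learning, whereas here they only decide when to stop retrieving.

```python
# Minimal sketch of multi-round retrieval with mid-step "progress" rewards,
# loosely inspired by the IM-RAG description above. All components are stubs.

def progress_tracker(question, notes):
    """Toy mid-step reward: fraction of question keywords covered by notes."""
    keywords = set(question.lower().split())
    covered = {w for w in keywords if any(w in n.lower() for n in notes)}
    return len(covered) / max(len(keywords), 1)


def multi_round_rag(question, retrieve, answer, max_rounds=3, threshold=0.8):
    notes, rewards = [], []
    for round_idx in range(max_rounds):
        # Inner monologue: decide what to look up next given current notes.
        query = f"{question} (round {round_idx}, missing context)"
        notes.append(retrieve(query))
        # Mid-step reward tells the loop whether retrieval is making progress.
        rewards.append(progress_tracker(question, notes))
        if rewards[-1] >= threshold:
            break
    return answer(question, notes), rewards


# Usage with stubbed retriever and reader:
final, step_rewards = multi_round_rag(
    "best waterproof hiking boots",
    retrieve=lambda q: "Catalog snippet: waterproof hiking boots, model X1",
    answer=lambda q, notes: f"Answer based on {len(notes)} retrieved snippet(s).",
)
print(final, step_rewards)
```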
arXiv Detail & Related papers (2024-05-15T12:41:20Z) - LLM-Based Multi-Agent Systems for Software Engineering: Literature Review, Vision and the Road Ahead [14.834072370183106]
This paper explores the transformative potential of integrating Large Language Models into Multi-Agent (LMA) systems.
By leveraging the collaborative and specialized abilities of multiple agents, LMA systems enable autonomous problem-solving, improve robustness, and provide scalable solutions for managing the complexity of real-world software projects.
arXiv Detail & Related papers (2024-04-07T07:05:40Z) - ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent [50.508669199496474]
We develop a ReAct-style LLM agent with the ability to reason and act upon external knowledge.
We refine the agent through a ReST-like method that iteratively trains on previous trajectories.
Starting from a prompted large model and after just two iterations of the algorithm, we can produce a fine-tuned small model.
arXiv Detail & Related papers (2023-12-15T18:20:15Z) - Controlling Large Language Model-based Agents for Large-Scale Decision-Making: An Actor-Critic Approach [28.477463632107558]
We develop a modular framework called LLaMAC to mitigate hallucination in Large Language Models and coordination challenges in Multi-Agent Systems.
LLaMAC implements a value distribution encoding similar to that found in the human brain, utilizing internal and external feedback mechanisms to facilitate collaboration and iterative reasoning among its modules.
arXiv Detail & Related papers (2023-11-23T10:14:58Z) - Recommender Systems in the Era of Large Language Models (LLMs) [62.0129013439038]
Large Language Models (LLMs) have revolutionized the fields of Natural Language Processing (NLP) and Artificial Intelligence (AI).
We conduct a comprehensive review of LLM-empowered recommender systems from various aspects including Pre-training, Fine-tuning, and Prompting.
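Of the three aspects this review covers, the "Prompting" flavor is easiest to picture: a user's interaction history is serialized into a prompt and a generic LLM ranks candidate items. The sketch below is an assumption-laden illustration; the `llm` callable stands in for whatever model API is used.

```python
# Quick sketch of prompting-based LLM recommendation. The `llm` callable is a
# hypothetical stub for a real model API; prompt wording is illustrative only.

def prompt_recommend(llm, history, candidates, top_k=3):
    prompt = (
        "The user recently interacted with: " + ", ".join(history) + ".\n"
        "Rank the following candidate items by relevance and return the top "
        f"{top_k}, comma-separated: " + ", ".join(candidates)
    )
    return [item.strip() for item in llm(prompt).split(",")][:top_k]


# Usage with a stubbed LLM that would normally be an API call:
picks = prompt_recommend(
    llm=lambda p: "trail runners, wool socks, trekking poles",
    history=["waterproof hiking boots", "camping stove"],
    candidates=["trail runners", "office chair", "wool socks", "trekking poles"],
)
print(picks)  # ['trail runners', 'wool socks', 'trekking poles']
```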
arXiv Detail & Related papers (2023-07-05T06:03:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.