TalkPlay-Tools: Conversational Music Recommendation with LLM Tool Calling
- URL: http://arxiv.org/abs/2510.01698v3
- Date: Wed, 08 Oct 2025 05:49:57 GMT
- Title: TalkPlay-Tools: Conversational Music Recommendation with LLM Tool Calling
- Authors: Seungheon Doh, Keunwoo Choi, Juhan Nam
- Abstract summary: We propose a music recommendation system with tool calling to serve as a unified retrieval-reranking pipeline. Our system positions an LLM as an end-to-end recommendation system that interprets user intent. We demonstrate that this unified tool-calling framework achieves competitive performance across diverse recommendation scenarios.
- Score: 20.889365999166813
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: While the recent developments in large language models (LLMs) have successfully enabled generative recommenders with natural language interactions, their recommendation behavior is limited, leaving other simpler yet crucial components such as metadata or attribute filtering underutilized in the system. We propose an LLM-based music recommendation system with tool calling to serve as a unified retrieval-reranking pipeline. Our system positions an LLM as an end-to-end recommendation system that interprets user intent, plans tool invocations, and orchestrates specialized components: boolean filters (SQL), sparse retrieval (BM25), dense retrieval (embedding similarity), and generative retrieval (semantic IDs). Through tool planning, the system predicts which types of tools to use, their execution order, and the arguments needed to find music matching user preferences, supporting diverse modalities while seamlessly integrating multiple database filtering methods. We demonstrate that this unified tool-calling framework achieves competitive performance across diverse recommendation scenarios by selectively employing appropriate retrieval methods based on user queries, envisioning a new paradigm for conversational music recommendation systems.
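The tool-planning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy catalog, the tool names (`sql_filter`, `bm25`, `dense`), the plan format, and the hand-made two-dimensional vectors are all assumptions made for illustration. A hard-coded plan stands in for the LLM's output, and the generative (semantic ID) tool is left out.

```python
import math
import sqlite3
from collections import Counter

# Toy catalog; every track, tag, and vector here is made up for illustration.
TRACKS = [
    {"id": 1, "artist": "Nova", "year": 2019, "tags": "synthwave night driving", "vec": [0.9, 0.1]},
    {"id": 2, "artist": "Lark", "year": 2021, "tags": "acoustic calm morning", "vec": [0.1, 0.9]},
    {"id": 3, "artist": "Nova", "year": 2022, "tags": "synthwave rain night", "vec": [0.8, 0.3]},
    {"id": 4, "artist": "Lark", "year": 2016, "tags": "ambient calm", "vec": [0.2, 0.8]},
]
BY_ID = {t["id"]: t for t in TRACKS}


def sql_filter(candidates, where, params=()):
    """Boolean filter: keep candidates whose metadata row matches a WHERE clause."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tracks (id INTEGER, artist TEXT, year INTEGER)")
    con.executemany("INSERT INTO tracks VALUES (?, ?, ?)",
                    [(t["id"], t["artist"], t["year"]) for t in TRACKS])
    # Toy only: a real system should not splice untrusted text into SQL.
    keep = {row[0] for row in con.execute(f"SELECT id FROM tracks WHERE {where}", params)}
    con.close()
    return [c for c in candidates if c in keep]


def bm25(candidates, query, k=3, k1=1.5, b=0.75):
    """Sparse retrieval: rank candidates by BM25 score of the query against tags."""
    docs = {c: BY_ID[c]["tags"].split() for c in candidates}
    if not docs:
        return []
    n = len(docs)
    avgdl = sum(len(d) for d in docs.values()) / n
    scores = {}
    for c, doc in docs.items():
        tf = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            df = sum(1 for d in docs.values() if term in d)
            idf = math.log((n - df + 0.5) / (df + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[c] = score
    return sorted(candidates, key=lambda c: -scores[c])[:k]


def dense(candidates, query_vec, k=3):
    """Dense retrieval: rank candidates by cosine similarity to a query vector."""
    def cos(a, v):
        dot = sum(x * y for x, y in zip(a, v))
        return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in v)))
    return sorted(candidates, key=lambda c: -cos(BY_ID[c]["vec"], query_vec))[:k]


TOOLS = {"sql": sql_filter, "bm25": bm25, "dense": dense}


def run_plan(plan):
    """Execute tool calls in order; each tool narrows or reranks the candidate ids."""
    candidates = [t["id"] for t in TRACKS]
    for step in plan:
        candidates = TOOLS[step["tool"]](candidates, **step["args"])
    return candidates


# Hypothetical plan an LLM might emit for "recent synthwave for a night drive".
plan = [
    {"tool": "sql", "args": {"where": "year >= ?", "params": (2018,)}},
    {"tool": "bm25", "args": {"query": "synthwave night", "k": 2}},
    {"tool": "dense", "args": {"query_vec": [1.0, 0.0], "k": 1}},
]
print(run_plan(plan))
```

In this sketch each tool takes a list of candidate ids and returns a narrowed or reordered list, so the plan's order determines which step filters and which reranks. Whether the actual system chains tools this way is not stated in the abstract.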
Related papers
- WeMusic-Agent: Efficient Conversational Music Recommendation via Knowledge Internalization and Agentic Boundary Learning [12.737364415781805]
This paper proposes WeMusic-Agent, a training framework for efficient conversational music recommendation. We present WeMusic-Agent-M1, an agentic model that internalizes extensive musical knowledge via continued pretraining on 50B music-related corpus. We also construct a benchmark for personalized music recommendations derived from real-world data in WeChat Listen.
arXiv Detail & Related papers (2025-12-18T02:59:19Z)
- TALKPLAY: Multimodal Music Recommendation with Large Language Models [6.830154140450626]
We present TALKPLAY, a novel multimodal music recommendation system that reformulates recommendation as a token generation problem using large language models (LLMs). Our system effectively recommends music from diverse user queries while generating contextually relevant responses. Our qualitative and quantitative evaluation demonstrates that TALKPLAY significantly outperforms unimodal approaches based solely on text or listening history in both recommendation performance and conversational naturalness.
arXiv Detail & Related papers (2025-02-19T13:28:20Z)
- Improving Tool Retrieval by Leveraging Large Language Models for Query Generation [16.7926347207647]
In-context learning can provide a short list of relevant tools in the prompt. We propose leveraging Large Language Models (LLMs) to generate a retrieval query. The generated query is embedded and used to find the most relevant tools via a nearest-neighbor search.
arXiv Detail & Related papers (2024-11-17T03:02:09Z)
- Laser: Parameter-Efficient LLM Bi-Tuning for Sequential Recommendation with Collaborative Information [76.62949982303532]
We propose a parameter-efficient Large Language Model Bi-Tuning framework for sequential recommendation with collaborative information (Laser).
In our Laser, the prefix is utilized to incorporate user-item collaborative information and adapt the LLM to the recommendation task, while the suffix converts the output embeddings of the LLM from the language space to the recommendation space for the follow-up item recommendation.
M-Former is a lightweight MoE-based querying transformer that uses a set of query experts to integrate diverse user-specific collaborative information encoded by frozen ID-based sequential recommender systems.
arXiv Detail & Related papers (2024-09-03T04:55:03Z)
- Let Me Do It For You: Towards LLM Empowered Recommendation via Tool Learning [57.523454568002144]
Large language models (LLMs) have shown capabilities in commonsense reasoning and leveraging external tools.
We introduce ToolRec, a framework for LLM-empowered recommendations via tool learning.
We formulate the recommendation process as a process aimed at exploring user interests in attribute granularity.
We consider two types of attribute-oriented tools: rank tools and retrieval tools.
arXiv Detail & Related papers (2024-05-24T00:06:54Z)
- MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use [79.87054552116443]
Large language models (LLMs) have garnered significant attention due to their impressive natural language processing (NLP) capabilities. We introduce MetaTool, a benchmark designed to evaluate whether LLMs have tool usage awareness and can correctly choose tools. We conduct experiments involving eight popular LLMs and find that the majority of them still struggle to effectively select tools.
arXiv Detail & Related papers (2023-10-04T19:39:26Z)
- Recommender AI Agent: Integrating Large Language Models for Interactive Recommendations [53.76682562935373]
We introduce an efficient framework called InteRecAgent, which employs LLMs as the brain and recommender models as tools.
InteRecAgent achieves satisfying performance as a conversational recommender system, outperforming general-purpose LLMs.
arXiv Detail & Related papers (2023-08-31T07:36:44Z)
- Recommender Systems with Generative Retrieval [58.454606442670034]
We propose a novel generative retrieval approach, where the retrieval model autoregressively decodes the identifiers of the target candidates.
To that end, we create a semantically meaningful tuple of codewords to serve as a Semantic ID for each item.
We show that recommender systems trained with the proposed paradigm significantly outperform the current SOTA models on various datasets.
arXiv Detail & Related papers (2023-05-08T21:48:17Z)
- Beyond Single Items: Exploring User Preferences in Item Sets with the Conversational Playlist Curation Dataset [20.42354123651454]
We call this task conversational item set curation.
We present a novel data collection methodology that efficiently collects realistic preferences about item sets in a conversational setting.
We show that it leads raters to express preferences that would not be otherwise expressed.
arXiv Detail & Related papers (2023-03-13T00:39:04Z)
- Talk the Walk: Synthetic Data Generation for Conversational Music Recommendation [62.019437228000776]
We present TalkWalk, which generates realistic high-quality conversational data by leveraging encoded expertise in widely available item collections.
We generate over one million diverse conversations in a human-collected dataset.
arXiv Detail & Related papers (2023-01-27T01:54:16Z)