AgenticTagger: Structured Item Representation for Recommendation with LLM Agents
- URL: http://arxiv.org/abs/2602.05945v1
- Date: Thu, 05 Feb 2026 18:01:37 GMT
- Title: AgenticTagger: Structured Item Representation for Recommendation with LLM Agents
- Authors: Zhouhang Xie, Bo Peng, Zhankui He, Ziqi Chen, Alice Han, Isabella Ye, Benjamin Coleman, Noveen Sachdeva, Fernando Pereira, Julian McAuley, Wang-Cheng Kang, Derek Zhiyuan Cheng, Beidou Wang, Randolph Brown,
- Abstract summary: AgenticTagger is a framework that queries LLMs for representing items with sequences of text descriptors.<n>To effectively and efficiently ground vocabulary in the item corpus of interest, we design a multi-agent reflection mechanism.<n>Experiments on public and private data show AgenticTagger brings consistent improvements across diverse recommendation scenarios.
- Score: 58.12004213978182
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: High-quality representations are a core requirement for effective recommendation. In this work, we study the problem of LLM-based descriptor generation, i.e., keyphrase-like natural language item representation generation frameworks with minimal constraints on downstream applications. We propose AgenticTagger, a framework that queries LLMs for representing items with sequences of text descriptors. However, open-ended generation provides little control over the generation space, leading to high cardinality, low-performance descriptors that renders downstream modeling challenging. To this end, AgenticTagger features two core stages: (1) a vocabulary building stage where a set of hierarchical, low-cardinality, and high-quality descriptors is identified, and (2) a vocabulary assignment stage where LLMs assign in-vocabulary descriptors to items. To effectively and efficiently ground vocabulary in the item corpus of interest, we design a multi-agent reflection mechanism where an architect LLM iteratively refines the vocabulary guided by parallelized feedback from annotator LLMs that validates the vocabulary against item data. Experiments on public and private data show AgenticTagger brings consistent improvements across diverse recommendation scenarios, including generative and term-based retrieval, ranking, and controllability-oriented, critique-based recommendation.
Related papers
- Unleashing the Native Recommendation Potential: LLM-Based Generative Recommendation via Structured Term Identifiers [51.64398574262054]
This paper introduces Term IDs (TIDs), defined as a set of semantically rich and standardized textual keywords, to serve as robust item identifiers.<n>We propose GRLM, a novel framework centered on TIDs, to convert item's metadata into standardized TIDs and utilize Integrative Instruction Fine-tuning to collaboratively optimize term internalization and sequential recommendation.
arXiv Detail & Related papers (2026-01-11T07:53:20Z) - LLMs as Sparse Retrievers:A Framework for First-Stage Product Search [103.70006474544364]
Product search is a crucial component of modern e-commerce platforms, with billions of user queries every day.<n>Sparse retrieval methods suffer from severe vocabulary mismatch issues, leading to suboptimal performance in product search scenarios.<n>With their potential for semantic analysis, large language models (LLMs) offer a promising avenue for mitigating vocabulary mismatch issues.<n>We propose PROSPER, a framework for PROduct search leveraging LLMs as SParsE Retrievers.
arXiv Detail & Related papers (2025-10-21T11:13:21Z) - Beyond Accuracy: Metrics that Uncover What Makes a 'Good' Visual Descriptor [4.76296755805531]
Descriptors are used in visual concept discovery and image classification with vision-based models (VLMs)<n>We systematically analyze descriptor quality along two key dimensions: (1) representational capacity, and (2) relationship with VLM pre-training data.<n>Motivated by ideas from representation alignment and language understanding, we introduce two alignment-based metrics--Global Alignment and CLIP Similarity--that move beyond accuracy.
arXiv Detail & Related papers (2025-07-04T12:50:04Z) - Unleashing the Power of LLMs in Dense Retrieval with Query Likelihood Modeling [69.84963245729826]
We propose an auxiliary task of QL to enhance the backbone for subsequent contrastive learning of the retriever.<n>We introduce our model, which incorporates two key components: Attention Block (AB) and Document Corruption (DC)
arXiv Detail & Related papers (2025-04-07T16:03:59Z) - Text Clustering as Classification with LLMs [9.128151647718251]
We propose a novel framework that reframes text clustering as a classification task by harnessing the in-context learning capabilities of Large Language Models.<n>By leveraging the advanced natural language understanding and generalization capabilities of LLMs, the proposed approach enables effective clustering with minimal human intervention.<n> Experimental results on diverse datasets demonstrate that our framework achieves comparable or superior performance to state-of-the-art embedding-based clustering techniques.
arXiv Detail & Related papers (2024-09-30T16:57:34Z) - Making Large Language Models A Better Foundation For Dense Retrieval [19.38740248464456]
Dense retrieval needs to learn discriminative text embeddings to represent the semantic relationship between query and document.
It may benefit from the using of large language models (LLMs), given LLMs' strong capability on semantic understanding.
We propose LLaRA (LLM adapted for dense RetrievAl), which works as a post-hoc adaptation of dense retrieval application.
arXiv Detail & Related papers (2023-12-24T15:10:35Z) - Generative Context-aware Fine-tuning of Self-supervised Speech Models [54.389711404209415]
We study the use of generative large language models (LLM) generated context information.
We propose an approach to distill the generated information during fine-tuning of self-supervised speech models.
We evaluate the proposed approach using the SLUE and Libri-light benchmarks for several downstream tasks: automatic speech recognition, named entity recognition, and sentiment analysis.
arXiv Detail & Related papers (2023-12-15T15:46:02Z) - LLMs as Visual Explainers: Advancing Image Classification with Evolving
Visual Descriptions [13.546494268784757]
We propose a framework that integrates large language models (LLMs) and vision-language models (VLMs) to find the optimal class descriptors.
Our training-free approach develops an LLM-based agent with an evolutionary optimization strategy to iteratively refine class descriptors.
arXiv Detail & Related papers (2023-11-20T16:37:45Z) - Guiding Large Language Models via Directional Stimulus Prompting [114.84930073977672]
We introduce Directional Stimulus Prompting, a novel framework for guiding black-box large language models (LLMs) toward specific desired outputs.
Instead of directly adjusting LLMs, our method employs a small tunable policy model to generate an auxiliary directional stimulus prompt for each input instance.
arXiv Detail & Related papers (2023-02-22T17:44:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.