BloomIntent: Automating Search Evaluation with LLM-Generated Fine-Grained User Intents
- URL: http://arxiv.org/abs/2509.18641v1
- Date: Tue, 23 Sep 2025 04:56:06 GMT
- Title: BloomIntent: Automating Search Evaluation with LLM-Generated Fine-Grained User Intents
- Authors: Yoonseo Choi, Eunhye Kim, Hyunwoo Kim, Donghyun Park, Honggu Lee, Jinyoung Kim, Juho Kim
- Abstract summary: BloomIntent is a user-centric search evaluation method that uses user intents as the evaluation unit. We show that BloomIntent generated fine-grained, evaluable intents and produced scalable assessments of intent-level satisfaction.
- Score: 21.802731368326132
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: If 100 people issue the same search query, they may have 100 different goals. While existing work on user-centric AI evaluation highlights the importance of aligning systems with fine-grained user intents, current search evaluation methods struggle to represent and assess this diversity. We introduce BloomIntent, a user-centric search evaluation method that uses user intents as the evaluation unit. BloomIntent first generates a set of plausible, fine-grained search intents grounded on taxonomies of user attributes and information-seeking intent types. Then, BloomIntent provides an automated evaluation of search results against each intent powered by large language models. To support practical analysis, BloomIntent clusters semantically similar intents and summarizes evaluation outcomes in a structured interface. With three technical evaluations, we showed that BloomIntent generated fine-grained, evaluable, and realistic intents and produced scalable assessments of intent-level satisfaction that achieved 72% agreement with expert evaluators. In a case study (N=4), we showed that BloomIntent supported search specialists in identifying intents for ambiguous queries, uncovering underserved user needs, and discovering actionable insights for improving search experiences. By shifting from query-level to intent-level evaluation, BloomIntent reimagines how search systems can be assessed -- not only for performance but for their ability to serve a multitude of user goals.
Related papers
- DiscoverLLM: From Executing Intents to Discovering Them [30.142994019166796]
We introduce DiscoverLLM, a framework that trains Large Language Models to help users form and discover intents. Resulting models learn to collaborate with users by adaptively diverging (i.e., explore options) when intents are unclear. In a user study with 75 human participants, DiscoverLLM improved conversation satisfaction and efficiency compared to baselines.
arXiv Detail & Related papers (2026-02-03T11:51:46Z) - QUIDS: Query Intent Description for Exploratory Search via Dual Space Modeling [28.960030628126137]
In exploratory search, users often submit vague queries to investigate unfamiliar topics. This leads to a self-reinforcing cycle of mismatched results and trial-and-error reformulation. We propose QUIDS, a method that generates user-facing natural language query intent descriptions.
arXiv Detail & Related papers (2024-10-16T09:28:58Z) - IntentRec: Predicting User Session Intent with Hierarchical Multi-Task Learning [2.209382468269059]
We introduce IntentRec, a novel recommendation framework based on hierarchical multi-task neural network architecture. By directly leveraging the intent prediction, we can offer accurate and personalized recommendations to users. Our comprehensive experiments on Netflix user engagement data show that IntentRec outperforms the state-of-the-art next-item and next-intent predictors.
arXiv Detail & Related papers (2024-07-25T22:58:59Z) - Rethinking the Evaluation of Dialogue Systems: Effects of User Feedback on Crowdworkers and LLMs [57.16442740983528]
In ad-hoc retrieval, evaluation relies heavily on user actions, including implicit feedback.
The role of user feedback in annotators' assessment of turns in a conversational perception has been little studied.
We focus on how the evaluation of task-oriented dialogue systems (TDSs) is affected by considering user feedback, explicit or implicit, as provided through the follow-up utterance of a turn being evaluated.
arXiv Detail & Related papers (2024-04-19T16:45:50Z) - Hierarchical Indexing for Retrieval-Augmented Opinion Summarization [60.5923941324953]
We propose a method for unsupervised abstractive opinion summarization that combines the attributability and scalability of extractive approaches with the coherence and fluency of Large Language Models (LLMs).
Our method, HIRO, learns an index structure that maps sentences to a path through a semantically organized discrete hierarchy.
At inference time, we populate the index and use it to identify and retrieve clusters of sentences containing popular opinions from input reviews.
arXiv Detail & Related papers (2024-03-01T10:38:07Z) - I3: Intent-Introspective Retrieval Conditioned on Instructions [83.91776238599824]
I3 is a unified retrieval system that performs Intent-Introspective retrieval across various tasks conditioned on Instructions without task-specific training.
I3 incorporates a pluggable introspector in a parameter-isolated manner to comprehend specific retrieval intents.
It utilizes extensive LLM-generated data to train I3 phase-by-phase, embodying two key designs: progressive structure pruning and drawback-based data refinement.
arXiv Detail & Related papers (2023-08-19T14:17:57Z) - Discovering New Intents Using Latent Variables [51.50374666602328]
We propose a probabilistic framework for discovering intents where intent assignments are treated as latent variables.
In the E-step, we discover intents and explore the intrinsic structure of unlabeled data using the posterior of intent assignments.
In the M-step, we alleviate the forgetting of prior knowledge transferred from known intents by optimizing the discrimination of labeled data.
arXiv Detail & Related papers (2022-10-21T08:29:45Z) - Saliency Cards: A Framework to Characterize and Compare Saliency Methods [34.38335172204263]
Saliency methods calculate how important each input feature is to a model's output.
Existing approaches assume universal desiderata for saliency methods that do not account for diverse user needs.
We introduce saliency cards: structured documentation of how saliency methods operate and their performance.
arXiv Detail & Related papers (2022-06-07T01:21:49Z) - Deep Search Query Intent Understanding [17.79430887321982]
This paper aims to provide a comprehensive learning framework for modeling query intent under different stages of a search.
We focus on the design for 1) predicting users' intents as they type in queries on-the-fly in typeahead search using character-level models; and 2) accurate word-level intent prediction models for complete queries.
arXiv Detail & Related papers (2020-08-15T18:19:56Z) - Query Intent Detection from the SEO Perspective [0.34376560669160383]
We aim to identify the user query's intent by taking advantage of Google results and machine learning methods.
A list of keywords extracted from the clustered queries is used to identify the intent of a new given query.
arXiv Detail & Related papers (2020-06-16T13:08:29Z) - Learning to Rank Intents in Voice Assistants [2.102846336724103]
We propose a novel Energy-based model for the intent ranking task.
We show our approach outperforms existing state-of-the-art methods by reducing the error rate by 3.8%.
We also evaluate the robustness of our algorithm on the intent ranking task and show our algorithm improves the robustness by 33.3%.
arXiv Detail & Related papers (2020-04-30T21:51:26Z) - IART: Intent-aware Response Ranking with Transformers in Information-seeking Conversation Systems [80.0781718687327]
We analyze user intent patterns in information-seeking conversations and propose an intent-aware neural response ranking model, "IART".
IART is built on top of the integration of user intent modeling and language representation learning with the Transformer architecture.
arXiv Detail & Related papers (2020-02-03T05:59:52Z) - Deep Learning for Person Re-identification: A Survey and Outlook [233.36948173686602]
Person re-identification (Re-ID) aims at retrieving a person of interest across multiple non-overlapping cameras.
By dissecting the involved components in developing a person Re-ID system, we categorize it into the closed-world and open-world settings.
arXiv Detail & Related papers (2020-01-13T12:49:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.