Understand User Opinions of Large Language Models via LLM-Powered In-the-Moment User Experience Interviews
- URL: http://arxiv.org/abs/2502.15226v1
- Date: Fri, 21 Feb 2025 05:42:22 GMT
- Title: Understand User Opinions of Large Language Models via LLM-Powered In-the-Moment User Experience Interviews
- Authors: Mengqiao Liu, Tevin Wang, Cassandra A. Cohen, Sarah Li, Chenyan Xiong
- Abstract summary: This paper presents CLUE, an LLM-powered interviewer that conducts in-the-moment user experience interviews. We conduct a study with thousands of users to understand user opinions on mainstream LLMs. Our experiments demonstrate that CLUE captures interesting user opinions, for example, the bipolar views on the displayed reasoning process of DeepSeek-R1.
- Score: 21.600423558370533
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Which large language model (LLM) is better? Every evaluation tells a story, but what do users really think about current LLMs? This paper presents CLUE, an LLM-powered interviewer that conducts in-the-moment user experience interviews right after users interact with LLMs, and automatically gathers insights about user opinions from massive interview logs. We conduct a study with thousands of users to understand user opinions on mainstream LLMs, recruiting users to first chat with a target LLM and then be interviewed by CLUE. Our experiments demonstrate that CLUE captures interesting user opinions, for example, the bipolar views on the displayed reasoning process of DeepSeek-R1 and demands for information freshness and multi-modality. Our collected chat-and-interview logs will be released.
Related papers
- Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale [51.9706400130481]
Large Language Models (LLMs) have emerged as personalized assistants for users across a wide range of tasks.
PERSONAMEM features curated user profiles with over 180 simulated user-LLM interaction histories.
We evaluate LLM chatbots' ability to identify the most suitable response according to the current state of the user's profile.
arXiv Detail & Related papers (2025-04-19T08:16:10Z)
- GuideLLM: Exploring LLM-Guided Conversation with Applications in Autobiography Interviewing [73.8469700907927]
Large Language Models (LLMs) succeed in human-guided conversations such as instruction following and question answering. In this study, we first characterize LLM-guided conversation into three fundamental components: (i) Goal Navigation; (ii) Context Management; (iii) Empathetic Engagement. We compare GuideLLM with 6 state-of-the-art LLMs such as GPT-4o and Llama-3-70b-Instruct, from the perspective of interviewing quality and autobiography generation quality.
arXiv Detail & Related papers (2025-02-10T14:11:32Z)
- LLM-as-an-Interviewer: Beyond Static Testing Through Dynamic LLM Evaluation [24.103034843158717]
We introduce LLM-as-an-Interviewer, a novel paradigm for evaluating large language models (LLMs). This approach leverages multi-turn interactions where the interviewer actively provides feedback on responses and poses follow-up questions to the evaluated LLM. We apply the framework to evaluate six models on the MATH and DepthQA tasks.
arXiv Detail & Related papers (2024-12-10T15:00:32Z)
- NewsInterview: a Dataset and a Playground to Evaluate LLMs' Ground Gap via Informational Interviews [65.35458530702442]
We focus on journalistic interviews, a domain rich in grounding communication and abundant in data.
We curate a dataset of 40,000 two-person informational interviews from NPR and CNN.
LLMs are significantly less likely than human interviewers to use acknowledgements and to pivot to higher-level questions.
arXiv Detail & Related papers (2024-11-21T01:37:38Z)
- Two Tales of Persona in LLMs: A Survey of Role-Playing and Personalization [33.513689684998035]
The concept of persona, originally adopted in dialogue literature, has re-surged as a promising framework for tailoring large language models to specific contexts.
To close the gap, we present a comprehensive survey to categorize the current state of the field.
arXiv Detail & Related papers (2024-06-03T10:08:23Z)
- Large Language Models: A Survey [69.72787936480394]
Large Language Models (LLMs) have drawn a lot of attention due to their strong performance on a wide range of natural language tasks.
LLMs' ability for general-purpose language understanding and generation is acquired by training billions of model parameters on massive amounts of text data.
arXiv Detail & Related papers (2024-02-09T05:37:09Z)
- Aligning Language Models to User Opinions [10.953326025836475]
We find that the opinions of a user and their demographics and ideologies are not mutual predictors.
We use this insight to align LLMs by modeling both user opinions as well as user demographics and ideology.
In addition to the typical approach of prompting LLMs with demographics and ideology, we discover that utilizing the most relevant past opinions from individual users enables the model to predict user opinions more accurately.
arXiv Detail & Related papers (2023-05-24T09:11:11Z)
- Low-code LLM: Graphical User Interface over Large Language Models [115.08718239772107]
This paper introduces a novel human-LLM interaction framework, Low-code LLM.
It incorporates six types of simple low-code visual programming interactions to achieve more controllable and stable responses.
We highlight three advantages of the low-code LLM: user-friendly interaction, controllable generation, and wide applicability.
arXiv Detail & Related papers (2023-04-17T09:27:40Z)
- Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis [103.89753784762445]
Large language models (LLMs) have demonstrated remarkable potential in handling multilingual machine translation (MMT).
This paper systematically investigates the advantages and challenges of LLMs for MMT.
We thoroughly evaluate eight popular LLMs, including ChatGPT and GPT-4.
arXiv Detail & Related papers (2023-04-10T15:51:30Z)
- Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback [127.75419038610455]
Large language models (LLMs) are able to generate human-like, fluent responses for many downstream tasks.
This paper proposes an LLM-Augmenter system, which augments a black-box LLM with a set of plug-and-play modules.
arXiv Detail & Related papers (2023-02-24T18:48:43Z)