Related papers: Human-Centered LLM-Agent User Interface: A Position Paper

URL: http://arxiv.org/abs/2405.13050v2
Date: Mon, 23 Sep 2024 16:41:04 GMT
Title: Human-Centered LLM-Agent User Interface: A Position Paper
Authors: Daniel Chin, Yuxuan Wang, Gus Xia
Abstract summary: Large Language Model (LLM) -in-the-loop applications have been shown to effectively interpret the human user's commands. A user mostly ignorant to the underlying tools/systems should be able to work with a LAUI to discover an emergent workflow.
Score: 8.675534401018407
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Large Language Model (LLM) -in-the-loop applications have been shown to effectively interpret the human user's commands, make plans, and operate external tools/systems accordingly. Still, the operation scope of the LLM agent is limited to passively following the user, requiring the user to frame his/her needs with regard to the underlying tools/systems. We note that the potential of an LLM-Agent User Interface (LAUI) is much greater. A user mostly ignorant to the underlying tools/systems should be able to work with a LAUI to discover an emergent workflow. Contrary to the conventional way of designing an explorable GUI to teach the user a predefined set of ways to use the system, in the ideal LAUI, the LLM agent is initialized to be proficient with the system, proactively studies the user and his/her needs, and proposes new interaction schemes to the user. To illustrate LAUI, we present Flute X GPT, a concrete example using an LLM agent, a prompt manager, and a flute-tutoring multi-modal software-hardware system to facilitate the complex, real-time user experience of learning to play the flute.

Related papers

Creating General User Models from Computer Use [62.91116265732001]
This paper presents an architecture for a general user model (GUM) that learns about you by observing any interaction you have with your computer. The GUM takes as input any unstructured observation of a user (e.g., device screenshots) and constructs confidence-weighted propositions that capture user knowledge and preferences.
arXiv Detail & Related papers (2025-05-16T04:00:31Z)
Towards Machine-Generated Code for the Resolution of User Intentions [2.762180345826837]
We investigate the feasibility of generating and collaborating through code generation that results from prompting an LLM with a concrete user intention. We provide in-depth analysis and comparison of various user intentions, the resulting code, and its execution. The employed LLM, GPT-4o-mini, exhibits remarkable proficiency in the generation of code in accordance with provided user intentions.
arXiv Detail & Related papers (2025-04-24T13:19:17Z)
ScreenLLM: Stateful Screen Schema for Efficient Action Understanding and Prediction [15.220300812671494]
We introduce ScreenLLM, a set of multimodal large language models (MLLMs) tailored for advanced UI understanding and action prediction. Our work lays the foundation for scalable, robust, and intelligent GUI agents that enhance user interaction in diverse software environments.
arXiv Detail & Related papers (2025-03-26T20:41:24Z)
Learning to Ask: When LLMs Meet Unclear Instruction [49.256630152684764]
Large language models (LLMs) can leverage external tools for addressing a range of tasks unattainable through language skills alone. We evaluate the performance of LLMs' tool use under imperfect instructions, analyze the error patterns, and build a challenging tool-use benchmark called Noisy ToolBench. We propose a novel framework, Ask-when-Needed (AwN), which prompts LLMs to ask questions to users whenever they encounter obstacles due to unclear instructions.
arXiv Detail & Related papers (2024-08-31T23:06:12Z)
Let Me Do It For You: Towards LLM Empowered Recommendation via Tool Learning [57.523454568002144]
Large language models (LLMs) have shown capabilities in commonsense reasoning and leveraging external tools. We introduce ToolRec, a framework for LLM-empowered recommendations via tool learning. We formulate the recommendation process as a process aimed at exploring user interests in attribute granularity. We consider two types of attribute-oriented tools: rank tools and retrieval tools.
arXiv Detail & Related papers (2024-05-24T00:06:54Z)
User-LLM: Efficient LLM Contextualization with User Embeddings [23.226164112909643]
User-LLM is a novel framework that leverages user embeddings to directly contextualize large language models with user history interactions. Our approach achieves significant efficiency gains by representing user timelines directly as embeddings, leading to substantial inference speedups of up to 78.1X.
arXiv Detail & Related papers (2024-02-21T08:03:27Z)
LLMCheckup: Conversational Examination of Large Language Models via Interpretability Tools and Self-Explanations [26.340786701393768]
Interpretability tools that offer explanations in the form of a dialogue have demonstrated their efficacy in enhancing users' understanding. Current solutions for dialogue-based explanations, however, often require external tools and modules and are not easily transferable to tasks they were not designed for. We present an easily accessible tool that allows users to chat with any state-of-the-art large language model (LLM) about its behavior.
arXiv Detail & Related papers (2024-01-23T09:11:07Z)
MLLM-Tool: A Multimodal Large Language Model For Tool Agent Learning [38.610185966889226]
We propose MLLM-Tool, a system incorporating open-source large language models and multi-modal encoders. The learnt LLMs can be conscious of multi-modal input instruction and then select the function-matched tool correctly. Experiments reveal that our MLLM-Tool is capable of recommending appropriate tools for multi-modal instructions.
arXiv Detail & Related papers (2024-01-19T14:44:37Z)
Small LLMs Are Weak Tool Learners: A Multi-LLM Agent [73.54562551341454]
Large Language Model (LLM) agents significantly extend the capabilities of standalone LLMs. We propose a novel approach that decomposes the aforementioned capabilities into a planner, caller, and summarizer. This modular framework facilitates individual updates and the potential use of smaller LLMs for building each capability.
arXiv Detail & Related papers (2024-01-14T16:17:07Z)
EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction [56.02100384015907]
EasyTool is a framework transforming diverse and lengthy tool documentation into a unified and concise tool instruction. It can significantly reduce token consumption and improve the performance of tool utilization in real-world scenarios.
arXiv Detail & Related papers (2024-01-11T15:45:11Z)
Recommender AI Agent: Integrating Large Language Models for Interactive Recommendations [53.76682562935373]
We introduce an efficient framework called InteRecAgent, which employs LLMs as the brain and recommender models as tools. InteRecAgent achieves satisfying performance as a conversational recommender system, outperforming general-purpose LLMs.
arXiv Detail & Related papers (2023-08-31T07:36:44Z)
Low-code LLM: Graphical User Interface over Large Language Models [115.08718239772107]
This paper introduces a novel human-LLM interaction framework, Low-code LLM. It incorporates six types of simple low-code visual programming interactions to achieve more controllable and stable responses. We highlight three advantages of the low-code LLM: user-friendly interaction, controllable generation, and wide applicability.
arXiv Detail & Related papers (2023-04-17T09:27:40Z)

This list is automatically generated from the titles and abstracts of the papers on this site.