Supporting Sensemaking of Large Language Model Outputs at Scale
- URL: http://arxiv.org/abs/2401.13726v1
- Date: Wed, 24 Jan 2024 18:45:34 GMT
- Title: Supporting Sensemaking of Large Language Model Outputs at Scale
- Authors: Katy Ilonka Gero, Chelse Swoopes, Ziwei Gu, Jonathan K. Kummerfeld,
Elena L. Glassman
- Abstract summary: Large language models (LLMs) are capable of generating multiple responses to a single prompt.
We design five features, which include both pre-existing and novel methods for computing similarities and differences across textual documents.
We find that the features support a wide variety of sensemaking tasks and even make tasks previously considered to be too difficult by our participants now tractable.
- Score: 21.763460834412776
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) are capable of generating multiple responses to
a single prompt, yet little effort has been expended to help end-users or
system designers make use of this capability. In this paper, we explore how to
present many LLM responses at once. We design five features, which include both
pre-existing and novel methods for computing similarities and differences
across textual documents, as well as how to render their outputs. We report on
a controlled user study (n=24) and eight case studies evaluating these features
and how they support users in different tasks. We find that the features
support a wide variety of sensemaking tasks and even make tasks previously
considered to be too difficult by our participants now tractable. Finally, we
present design guidelines to inform future explorations of new LLM interfaces.
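To make the comparison idea concrete, the sketch below shows one simple way to compute pairwise similarities across many responses to a single prompt. TF-IDF cosine similarity is an illustrative stand-in chosen for this example; the abstract does not detail the paper's five features, so none of this should be read as the authors' actual method.

```python
# A minimal sketch of one way to compare many LLM responses at once.
# TF-IDF cosine similarity is an illustrative stand-in, not the
# paper's actual feature set.
from itertools import combinations

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

responses = [
    "The capital of Australia is Canberra.",
    "Canberra is Australia's capital city.",
    "Australia's capital is Sydney.",  # an outlier worth surfacing
]

# Embed all responses in a shared TF-IDF space.
matrix = TfidfVectorizer().fit_transform(responses)
sims = cosine_similarity(matrix)

# Rank pairs from most to least similar to spot clusters and outliers.
pairs = sorted(
    combinations(range(len(responses)), 2),
    key=lambda p: sims[p[0], p[1]],
    reverse=True,
)
for i, j in pairs:
    print(f"responses {i} and {j}: similarity {sims[i, j]:.2f}")
```

Ranking pairs this way surfaces near-duplicate responses and outliers, which is one kind of sensemaking over many outputs that the abstract describes.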
Related papers
- Layer by Layer: Uncovering Where Multi-Task Learning Happens in Instruction-Tuned Large Language Models [22.676688441884465]
Fine-tuning pre-trained large language models (LLMs) on a diverse array of tasks has become a common approach for building models.
This study investigates the task-specific information encoded in pre-trained LLMs and the effects of instruction tuning on their representations.
arXiv Detail & Related papers (2024-10-25T23:38:28Z)
- A Survey of Prompt Engineering Methods in Large Language Models for Different NLP Tasks [0.0]
Large language models (LLMs) have shown remarkable performance on many different Natural Language Processing (NLP) tasks.
Prompt engineering plays a key role in adding more to the already existing abilities of LLMs to achieve significant performance gains.
This paper summarizes different prompting techniques and groups them by the NLP tasks for which they have been used.
arXiv Detail & Related papers (2024-07-17T20:23:19Z)
- Tool Learning with Large Language Models: A Survey [60.733557487886635]
Tool learning with large language models (LLMs) has emerged as a promising paradigm for augmenting the capabilities of LLMs to tackle highly complex problems.
Despite growing attention and rapid advancements in this field, the existing literature remains fragmented and lacks systematic organization.
arXiv Detail & Related papers (2024-05-28T08:01:26Z)
- Benchmarking LLMs on the Semantic Overlap Summarization Task [9.656095701778975]
This paper comprehensively evaluates Large Language Models (LLMs) on the Semantic Overlap Summarization (SOS) task.
We report well-established metrics like ROUGE, BERTScore, and SEM-F1 on two different datasets of alternative narratives.
arXiv Detail & Related papers (2024-02-26T20:33:50Z)
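As a concrete illustration of the metric reporting mentioned above, here is a hedged sketch of scoring a candidate overlap summary against two reference narratives with ROUGE, using Google's rouge-score package; the texts are invented placeholders, not the paper's datasets.

```python
# A hedged sketch of ROUGE scoring for an overlap summary against two
# reference narratives. The texts below are invented placeholders.
from rouge_score import rouge_scorer

references = [
    "Both witnesses agree the car ran a red light before the crash.",
    "The car ignored a red signal and collided with the cyclist.",
]
candidate = "The car ran a red light and hit the cyclist."

scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
for ref in references:
    scores = scorer.score(ref, candidate)
    # Report F-measure per ROUGE variant for each reference narrative.
    print({name: round(s.fmeasure, 3) for name, s in scores.items()})
```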
- INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning [59.07490387145391]
Large language models (LLMs) have demonstrated impressive capabilities in various natural language processing tasks.
Their application to information retrieval (IR) tasks is still challenging due to the infrequent occurrence of many IR-specific concepts in natural language.
We introduce a novel instruction tuning dataset, INTERS, encompassing 20 tasks across three fundamental IR categories.
arXiv Detail & Related papers (2024-01-12T12:10:28Z)
- Large Language Models for Generative Information Extraction: A Survey [89.71273968283616]
Large Language Models (LLMs) have demonstrated remarkable capabilities in text understanding and generation.
We present an extensive overview by categorizing these works in terms of various IE subtasks and techniques.
We empirically analyze the most advanced methods and identify emerging trends in applying LLMs to IE tasks.
arXiv Detail & Related papers (2023-12-29T14:25:22Z)
- Beyond Text: Unveiling Multimodal Proficiency of Large Language Models with MultiAPI Benchmark [11.572835837392867]
This study introduces MultiAPI, a pioneering comprehensive large-scale API benchmark dataset.
It consists of 235 diverse API calls and 2,038 contextual prompts, offering a unique platform for evaluating how tool-augmented LLMs handle multimodal tasks.
Our findings reveal that while LLMs demonstrate proficiency in API call decision-making, they face challenges in domain identification, function selection, and argument generation.
arXiv Detail & Related papers (2023-11-21T23:26:05Z)
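A toy harness for the three failure axes named in the MultiAPI findings (domain identification, function selection, argument generation) might look like the following; the gold and predicted records are invented placeholders, not MultiAPI data.

```python
# A toy sketch of scoring a tool-augmented LLM output along the three
# axes the findings above mention. Records are invented placeholders.
gold = {"domain": "weather", "function": "get_forecast",
        "args": {"city": "Paris", "days": 3}}
pred = {"domain": "weather", "function": "get_forecast",
        "args": {"city": "Paris", "days": 5}}

scores = {
    "domain": gold["domain"] == pred["domain"],      # domain identification
    "function": gold["function"] == pred["function"],  # function selection
    "args": gold["args"] == pred["args"],            # argument generation
}
print(scores)  # {'domain': True, 'function': True, 'args': False}
```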
- SEMQA: Semi-Extractive Multi-Source Question Answering [94.04430035121136]
We introduce a new QA task for answering multi-answer questions by summarizing multiple diverse sources in a semi-extractive fashion.
We create the first dataset of this kind, QuoteSum, with human-written semi-extractive answers to natural and generated questions.
arXiv Detail & Related papers (2023-11-08T18:46:32Z)
- Frugal Prompting for Dialog Models [17.048111072193933]
This study examines different approaches for building dialog systems using large language models (LLMs).
As part of prompt tuning, we experiment with various ways of providing instructions, exemplars, current query and additional context.
The research also analyzes the representations of dialog history that have the optimal usable-information density.
arXiv Detail & Related papers (2023-05-24T09:06:49Z)
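The entry above concerns fitting instructions, exemplars, dialog history, and the current query into a limited prompt; the sketch below shows one naive way to do that under a crude character budget. The format and budgeting rule are assumptions for illustration, not the paper's method.

```python
# A minimal sketch of "frugal" prompt assembly: keep the instruction,
# a few exemplars, and only as much recent dialog history as a crude
# character budget allows. Budget and format are illustrative assumptions.

def build_prompt(instruction, exemplars, history, query, budget=1000):
    """Compose a compact prompt, dropping the oldest history first."""
    fixed = instruction + "\n\n" + "\n".join(exemplars) + "\n\n"
    tail = "\nUser: " + query + "\nAssistant:"
    remaining = budget - len(fixed) - len(tail)

    # Keep the most recent turns that fit in the remaining budget.
    kept = []
    for turn in reversed(history):
        if len(turn) + 1 > remaining:
            break
        kept.append(turn)
        remaining -= len(turn) + 1
    return fixed + "\n".join(reversed(kept)) + tail

prompt = build_prompt(
    instruction="Answer briefly and politely.",
    exemplars=["User: Hi\nAssistant: Hello! How can I help?"],
    history=["User: What's the weather?", "Assistant: Sunny, 22C."],
    query="Should I bring an umbrella?",
)
print(prompt)
```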
- RET-LLM: Towards a General Read-Write Memory for Large Language Models [53.288356721954514]
RET-LLM is a novel framework that equips large language models with a general write-read memory unit.
Inspired by Davidsonian semantics theory, we extract and save knowledge in the form of triplets.
Our framework exhibits robust performance in handling temporal-based question answering tasks.
arXiv Detail & Related papers (2023-05-23T17:53:38Z)
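To illustrate the triplet-based memory the entry describes, here is a toy read-write store; the dict-based design is an assumption for illustration, not RET-LLM's implementation.

```python
# A toy triplet-style read-write memory in the spirit of the entry
# above: knowledge is saved as (subject, relation, object) triplets
# and looked up later. This dict-based store is an illustrative
# assumption, not the paper's implementation.
from collections import defaultdict

class TripletMemory:
    def __init__(self):
        self._store = defaultdict(list)

    def write(self, subject, relation, obj):
        """Save one (subject, relation, object) fact."""
        self._store[(subject, relation)].append(obj)

    def read(self, subject, relation):
        """Return all objects recorded for a subject and relation."""
        return self._store.get((subject, relation), [])

memory = TripletMemory()
memory.write("Alice", "works_at", "Acme Corp")
memory.write("Alice", "works_at", "Globex")  # facts accumulate over time
print(memory.read("Alice", "works_at"))  # ['Acme Corp', 'Globex']
```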
- Multi-Task Instruction Tuning of LLaMa for Specific Scenarios: A Preliminary Study on Writing Assistance [60.40541387785977]
Small foundational models can display remarkable proficiency in tackling diverse tasks when fine-tuned using instruction-driven data.
In this work, we investigate a practical problem setting where the primary focus is on one or a few particular tasks rather than general-purpose instruction following.
Experimental results show that fine-tuning LLaMA on writing instruction data significantly improves its ability on writing tasks.
arXiv Detail & Related papers (2023-05-22T16:56:44Z)
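As a concrete example of the kind of instruction-driven data such fine-tuning consumes, the sketch below renders writing-focused records with an Alpaca-style template; the template and field names are common conventions, not necessarily the exact format this paper used.

```python
# A minimal sketch of formatting writing-focused instruction data for
# fine-tuning, using an Alpaca-style template. The template and fields
# are common conventions, assumed here for illustration.
examples = [
    {
        "instruction": "Paraphrase the sentence in a formal tone.",
        "input": "We can't make the meeting, sorry!",
        "output": "Unfortunately, we are unable to attend the meeting.",
    },
]

def format_example(ex):
    """Render one instruction/input/output record as a training string."""
    return (
        "### Instruction:\n" + ex["instruction"] + "\n\n"
        "### Input:\n" + ex["input"] + "\n\n"
        "### Response:\n" + ex["output"]
    )

for ex in examples:
    print(format_example(ex))
```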
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.