DataTales: Investigating the use of Large Language Models for Authoring
Data-Driven Articles
- URL: http://arxiv.org/abs/2308.04076v1
- Date: Tue, 8 Aug 2023 06:21:58 GMT
- Title: DataTales: Investigating the use of Large Language Models for Authoring
Data-Driven Articles
- Authors: Nicole Sultanum, Arjun Srinivasan
- Abstract summary: Large language models (LLMs) present an opportunity to assist the authoring of data-driven articles.
We designed a prototype system, DataTales, that leverages a LLM to generate textual narratives accompanying a given chart.
Using DataTales as a design probe, we conducted a qualitative study with 11 professionals to evaluate the concept.
- Score: 19.341156634212364
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Authoring data-driven articles is a complex process requiring authors to not
only analyze data for insights but also craft a cohesive narrative that
effectively communicates the insights. Text generation capabilities of
contemporary large language models (LLMs) present an opportunity to assist the
authoring of data-driven articles and expedite the writing process. In this
work, we investigate the feasibility and perceived value of leveraging LLMs to
support authors of data-driven articles. We designed a prototype system,
DataTales, that leverages a LLM to generate textual narratives accompanying a
given chart. Using DataTales as a design probe, we conducted a qualitative
study with 11 professionals to evaluate the concept, from which we distilled
affordances and opportunities to further integrate LLMs as valuable data-driven
article authoring assistants.
Related papers
- AvaTaR: Optimizing LLM Agents for Tool-Assisted Knowledge Retrieval [93.96463520716759]
Large language model (LLM) agents have demonstrated impressive capability in utilizing external tools and knowledge to boost accuracy and reduce hallucinations.
Here, we introduce AvaTaR, a novel framework that optimize an LLM agent to effectively use the provided tools and improve its performance on a given task/domain.
We find AvaTaR consistently outperforms state-of-the-art approaches across all four challenging tasks and exhibits strong generalization ability when applied to novel cases.
arXiv Detail & Related papers (2024-06-17T04:20:02Z) - Leveraging Large Language Models for Web Scraping [0.0]
This research investigates a general-purpose accurate data scraping recipe for RAG models designed for language generation.
To capture knowledge in a more modular and interpretable way, we use pre trained language models with a latent knowledge retriever.
arXiv Detail & Related papers (2024-06-12T14:15:15Z) - Large Language Models for Data Annotation: A Survey [49.8318827245266]
The emergence of advanced Large Language Models (LLMs) presents an unprecedented opportunity to automate the complicated process of data annotation.
This survey includes an in-depth taxonomy of data types that LLMs can annotate, a review of learning strategies for models utilizing LLM-generated annotations, and a detailed discussion of the primary challenges and limitations associated with using LLMs for data annotation.
arXiv Detail & Related papers (2024-02-21T00:44:04Z) - Can Large Language Models Serve as Data Analysts? A Multi-Agent Assisted
Approach for Qualitative Data Analysis [6.592797748561459]
Large Language Models (LLMs) have enabled collaborative human-bot interactions in Software Engineering (SE)
We introduce a new dimension of scalability and accuracy in qualitative research, potentially transforming data interpretation methodologies in SE.
arXiv Detail & Related papers (2024-02-02T13:10:46Z) - Large Language Models for Conducting Advanced Text Analytics Information Systems Research [4.913568041651961]
Large Language Models (LLMs) have emerged as tools that are capable of processing and extracting insights from massive unstructured textual datasets.
We propose a Text Analytics for Information Systems Research (TAISR) framework to assist the Information Systems community in understanding how to operationalize LLMs.
arXiv Detail & Related papers (2023-12-27T19:49:00Z) - LLM-in-the-loop: Leveraging Large Language Model for Thematic Analysis [18.775126929754833]
Thematic analysis (TA) has been widely used for analyzing qualitative data in many disciplines and fields.
Human coders develop and deepen their data interpretation and coding over multiple iterations, making TA labor-intensive and time-consuming.
We propose a human-LLM collaboration framework (i.e., LLM-in-the-loop) to conduct TA with in-context learning (ICL)
arXiv Detail & Related papers (2023-10-23T17:05:59Z) - WASA: WAtermark-based Source Attribution for Large Language
Model-Generated Data [60.759755177369364]
Large language models (LLMs) generate synthetic texts with embedded watermarks that contain information about their source(s)
We propose a WAtermarking for Source Attribution (WASA) framework that satisfies key properties due to our algorithmic designs.
Our framework achieves effective source attribution and data provenance.
arXiv Detail & Related papers (2023-10-01T12:02:57Z) - Instruction Tuning for Large Language Models: A Survey [52.86322823501338]
We make a systematic review of the literature, including the general methodology of IT, the construction of IT datasets, the training of IT models, and applications to different modalities, domains and applications.
We also review the potential pitfalls of IT along with criticism against it, along with efforts pointing out current deficiencies of existing strategies and suggest some avenues for fruitful research.
arXiv Detail & Related papers (2023-08-21T15:35:16Z) - Demonstration of InsightPilot: An LLM-Empowered Automated Data
Exploration System [48.62158108517576]
We introduce InsightPilot, an automated data exploration system designed to simplify the data exploration process.
InsightPilot automatically selects appropriate analysis intents, such as understanding, summarizing, and explaining.
In brief, an IQuery is an abstraction and automation of data analysis operations, which mimics the approach of data analysts.
arXiv Detail & Related papers (2023-04-02T07:27:49Z) - AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators [98.11286353828525]
GPT-3.5 series models have demonstrated remarkable few-shot and zero-shot ability across various NLP tasks.
We propose AnnoLLM, which adopts a two-step approach, explain-then-annotate.
We build the first conversation-based information retrieval dataset employing AnnoLLM.
arXiv Detail & Related papers (2023-03-29T17:03:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.