Power-up! What Can Generative Models Do for Human Computation Workflows?
- URL: http://arxiv.org/abs/2307.02243v1
- Date: Wed, 5 Jul 2023 12:35:29 GMT
- Title: Power-up! What Can Generative Models Do for Human Computation Workflows?
- Authors: Garrett Allen, Gaole He, Ujwal Gadiraju
- Abstract summary: Investigation into large language models (LLMs) as part of crowdsourcing remains an under-explored space.
From an empirical standpoint, little is currently understood about how LLMs can improve the effectiveness of crowdsourcing.
- Score: 13.484359389266864
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We are amidst an explosion of artificial intelligence research, particularly
around large language models (LLMs). These models have a range of applications
across domains like medicine, finance, commonsense knowledge graphs, and
crowdsourcing. Investigation into LLMs as part of crowdsourcing workflows
remains an under-explored space. The crowdsourcing research community has
produced a body of work investigating workflows and methods for managing
complex tasks using hybrid human-AI methods. Within crowdsourcing, the role of
LLMs can be envisioned as akin to a cog in a larger wheel of workflows. From an
empirical standpoint, little is currently understood about how LLMs can improve
the effectiveness of crowdsourcing workflows and how such workflows can be
evaluated. In this work, we present a vision for exploring this gap from the
perspectives of various stakeholders involved in the crowdsourcing paradigm --
the task requesters, crowd workers, platforms, and end-users. We identify
junctures in typical crowdsourcing workflows at which the introduction of LLMs
can play a beneficial role and propose means to augment existing design
patterns for crowd work.
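To make the abstract's "cog in a larger wheel" framing concrete, here is a minimal sketch of one such juncture, assuming a label-verification workflow: an LLM pre-labels items and only low-confidence items are routed to crowd workers. The function names, threshold, and placeholder values are hypothetical illustrations, not the paper's implementation.

```python
# Hypothetical sketch: an LLM as one "cog" in a crowdsourcing workflow.
# The LLM pre-labels each item; only low-confidence items are routed to
# crowd workers for verification. llm_label/request_human_judgment are
# illustrative stand-ins, not APIs from the paper.

from dataclasses import dataclass

@dataclass
class Judgment:
    label: str
    confidence: float
    source: str  # "llm" or "crowd"

def llm_label(item: str) -> Judgment:
    """Stand-in for an LLM call returning a label and a confidence."""
    label, confidence = "positive", 0.62  # placeholder values
    return Judgment(label, confidence, source="llm")

def request_human_judgment(item: str) -> Judgment:
    """Stand-in for posting a verification microtask to a crowd platform."""
    return Judgment("negative", 1.0, source="crowd")

def hybrid_workflow(items: list[str], threshold: float = 0.8) -> list[Judgment]:
    results = []
    for item in items:
        judgment = llm_label(item)
        if judgment.confidence < threshold:
            # Crowd workers verify only the hard cases.
            judgment = request_human_judgment(item)
        results.append(judgment)
    return results
```

The threshold is the requester-facing knob: raising it routes more items to crowd workers for quality assurance, while lowering it shifts more of the work onto the LLM.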
Related papers
- Benchmarking Agentic Workflow Generation [80.74757493266057]
We introduce WorFBench, a unified workflow generation benchmark with multi-faceted scenarios and intricate graph workflow structures.
We also present WorFEval, a systemic evaluation protocol utilizing subsequence and subgraph matching algorithms.
We observe that the generated workflows can enhance downstream tasks, enabling them to achieve superior performance with less time during inference.
arXiv Detail & Related papers (2024-10-10T12:41:19Z)
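As an aside on the WorFEval protocol above: subsequence matching on linearized workflows can be read as a longest-common-subsequence comparison. The sketch below shows that reading, assuming workflows are compared as lists of node names; the subgraph-matching half is omitted, and the F1-over-LCS scoring choice is illustrative rather than the benchmark's exact formula.

```python
# Illustrative sketch of the subsequence-matching half of a WorFEval-style
# evaluation: score a predicted workflow's linearized node sequence against
# the gold sequence via longest common subsequence (LCS). The real protocol
# also performs subgraph matching on the workflow graph, omitted here.

def lcs_length(pred: list[str], gold: list[str]) -> int:
    m, n = len(pred), len(gold)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if pred[i - 1] == gold[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]

def node_sequence_score(pred: list[str], gold: list[str]) -> float:
    """F1-style score over the longest common subsequence (illustrative)."""
    if not pred or not gold:
        return 0.0
    lcs = lcs_length(pred, gold)
    precision, recall = lcs / len(pred), lcs / len(gold)
    return 2 * precision * recall / (precision + recall) if lcs else 0.0

# e.g. node_sequence_score(["search", "filter", "summarize"],
#                          ["search", "summarize"]) -> 0.8
```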
- WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks [85.95607119635102]
Large language models (LLMs) can mimic human-like intelligence.
WorkArena++ is designed to evaluate the planning, problem-solving, logical/arithmetic reasoning, retrieval, and contextual understanding abilities of web agents.
arXiv Detail & Related papers (2024-07-07T07:15:49Z)
- WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks? [83.19032025950986]
We study the use of large language model-based agents for interacting with software via web browsers.
WorkArena is a benchmark of 33 tasks based on the widely-used ServiceNow platform.
BrowserGym is an environment for the design and evaluation of such agents.
arXiv Detail & Related papers (2024-03-12T14:58:45Z)
- Sample Efficient Myopic Exploration Through Multitask Reinforcement Learning with Diverse Tasks [53.44714413181162]
This paper shows that when an agent is trained on a sufficiently diverse set of tasks, a generic policy-sharing algorithm with myopic exploration design can be sample-efficient.
To the best of our knowledge, this is the first theoretical demonstration of the "exploration benefits" of MTRL.
arXiv Detail & Related papers (2024-03-03T22:57:44Z)
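To ground the MTRL entry above with a toy picture: "myopic exploration" can be as simple as epsilon-greedy, and "policy sharing" can mean one value function updated across all tasks. The sketch below instantiates that with tabular Q-learning on hypothetical chain tasks; every environment and constant is illustrative, and the paper's contribution is the theory, not code of this kind.

```python
# Toy sketch of the analyzed setting: one policy (a shared Q-table) trained
# across several tasks with purely myopic epsilon-greedy exploration --
# no bonuses, no planning. Tasks and constants are illustrative.

import random

N_STATES, N_ACTIONS = 10, 2
q_table = [[0.0] * N_ACTIONS for _ in range(N_STATES)]  # one policy, all tasks

class ChainTask:
    """Walk left (0) or right (1) on a chain; reward 1 at a task-specific
    goal state. A stand-in for a 'diverse set of tasks'."""
    def __init__(self, goal: int):
        self.goal, self.state = goal, 0

    def reset(self) -> int:
        self.state = 0
        return self.state

    def step(self, action: int):
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), N_STATES - 1)
        done = self.state == self.goal
        return self.state, (1.0 if done else 0.0), done

def epsilon_greedy(state: int, epsilon: float = 0.2) -> int:
    """Myopic exploration: random dithering, no bonuses, no planning."""
    if random.random() < epsilon:
        return random.randrange(N_ACTIONS)
    best = max(q_table[state])
    return random.choice([a for a in range(N_ACTIONS) if q_table[state][a] == best])

def q_update(s, a, r, s_next, alpha=0.1, gamma=0.9):
    q_table[s][a] += alpha * (r + gamma * max(q_table[s_next]) - q_table[s][a])

# Policy sharing: the single Q-table is updated on every task. Task
# diversity (different goals) is what the theory credits for efficiency.
for task in [ChainTask(g) for g in (2, 5, 8)]:
    for _ in range(200):
        s, done = task.reset(), False
        for _ in range(100):  # step cap keeps toy episodes short
            a = epsilon_greedy(s)
            s_next, r, done = task.step(a)
            q_update(s, a, r, s_next)
            s = s_next
            if done:
                break
```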
- Large Language Model-based Human-Agent Collaboration for Complex Task Solving [94.3914058341565]
We introduce the problem of Large Language Model (LLM)-based human-agent collaboration for complex task-solving.
We propose a Reinforcement Learning-based Human-Agent Collaboration method, ReHAC.
This approach includes a policy model designed to determine the most opportune stages for human intervention within the task-solving process.
arXiv Detail & Related papers (2024-02-20T11:03:36Z)
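To picture ReHAC's division of labor, here is a hypothetical skeleton of the collaboration loop: a policy model scores each state and routes the step either to the LLM agent or to a human. The scoring stub, threshold, and helper names are illustrative stand-ins for the RL-trained policy, not the authors' code.

```python
# Hypothetical skeleton of a ReHAC-style collaboration loop: at each step a
# policy model scores the current state and decides whether the LLM agent
# acts or a human intervenes. The scoring function and threshold stand in
# for the RL-trained policy; none of this is the authors' actual code.

def policy_score(state: str) -> float:
    """Stand-in for the learned policy: how worthwhile is intervening now?"""
    return 0.3  # placeholder

def llm_act(state: str) -> str:
    return state + " [llm step]"    # stand-in for an LLM action

def human_act(state: str) -> str:
    return state + " [human step]"  # stand-in for asking a person

def solve(task: str, max_steps: int = 10, threshold: float = 0.5) -> str:
    state = task
    for _ in range(max_steps):
        # Human effort is costly, so intervene only at high-value stages.
        if policy_score(state) > threshold:
            state = human_act(state)
        else:
            state = llm_act(state)
    return state
```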
- Designing LLM Chains by Adapting Techniques from Crowdsourcing Workflows [37.60760400107501]
LLM chains enable complex tasks by decomposing work into a sequence of subtasks.
LLM chains address LLM errors in a way analogous to how crowdsourcing workflows address human error.
arXiv Detail & Related papers (2023-12-18T20:01:58Z)
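Since the entry above turns on what an "LLM chain" is, a minimal sketch may help: the task is decomposed into fixed subtasks whose outputs feed forward, with a verification step echoing crowdsourcing's review patterns. The `call_llm` stub and prompts are hypothetical; this shows the general pattern, not the paper's design space.

```python
# Minimal sketch of an LLM chain: a complex task decomposed into a fixed
# sequence of subtasks, each handled by its own prompt, with a final
# verification step mirroring crowdsourcing quality control. call_llm is a
# hypothetical stand-in for any LLM API.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an actual LLM client here")

def summarize_chain(document: str) -> str:
    # Subtask 1: extract the key points.
    points = call_llm(f"List the key points of:\n{document}")
    # Subtask 2: draft a summary from the extracted points only.
    draft = call_llm(f"Write a one-paragraph summary using only:\n{points}")
    # Subtask 3: verify, echoing crowdsourcing's verify/review pattern.
    verdict = call_llm(
        f"Does this summary faithfully reflect these points? Answer yes/no.\n"
        f"Points:\n{points}\nSummary:\n{draft}"
    )
    if verdict.strip().lower().startswith("no"):
        draft = call_llm(f"Revise the summary to match the points:\n{points}")
    return draft
```

Decomposition into small, checkable units plus explicit verification is precisely the crowdsourcing lineage the paper draws on.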
- Corex: Pushing the Boundaries of Complex Reasoning through Multi-Model Collaboration [83.4031923134958]
Corex is a suite of novel general-purpose strategies that transform Large Language Models into autonomous agents.
Inspired by human behaviors, Corex is constituted by diverse collaboration paradigms including Debate, Review, and Retrieve modes.
We demonstrate that orchestrating multiple LLMs to work in concert yields substantially better performance compared to existing methods.
arXiv Detail & Related papers (2023-09-30T07:11:39Z)
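To make the Debate mode above concrete, here is one plausible reading: several model instances answer independently, read one another's answers, revise, and a majority vote settles the outcome. This is an interpretation of the described paradigm, not the Corex implementation; `call_llm` is again a hypothetical stub.

```python
# Schematic of a Debate-style collaboration round in the spirit of Corex:
# several LLM "agents" answer independently, read one another's answers,
# revise, and a majority vote settles the outcome. This interprets the
# described paradigm; it is not the Corex implementation.

from collections import Counter

def call_llm(model: str, prompt: str) -> str:
    raise NotImplementedError("plug in an actual LLM client here")

def debate(question: str, models: list[str], rounds: int = 2) -> str:
    # Round 0: independent answers.
    answers = [call_llm(m, question) for m in models]
    for _ in range(rounds):
        peers = "\n".join(f"- {a}" for a in answers)
        # Each agent revises after seeing the others' answers.
        answers = [
            call_llm(m, f"{question}\nOther agents answered:\n{peers}\n"
                        "Reconsider and give your final answer.")
            for m in models
        ]
    # Majority vote over the final answers.
    return Counter(answers).most_common(1)[0][0]
```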
- LLMs as Workers in Human-Computational Algorithms? Replicating Crowdsourcing Pipelines with LLMs [25.4184470735779]
LLMs have shown promise in replicating human-like behavior in crowdsourcing tasks that were previously thought to be exclusive to human abilities.
We explore whether LLMs can replicate more complex crowdsourcing pipelines.
arXiv Detail & Related papers (2023-07-19T17:54:43Z)
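As an illustration of the kind of replication the entry above explores, the sketch below runs a classic crowdsourcing pipeline in the Find-Fix-Verify style (originally from Soylent) with an LLM standing in for the crowd at every stage. The stages, prompts, and `call_llm` stub are hypothetical; consult the paper for the pipelines it actually replicates.

```python
# Hypothetical illustration of replicating a classic crowdsourcing pipeline
# (Find-Fix-Verify, as in Soylent) with an LLM playing the worker in every
# stage. call_llm is an illustrative stub, not an API from the paper.

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an actual LLM client here")

def find_fix_verify(text: str) -> str:
    # Find: "workers" flag problem spans.
    spans = call_llm(f"List awkward or wordy phrases in:\n{text}")
    # Fix: other "workers" propose rewrites for each flagged span.
    fixes = call_llm(f"Propose concise rewrites for these phrases:\n{spans}")
    # Verify: "workers" vote on which rewrites preserve the meaning.
    approved = call_llm(
        f"Which rewrites preserve the original meaning?\n"
        f"Original:\n{text}\nRewrites:\n{fixes}"
    )
    return call_llm(f"Apply these approved rewrites:\n{text}\n{approved}")
```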
- Demystifying a Dark Art: Understanding Real-World Machine Learning Model Development [2.422369741135428]
We analyze over 475k user-generated pipelines on OpenML, an open-source platform for tracking and sharing machine learning experiments.
We find that users often adopt a manual, automated, or mixed approach when iterating on their models.
arXiv Detail & Related papers (2020-05-04T14:33:39Z)