CoLa -- Learning to Interactively Collaborate with Large LMs
- URL: http://arxiv.org/abs/2504.02965v2
- Date: Mon, 07 Apr 2025 01:08:58 GMT
- Title: CoLa -- Learning to Interactively Collaborate with Large LMs
- Authors: Abhishek Sharma, Dan Goldwasser
- Abstract summary: We introduce CoLa, a novel self-guided learning paradigm for training automated guides. We compare the strategies employed by humans and automated guides by conducting a human study on a QA dataset. We show that automated guides outperform humans by adapting their strategies to reasoners' capabilities.
- Score: 22.423899484894477
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: LLMs' remarkable ability to tackle a wide range of language tasks has opened new opportunities for collaborative human-AI problem solving. LLMs can amplify human capabilities by applying their intuitions and reasoning strategies at scale. We explore whether human guides can be simulated, by generalizing from human demonstrations of guiding an AI system to solve complex language problems. We introduce CoLa, a novel self-guided learning paradigm for training automated $\textit{guides}$ and evaluate it on two QA datasets, a puzzle-solving task, and a constrained text generation task. Our empirical results show that CoLa consistently outperforms competitive approaches across all domains. Moreover, a small-sized trained guide outperforms a strong model like GPT-4 when acting as a guide. We compare the strategies employed by humans and automated guides by conducting a human study on a QA dataset. We show that automated guides outperform humans by adapting their strategies to reasoners' capabilities and conduct qualitative analyses highlighting distinct differences in guiding strategies.
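To make the guide-reasoner interaction described in the abstract concrete, the sketch below shows one way such a loop could look. It is illustrative only: the prompts, the DONE-based stopping rule, and the `guide`/`reasoner` callable interfaces are assumptions, not CoLa's actual training or inference procedure.

```python
from typing import Callable

def guided_solve(
    question: str,
    guide: Callable[[str], str],     # stands in for a small trained guide model (assumed interface)
    reasoner: Callable[[str], str],  # stands in for the larger reasoner model (assumed interface)
    max_turns: int = 5,
) -> str:
    """Let a guide iteratively steer a reasoner toward an answer.

    Hypothetical interaction loop; prompts and stopping rule are assumptions,
    not the paper's actual protocol.
    """
    transcript = f"Question: {question}"
    answer = ""
    for _ in range(max_turns):
        # The guide reads the dialogue so far and issues one piece of guidance
        # (a hint, sub-question, or correction), or signals that it is done.
        hint = guide(transcript + "\nGive one piece of guidance, or reply DONE.")
        if hint.strip().upper().startswith("DONE"):
            break
        # The reasoner produces (or revises) an answer using the new guidance.
        answer = reasoner(transcript + f"\nGuidance: {hint}\nAnswer:")
        transcript += f"\nGuidance: {hint}\nReasoner: {answer}"
    return answer

# Toy stand-ins so the sketch runs without any model backend.
if __name__ == "__main__":
    def toy_guide(prompt: str) -> str:
        return "DONE" if "Reasoner:" in prompt else "Recall the play's author before answering."

    def toy_reasoner(prompt: str) -> str:
        return "William Shakespeare"

    print(guided_solve("Who wrote Hamlet?", toy_guide, toy_reasoner))
```

In the paper the guide is itself a small trained model; here both roles are stand-in callables so the loop runs end to end without a model backend.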
Related papers
- Training a Generally Curious Agent [86.84089201249104]
We present PAPRIKA, a fine-tuning approach that enables language models to develop general decision-making capabilities.
Experimental results show that models fine-tuned with PAPRIKA can effectively transfer their learned decision-making capabilities to entirely unseen tasks.
These results suggest a promising path towards AI systems that can autonomously solve novel sequential decision-making problems.
arXiv Detail & Related papers (2025-02-24T18:56:58Z)
- The Future of Learning in the Age of Generative AI: Automated Question Generation and Assessment with Large Language Models [0.0]
Large language models (LLMs) and generative AI have revolutionized natural language processing (NLP).
This chapter explores the transformative potential of LLMs in automated question generation and answer assessment.
arXiv Detail & Related papers (2024-10-12T15:54:53Z)
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision [98.97575836717931]
Current AI alignment methodologies rely on human-provided demonstrations or judgments. This raises a challenging research question: How can we keep improving the systems when their capabilities have surpassed those of humans?
arXiv Detail & Related papers (2024-03-14T15:12:38Z)
- Can Foundation Models Watch, Talk and Guide You Step by Step to Make a Cake? [62.59699229202307]
Despite advances in AI, it remains a significant challenge to develop interactive task guidance systems.
We created a new multimodal benchmark dataset, Watch, Talk and Guide (WTaG), based on natural interaction between a human user and a human instructor.
We leveraged several foundation models to study to what extent these models can be quickly adapted to perceptually enabled task guidance.
arXiv Detail & Related papers (2023-11-01T15:13:49Z)
- Corex: Pushing the Boundaries of Complex Reasoning through Multi-Model Collaboration [83.4031923134958]
Corex is a suite of novel general-purpose strategies that transform Large Language Models into autonomous agents.
Inspired by human behaviors, Corex comprises diverse collaboration paradigms, including Debate, Review, and Retrieve modes.
We demonstrate that orchestrating multiple LLMs to work in concert yields substantially better performance compared to existing methods.
arXiv Detail & Related papers (2023-09-30T07:11:39Z)
- Aligning Large Language Models with Human: A Survey [53.6014921995006]
Large Language Models (LLMs) trained on extensive textual corpora have emerged as leading solutions for a broad array of Natural Language Processing (NLP) tasks.
Despite their notable performance, these models are prone to certain limitations, such as misunderstanding human instructions, generating potentially biased content, or producing factually incorrect information.
This survey presents a comprehensive overview of these alignment technologies.
arXiv Detail & Related papers (2023-07-24T17:44:58Z)
- Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision [84.31474052176343]
Recent AI-assistant agents, such as ChatGPT, rely on supervised fine-tuning (SFT) with human annotations and reinforcement learning from human feedback to align the output with human intentions.
This dependence can significantly constrain the true potential of AI-assistant agents due to the high cost of obtaining human supervision.
We propose a novel approach called SELF-ALIGN, which combines principle-driven reasoning and the generative power of LLMs for the self-alignment of AI agents with minimal human supervision.
arXiv Detail & Related papers (2023-05-04T17:59:28Z)
- Improving Grounded Language Understanding in a Collaborative Environment by Interacting with Agents Through Help Feedback [42.19685958922537]
We argue that human-AI collaboration should be interactive, with humans monitoring the work of AI agents and providing feedback that the agent can understand and utilize.
In this work, we explore these directions using the challenging task defined by the IGLU competition, an interactive grounded language understanding task in a Minecraft-like world.
arXiv Detail & Related papers (2023-04-21T05:37:59Z)
- Learning Complementary Policies for Human-AI Teams [22.13683008398939]
We propose a novel human-AI collaboration framework for selecting an advantageous course of action.
Our solution aims to exploit the human-AI complementarity to maximize decision rewards.
arXiv Detail & Related papers (2023-02-06T17:22:18Z)
- Human Decision Makings on Curriculum Reinforcement Learning with Difficulty Adjustment [52.07473934146584]
We guide curriculum reinforcement learning towards a preferred performance level that is neither too hard nor too easy by learning from the human decision process.
Our system is highly parallelizable, making it possible for a human to train large-scale reinforcement learning applications.
It shows that reinforcement learning performance can successfully adjust in sync with the human-desired difficulty level.
arXiv Detail & Related papers (2022-08-04T23:53:51Z)