AI Chains: Transparent and Controllable Human-AI Interaction by Chaining
Large Language Model Prompts
- URL: http://arxiv.org/abs/2110.01691v1
- Date: Mon, 4 Oct 2021 19:59:38 GMT
- Title: AI Chains: Transparent and Controllable Human-AI Interaction by Chaining
Large Language Model Prompts
- Authors: Tongshuang Wu, Michael Terry, Carrie J. Cai
- Abstract summary: We introduce the concept of Chaining LLM steps together, where the output of one step becomes the input for the next, thus aggregating the gains per step.
In a 20-person user study, we found that Chaining not only improved the quality of task outcomes, but also significantly enhanced system transparency, controllability, and sense of collaboration.
- Score: 12.73129785710807
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although large language models (LLMs) have demonstrated impressive potential
on simple tasks, their breadth of scope, lack of transparency, and insufficient
controllability can make them less effective when assisting humans on more
complex tasks. In response, we introduce the concept of Chaining LLM steps
together, where the output of one step becomes the input for the next, thus
aggregating the gains per step. We first define a set of LLM primitive
operations useful for Chain construction, then present an interactive system
where users can modify these Chains, along with their intermediate results, in
a modular way. In a 20-person user study, we found that Chaining not only
improved the quality of task outcomes, but also significantly enhanced system
transparency, controllability, and sense of collaboration. Additionally, we saw
that users developed new ways of interacting with LLMs through Chains: they
leveraged sub-tasks to calibrate model expectations, compared and contrasted
alternative strategies by observing parallel downstream effects, and debugged
unexpected model outputs by "unit-testing" sub-components of a Chain. In two
case studies, we further explore how LLM Chains may be used in future
applications.
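The Chaining idea in the abstract, where the output of one step becomes the input for the next, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' system: `Step`, `run_chain`, and `fake_llm` are hypothetical names, and `fake_llm` is a toy stand-in for a real model call. Keeping every intermediate output is what allows a single step to be inspected or "unit-tested" on its own.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical stand-in for a model call: prompt text in, completion text out.
LLM = Callable[[str], str]


@dataclass
class Step:
    name: str      # label shown when inspecting intermediate results
    template: str  # prompt template; "{input}" marks where the input goes


def run_chain(steps: List[Step], llm: LLM, first_input: str) -> List[Tuple[str, str]]:
    """Run the steps in order, feeding each output into the next step.

    Returns every (step name, output) pair so intermediate results
    can be inspected or edited before rerunning later steps.
    """
    trace: List[Tuple[str, str]] = []
    current = first_input
    for step in steps:
        prompt = step.template.format(input=current)
        current = llm(prompt)
        trace.append((step.name, current))
    return trace


def fake_llm(prompt: str) -> str:
    # Toy deterministic "model" for illustration only: wraps the prompt in brackets.
    return "[" + prompt + "]"


# Hypothetical two-step chain; the step names and templates are assumptions.
steps = [Step("split", "Split: {input}"), Step("rewrite", "Rewrite: {input}")]

for name, output in run_chain(steps, fake_llm, "draft"):
    print(name, "->", output)

# "Unit-test" one step in isolation by running a one-step chain on a chosen input.
print(run_chain([steps[1]], fake_llm, "edited"))
```

Swapping `fake_llm` for a function that calls an actual model leaves the chain logic unchanged, since the chain only assumes a text-in, text-out callable.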
Related papers
- LLM Chain Ensembles for Scalable and Accurate Data Annotation [1.7388851660609117]
Large language models (LLMs) can perform zero-shot classification, but large-scale deployment can be expensive.
This paper introduces an LLM chain ensemble methodology that aligns multiple LLMs in a sequence, routing data subsets to subsequent models.
Our results show that the chain ensemble method often exceeds the performance of the best individual model in the chain and achieves substantial cost savings.
arXiv Detail & Related papers (2024-10-16T20:03:51Z)
- Unveiling LLM Mechanisms Through Neural ODEs and Control Theory [3.4039202831583903]
This study uses Neural Ordinary Differential Equations to unravel the intricate relationships between inputs and outputs in Large Language Models (LLMs).
Neural ODEs play a pivotal role in this investigation by providing a dynamic model that captures the continuous evolution of data within the LLMs.
Robust control mechanisms are applied to strategically adjust the model's outputs, ensuring they not only maintain high quality and reliability but also adhere to specific performance criteria.
arXiv Detail & Related papers (2024-06-23T22:56:34Z)
- Prompt Highlighter: Interactive Control for Multi-Modal LLMs [50.830448437285355]
This study targets a critical aspect of inference in multi-modal LLMs (LLMs and VLMs): explicit controllable text generation.
We introduce a novel inference method, Prompt Highlighter, which enables users to highlight specific prompt spans to interactively control the focus during generation.
We find that, during inference, guiding the models with highlighted tokens through the attention weights leads to more desired outputs.
arXiv Detail & Related papers (2023-12-07T13:53:29Z)
- CRaSh: Clustering, Removing, and Sharing Enhance Fine-tuning without Full Large Language Model [22.870512676002463]
This paper focuses on Offsite-Tuning (OFT), a representative technique that transfers transformer blocks between centralized LLMs and downstream emulators.
Inspired by these observations, we propose CRaSh, involving Clustering, Removing, and Sharing, a training-free strategy to derive improved emulators from LLMs.
Our findings demonstrate a linear connectivity among these optima falling over the same basin, thereby highlighting the effectiveness of CRaSh and OFT.
arXiv Detail & Related papers (2023-10-24T03:08:58Z)
- Amortizing intractable inference in large language models [56.92471123778389]
We use amortized Bayesian inference to sample from intractable posterior distributions.
We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training.
As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem.
arXiv Detail & Related papers (2023-10-06T16:36:08Z)
- AdaMerging: Adaptive Model Merging for Multi-Task Learning [68.75885518081357]
This paper introduces an innovative technique called Adaptive Model Merging (AdaMerging).
It aims to autonomously learn the coefficients for model merging, either in a task-wise or layer-wise manner, without relying on the original training data.
Compared to the current state-of-the-art task arithmetic merging scheme, AdaMerging showcases a remarkable 11% improvement in performance.
arXiv Detail & Related papers (2023-10-04T04:26:33Z)
- On the Effectiveness of Equivariant Regularization for Robust Online Continual Learning [17.995662644298974]
Continual Learning (CL) approaches seek to bridge this gap by facilitating the transfer of knowledge to both previous tasks and future ones.
Recent research has shown that self-supervision can produce versatile models that can generalize well to diverse downstream tasks.
We propose Continual Learning via Equivariant Regularization (CLER), an OCL approach that leverages equivariant tasks for self-supervision.
arXiv Detail & Related papers (2023-05-05T16:10:31Z)
- Guiding Large Language Models via Directional Stimulus Prompting [114.84930073977672]
We introduce Directional Stimulus Prompting, a novel framework for guiding black-box large language models (LLMs) toward specific desired outputs.
Instead of directly adjusting LLMs, our method employs a small tunable policy model to generate an auxiliary directional stimulus prompt for each input instance.
arXiv Detail & Related papers (2023-02-22T17:44:15Z)
- A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference.
Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
arXiv Detail & Related papers (2020-06-22T08:35:58Z)
- Task-Feature Collaborative Learning with Application to Personalized Attribute Prediction [166.87111665908333]
We propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL).
Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks.
As a practical extension, we allow overlapping features in the base model and differentiate the hard tasks.
arXiv Detail & Related papers (2020-04-29T02:32:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.