Prompt Optimization via Adversarial In-Context Learning
- URL: http://arxiv.org/abs/2312.02614v3
- Date: Sat, 22 Jun 2024 15:19:11 GMT
- Title: Prompt Optimization via Adversarial In-Context Learning
- Authors: Xuan Long Do, Yiran Zhao, Hannah Brown, Yuxi Xie, James Xu Zhao, Nancy F. Chen, Kenji Kawaguchi, Michael Shieh, Junxian He,
- Abstract summary: adv-ICL is implemented as a two-player game between a generator and a discriminator.
The generator tries to generate realistic enough output to fool the discriminator.
We show that adv-ICL results in significant improvements over state-of-the-art prompt optimization techniques.
- Score: 51.18075178593142
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a new method, Adversarial In-Context Learning (adv-ICL), to optimize prompt for in-context learning (ICL) by employing one LLM as a generator, another as a discriminator, and a third as a prompt modifier. As in traditional adversarial learning, adv-ICL is implemented as a two-player game between the generator and discriminator, where the generator tries to generate realistic enough output to fool the discriminator. In each round, given an input prefixed by task instructions and several exemplars, the generator produces an output. The discriminator is then tasked with classifying the generator input-output pair as model-generated or real data. Based on the discriminator loss, the prompt modifier proposes possible edits to the generator and discriminator prompts, and the edits that most improve the adversarial loss are selected. We show that adv-ICL results in significant improvements over state-of-the-art prompt optimization techniques for both open and closed-source models on 11 generation and classification tasks including summarization, arithmetic reasoning, machine translation, data-to-text generation, and the MMLU and big-bench hard benchmarks. In addition, because our method uses pre-trained models and updates only prompts rather than model parameters, it is computationally efficient, easy to extend to any LLM and task, and effective in low-resource settings.
Related papers
- Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification [52.095460362197336]
Large language models (LLMs) struggle with consistent and accurate reasoning.
LLMs are trained primarily on correct solutions, reducing their ability to detect and learn from errors.
We propose a novel collaborative method integrating Chain-of-Thought (CoT) and Program-of-Thought (PoT) solutions for verification.
arXiv Detail & Related papers (2024-10-05T05:21:48Z) - Generative Verifiers: Reward Modeling as Next-Token Prediction [29.543787728397643]
Verifiers or reward models are often used to enhance the reasoning performance of large language models (LLMs)
We propose training verifiers using the ubiquitous next-token prediction objective, jointly on verification and solution generation.
We demonstrate that GenRM outperforms discriminative, DPO verifiers, and LLM-as-a-Judge.
arXiv Detail & Related papers (2024-08-27T17:57:45Z) - FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models [50.331708897857574]
We introduce FactorLLM, a novel approach that decomposes well-trained dense FFNs into sparse sub-networks without requiring any further modifications.
FactorLLM achieves comparable performance to the source model securing up to 85% model performance while obtaining over a 30% increase in inference speed.
arXiv Detail & Related papers (2024-08-15T16:45:16Z) - Adaptive Draft-Verification for Efficient Large Language Model Decoding [24.347886232342862]
Large language model (LLM) decoding involves generating a sequence of tokens based on a given context.
The typical autoregressive decoding method requires a separate forward pass through the model for each token generated.
We introduce ADED, which accelerates LLM decoding without requiring fine-tuning.
arXiv Detail & Related papers (2024-06-27T22:20:39Z) - Generative error correction for code-switching speech recognition using
large language models [49.06203730433107]
Code-switching (CS) speech refers to the phenomenon of mixing two or more languages within the same sentence.
We propose to leverage large language models (LLMs) and lists of hypotheses generated by an ASR to address the CS problem.
arXiv Detail & Related papers (2023-10-17T14:49:48Z) - GanLM: Encoder-Decoder Pre-training with an Auxiliary Discriminator [114.8954615026781]
We propose a GAN-style model for encoder-decoder pre-training by introducing an auxiliary discriminator.
GanLM is trained with two pre-training objectives: replaced token detection and replaced token denoising.
Experiments in language generation benchmarks show that GanLM with the powerful language understanding capability outperforms various strong pre-trained language models.
arXiv Detail & Related papers (2022-12-20T12:51:11Z) - Selective Token Generation for Few-shot Natural Language Generation [19.015739016376532]
We develop a novel additive learning algorithm based on reinforcement learning (RL)
We show that the proposed selective token generation significantly outperforms the previous additive learning algorithms based on the PLMs.
arXiv Detail & Related papers (2022-09-17T00:48:52Z) - Adversarial Soft Advantage Fitting: Imitation Learning without Policy
Optimization [48.674944885529165]
Adversarial Imitation Learning alternates between learning a discriminator -- which tells apart expert's demonstrations from generated ones -- and a generator's policy to produce trajectories that can fool this discriminator.
We propose to remove the burden of the policy optimization steps by leveraging a novel discriminator formulation.
arXiv Detail & Related papers (2020-06-23T18:29:13Z) - Adding A Filter Based on The Discriminator to Improve Unconditional Text
Generation [35.122864215334836]
The autoregressive language model (ALM) trained with maximum likelihood estimation (MLE) is widely used in unconditional text generation.
Due to exposure bias, the generated texts still suffer from low quality and diversity.
Some research shows a discriminator can detect this discrepancy.
We propose a novel mechanism by adding a filter which has the same input as the discriminator.
arXiv Detail & Related papers (2020-04-05T09:34:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.