Use Your INSTINCT: INSTruction optimization for LLMs usIng Neural bandits Coupled with Transformers
- URL: http://arxiv.org/abs/2310.02905v3
- Date: Sun, 23 Jun 2024 23:59:53 GMT
- Title: Use Your INSTINCT: INSTruction optimization for LLMs usIng Neural bandits Coupled with Transformers
- Authors: Xiaoqiang Lin, Zhaoxuan Wu, Zhongxiang Dai, Wenyang Hu, Yao Shu, See-Kiong Ng, Patrick Jaillet, Bryan Kian Hsiang Low
- Abstract summary: Large language models (LLMs) have shown remarkable instruction-following capabilities and achieved impressive performances in various applications.
Recent work has used the query-efficient Bayesian optimization (BO) algorithm to automatically optimize the instructions given to black-box LLMs.
We propose a neural bandit algorithm which replaces the GP in BO by an NN surrogate to optimize instructions for black-box LLMs.
- Score: 66.823588073584
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) have shown remarkable instruction-following capabilities and achieved impressive performances in various applications. However, the performances of LLMs depend heavily on the instructions given to them, which are typically manually tuned with substantial human efforts. Recent work has used the query-efficient Bayesian optimization (BO) algorithm to automatically optimize the instructions given to black-box LLMs. However, BO usually falls short when optimizing highly sophisticated (e.g., high-dimensional) objective functions, such as the functions mapping an instruction to the performance of an LLM. This is mainly due to the limited expressive power of the Gaussian process (GP) which is used by BO as a surrogate to model the objective function. Meanwhile, it has been repeatedly shown that neural networks (NNs), especially pre-trained transformers, possess strong expressive power and can model highly complex functions. So, we adopt a neural bandit algorithm which replaces the GP in BO by an NN surrogate to optimize instructions for black-box LLMs. More importantly, the neural bandit algorithm allows us to naturally couple the NN surrogate with the hidden representation learned by a pre-trained transformer (i.e., an open-source LLM), which significantly boosts its performance. These motivate us to propose our INSTruction optimization usIng Neural bandits Coupled with Transformers (INSTINCT) algorithm. We perform instruction optimization for ChatGPT and use extensive experiments to show that INSTINCT consistently outperforms baselines in different tasks, e.g., various instruction induction tasks and the task of improving zero-shot chain-of-thought instructions. Our code is available at https://github.com/xqlin98/INSTINCT.
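The loop the abstract describes can be sketched minimally as below. This is an illustration of a NeuralUCB-style neural bandit for instruction optimization, not the authors' released code: `embed` is a hypothetical stand-in for the hidden representation of a frozen open-source LLM, and `evaluate_instruction` stands in for the black-box score of the target LLM.

```python
import torch
import torch.nn as nn

def embed(instruction: str) -> torch.Tensor:
    """Hypothetical stand-in for a frozen open-source LLM's hidden representation."""
    torch.manual_seed(hash(instruction) % (2**31))  # deterministic placeholder embedding
    return torch.randn(768)

def evaluate_instruction(instruction: str) -> float:
    """Hypothetical stand-in for the black-box validation score of the target LLM."""
    return torch.rand(1).item()

surrogate = nn.Sequential(nn.Linear(768, 128), nn.ReLU(), nn.Linear(128, 1))
opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)

candidates = ["Think step by step.", "Answer concisely.", "Explain your reasoning."]
embs = {c: embed(c) for c in candidates}
history: list[tuple[str, float]] = []

for t in range(10):
    if history:  # fit the NN surrogate on all (embedding, score) pairs observed so far
        x = torch.stack([embs[c] for c, _ in history])
        y = torch.tensor([[s] for _, s in history])
        for _ in range(50):
            loss = nn.functional.mse_loss(surrogate(x), y)
            opt.zero_grad(); loss.backward(); opt.step()
    # UCB acquisition: predicted score plus an exploration bonus from the
    # parameter-gradient norm at the candidate (NeuralUCB-style).
    scored = []
    for c in candidates:
        pred = surrogate(embs[c]).squeeze()
        grads = torch.autograd.grad(pred, list(surrogate.parameters()))
        bonus = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scored.append((pred.item() + 0.1 * bonus.item(), c))
    _, choice = max(scored)
    history.append((choice, evaluate_instruction(choice)))
```

In practice the candidate set would be a continuous soft-prompt space rather than a fixed list, but the surrogate-fit / acquire / evaluate cycle is the same.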
Related papers
- Enhancing Zeroth-order Fine-tuning for Language Models with Low-rank Structures [21.18741772731095]
Zeroth-order (ZO) algorithms offer a promising alternative by approximating gradients using finite differences of function values.
Existing ZO methods struggle to capture the low-rank gradient structure common in LLM fine-tuning, leading to suboptimal performance.
This paper proposes a low-rank ZO algorithm (LOZO) that effectively captures this structure in LLMs.
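The low-rank idea can be illustrated roughly as follows (the rank-1 choice and all names are ours, not the paper's): the ZO perturbation of a weight matrix is restricted to an outer product, so the two-point gradient estimate inherits that structure.

```python
# Sketch of a low-rank two-point zeroth-order gradient estimate for one weight
# matrix (illustrative; not the LOZO authors' exact algorithm).
import numpy as np

def loss(W: np.ndarray) -> float:  # stand-in black-box objective
    return float(np.sum(W ** 2))

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 32))
mu, lr = 1e-3, 1e-2

for step in range(100):
    # Rank-1 perturbation: outer product of two Gaussian vectors.
    u = rng.normal(size=(64, 1))
    v = rng.normal(size=(1, 32))
    Z = u @ v
    # Two-point finite difference along the low-rank direction.
    g_scalar = (loss(W + mu * Z) - loss(W - mu * Z)) / (2 * mu)
    grad_est = g_scalar * Z  # the estimate inherits the rank-1 structure
    W -= lr * grad_est
```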
arXiv Detail & Related papers (2024-10-10T08:10:53Z)
- Algorithmic Language Models with Neurally Compiled Libraries [16.284360949127723]
Large Language Models lack true algorithmic ability.
Our paper proposes augmenting LLMs with a library of fundamental operations and sophisticated differentiable programs.
We explore the feasibility of augmenting LLaMA3 with a differentiable computer.
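The underlying trick of making the choice among library operations differentiable can be shown with a toy softmax mixture over primitives; the paper's neurally compiled differentiable computer is far more elaborate than this sketch.

```python
# Toy sketch of a differentiable library of primitive operations: the network
# picks among ops with softmax weights, so the choice itself is trainable.
import torch
import torch.nn as nn

OPS = [torch.add, torch.sub, torch.mul]  # tiny library of primitives

class SoftOp(nn.Module):
    def __init__(self):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(len(OPS)))

    def forward(self, a, b):
        w = torch.softmax(self.logits, dim=0)
        return sum(wi * op(a, b) for wi, op in zip(w, OPS))

# Train the selector to behave like multiplication.
model = SoftOp()
opt = torch.optim.Adam(model.parameters(), lr=0.1)
for _ in range(200):
    a, b = torch.randn(8), torch.randn(8)
    loss = ((model(a, b) - a * b) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
print(torch.softmax(model.logits, 0))  # the weight on mul should dominate
```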
arXiv Detail & Related papers (2024-07-06T00:27:05Z)
- Large Language Models As Evolution Strategies [6.873777465945062]
In this work, we investigate whether large language models (LLMs) are in principle capable of implementing evolutionary optimization algorithms.
We introduce a novel prompting strategy, consisting of least-to-most sorting of discretized population members.
We find that our setup allows the user to obtain an LLM-based evolution strategy, which we call 'EvoLLM', that robustly outperforms baseline algorithms.
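A hedged sketch of that prompting loop on a toy continuous objective: `query_llm` is a hypothetical stand-in for the actual LLM call, and the prompt format is illustrative rather than the paper's.

```python
# Sketch of an LLM-driven evolution strategy in the spirit of EvoLLM: the prompt
# lists discretized population members sorted least-to-most by fitness, and the
# model is asked to propose an improved candidate.
import random

def query_llm(prompt: str) -> str:  # hypothetical stand-in for a real LLM call
    return " ".join(f"{random.uniform(-1, 1):.2f}" for _ in range(2))

def fitness(x):  # toy objective: maximize -||x||^2
    return -sum(v * v for v in x)

population = [[random.uniform(-3, 3) for _ in range(2)] for _ in range(8)]
for gen in range(10):
    ranked = sorted(population, key=fitness)  # least-to-most fit ordering
    lines = [f"{[round(v, 2) for v in x]} -> {fitness(x):.3f}" for x in ranked]
    prompt = ("Candidates, worst to best:\n" + "\n".join(lines)
              + "\nPropose a better 2-number candidate:")
    child = [float(t) for t in query_llm(prompt).split()[:2]]
    worst = min(range(len(population)), key=lambda i: fitness(population[i]))
    population[worst] = child  # replace the weakest member
```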
arXiv Detail & Related papers (2024-02-28T15:02:17Z)
- Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark [166.40879020706151]
This paper proposes a shift towards BP-free, zeroth-order (ZO) optimization as a solution for reducing memory costs during fine-tuning.
Unlike traditional ZO-SGD methods, our work expands the exploration to a wider array of ZO optimization techniques.
Our study unveils previously overlooked optimization principles, highlighting the importance of task alignment, the role of the forward gradient method, and the balance between algorithm complexity and fine-tuning performance.
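The two-point estimator at the heart of these BP-free methods fits in a few lines; this sketch uses a toy objective in place of an LLM forward pass.

```python
# Sketch of the classic two-point (SPSA-style) zeroth-order estimator that
# BP-free fine-tuning methods build on: two forward passes per step, no backprop.
import numpy as np

def loss(theta: np.ndarray) -> float:  # stand-in for a forward pass
    return float(np.sum((theta - 1.0) ** 2))

rng = np.random.default_rng(1)
theta = np.zeros(10)
mu, lr = 1e-3, 0.1
for step in range(200):
    z = rng.normal(size=theta.shape)  # shared random probe direction
    g = (loss(theta + mu * z) - loss(theta - mu * z)) / (2 * mu)
    theta -= lr * g * z  # descend along the probe direction
```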
arXiv Detail & Related papers (2024-02-18T14:08:48Z)
- VeLO: Training Versatile Learned Optimizers by Scaling Up [67.90237498659397]
We leverage the same scaling approach behind the success of deep learning to learn versatile learned optimizers.
We train an optimizer for deep learning which is itself a small neural network that ingests gradients and outputs parameter updates.
We open source our learned optimizer, meta-training code, the associated train and test data, and an extensive benchmark suite with baselines at velo-code.io.
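The shape of such a learned optimizer can be sketched as below: a small MLP ingests per-parameter features and emits updates. The feature set and sizes are illustrative, not VeLO's actual architecture, and before meta-training the update rule is of course random.

```python
# Sketch of a learned optimizer in the VeLO mold: an MLP maps per-parameter
# features (here just gradient and momentum) to parameter updates.
import torch
import torch.nn as nn

update_net = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))

def learned_step(param, grad, momentum, beta=0.9):
    momentum = beta * momentum + (1 - beta) * grad
    feats = torch.stack([grad, momentum], dim=-1)  # per-parameter feature vector
    update = update_net(feats).squeeze(-1)
    return param - 0.01 * update, momentum

# One step on a toy quadratic; meta-training would fit update_net across tasks.
p = torch.randn(5)
m = torch.zeros(5)
g = 2 * (p - 3.0)  # gradient of (p - 3)^2
p, m = learned_step(p, g, m)
```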
arXiv Detail & Related papers (2022-11-17T18:39:07Z)
- Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network (NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, defined by minimizing the population loss, that are better suited to active learning than the metric used in state-of-the-art (SOTA) related work.
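For context, stream-based neural active learning typically follows the generic pattern below: query a label only when the network is uncertain. This illustrates the setting, not the authors' algorithm or their regret analysis.

```python
# Generic sketch of stream-based active learning with an NN: spend the labeling
# budget only on points where the predictive margin is small.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.SGD(net.parameters(), lr=0.05)

def oracle_label(x):  # stand-in labeling oracle
    return (x.sum() > 0).long()

budget, margin_threshold = 20, 0.2
for t in range(500):
    x = torch.randn(2)  # next point in the stream
    probs = torch.softmax(net(x), dim=-1)
    margin = (probs.max() - probs.min()).item()
    if budget > 0 and margin < margin_threshold:  # uncertain: spend a query
        y = oracle_label(x)
        loss = nn.functional.cross_entropy(net(x).unsqueeze(0), y.unsqueeze(0))
        opt.zero_grad(); loss.backward(); opt.step()
        budget -= 1
```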
arXiv Detail & Related papers (2022-10-02T05:03:38Z)
- Can we learn gradients by Hamiltonian Neural Networks? [68.8204255655161]
We propose a meta-learner based on ODE neural networks that learns gradients.
We demonstrate that our method outperforms a meta-learner based on LSTM for an artificial task and the MNIST dataset with ReLU activations in the optimizee.
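The idea of parameterizing optimization as an ODE can be sketched minimally (Euler integration of a learned vector field); the paper's Hamiltonian construction is more structured than this toy version.

```python
# Sketch of a learned optimizer as an ODE: a small network defines the vector
# field d(theta)/dt, and optimization integrates it with Euler steps. Before
# meta-training fits the field, the trajectories are of course random.
import torch
import torch.nn as nn

vector_field = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 2))

def ode_optimize(theta, n_steps=50, dt=0.1):
    for _ in range(n_steps):
        theta = theta + dt * vector_field(theta)  # Euler step on the learned ODE
    return theta

theta_final = ode_optimize(torch.randn(2))
```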
arXiv Detail & Related papers (2021-10-31T18:35:10Z)
- Tasks, stability, architecture, and compute: Training more effective learned optimizers, and using them to train themselves [53.37905268850274]
We introduce a new, neural network parameterized, hierarchical optimizer with access to additional features such as validation loss to enable automatic regularization.
Most learned optimizers have been trained on only a single task, or a small number of tasks.
We train ours on thousands of tasks, making use of orders of magnitude more compute, resulting in learned optimizers that generalize better to unseen tasks.
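Such multi-task meta-training follows the standard unrolled pattern sketched below, with the task distribution, unroll length, and architecture all illustrative.

```python
# Sketch of meta-training a learned optimizer: sample a task, unroll the learned
# update rule, and meta-train the rule on the final task loss.
import torch
import torch.nn as nn

lopt = nn.Sequential(nn.Linear(1, 16), nn.Tanh(), nn.Linear(16, 1))  # grad -> update
meta_opt = torch.optim.Adam(lopt.parameters(), lr=1e-3)

for task in range(100):
    target = torch.randn(5)  # each "task": minimize ||theta - target||^2
    theta = torch.zeros(5, requires_grad=True)
    for _ in range(10):  # unrolled inner optimization
        loss = ((theta - target) ** 2).sum()
        grad, = torch.autograd.grad(loss, theta, create_graph=True)
        theta = theta - lopt(grad.unsqueeze(-1)).squeeze(-1)
    meta_loss = ((theta - target) ** 2).sum()  # meta-objective: final task loss
    meta_opt.zero_grad(); meta_loss.backward(); meta_opt.step()
```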
arXiv Detail & Related papers (2020-09-23T16:35:09Z)