On the Transformations across Reward Model, Parameter Update, and In-Context Prompt
- URL: http://arxiv.org/abs/2406.16377v1
- Date: Mon, 24 Jun 2024 07:42:32 GMT
- Title: On the Transformations across Reward Model, Parameter Update, and In-Context Prompt
- Authors: Deng Cai, Huayang Li, Tingchen Fu, Siheng Li, Weiwen Xu, Shuaiyi Li, Bowen Cao, Zhisong Zhang, Xinting Huang, Leyang Cui, Yan Wang, Lemao Liu, Taro Watanabe, Shuming Shi,
- Abstract summary: In this paper, we demonstrate the interchangeability of three popular and distinct adaptation tools: parameter updating, reward modeling, and in-context prompting.
Our work offers a holistic view that unifies numerous existing studies and suggests potential research directions.
- Score: 83.48364984314127
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the general capabilities of pre-trained large language models (LLMs), they still need further adaptation to better serve practical applications. In this paper, we demonstrate the interchangeability of three popular and distinct adaptation tools: parameter updating, reward modeling, and in-context prompting. This interchangeability establishes a triangular framework with six transformation directions, each of which facilitates a variety of applications. Our work offers a holistic view that unifies numerous existing studies and suggests potential research directions. We envision our work as a useful roadmap for future research on LLMs.
Related papers
- Analyzing Fine-tuning Representation Shift for Multimodal LLMs Steering alignment [53.90425382758605]
We show how fine-tuning alters the internal structure of a model to specialize in new multimodal tasks.
Our work sheds light on how multimodal representations evolve through fine-tuning and offers a new perspective for interpreting model adaptation in multimodal tasks.
arXiv Detail & Related papers (2025-01-06T13:37:13Z) - Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities.
In-Context Learning (ICL) and.
Efficient Fine-Tuning (PEFT) are currently two mainstream methods for augmenting.
LLMs to downstream tasks.
We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z) - A Practitioner's Guide to Continual Multimodal Pretraining [83.63894495064855]
Multimodal foundation models serve numerous applications at the intersection of vision and language.
To keep models updated, research into continual pretraining mainly explores scenarios with either infrequent, indiscriminate updates on large-scale new data, or frequent, sample-level updates.
We introduce FoMo-in-Flux, a continual multimodal pretraining benchmark with realistic compute constraints and practical deployment requirements.
arXiv Detail & Related papers (2024-08-26T17:59:01Z) - A Review of Prominent Paradigms for LLM-Based Agents: Tool Use (Including RAG), Planning, and Feedback Learning [0.6247103460512108]
Tool use, planning, and feedback learning are currently three prominent paradigms for developing Large Language Model (LLM)-based agents.
This survey introduces a unified taxonomy to systematically review and discuss these frameworks.
arXiv Detail & Related papers (2024-06-09T14:42:55Z) - Exploring the Transferability of Visual Prompting for Multimodal Large Language Models [47.162575147632396]
Transferable Visual Prompting (TVP) is a simple and effective approach to generate visual prompts that can transfer to different models and improve their performance on downstream tasks after trained on only one model.
We introduce two strategies to address the issue of cross-model feature corruption of existing visual prompting methods and enhance the transferability of the learned prompts.
arXiv Detail & Related papers (2024-04-17T09:39:07Z) - Subequivariant Graph Reinforcement Learning in 3D Environments [34.875774768800966]
We propose a novel setup for morphology-agnostic RL, dubbed Subequivariant Graph RL in 3D environments.
Specifically, we first introduce a new set of more practical yet challenging benchmarks in 3D space.
To optimize the policy over the enlarged state-action space, we propose to inject geometric symmetry.
arXiv Detail & Related papers (2023-05-30T11:34:57Z) - Improving Factuality and Reasoning in Language Models through Multiagent
Debate [95.10641301155232]
We present a complementary approach to improve language responses where multiple language model instances propose and debate their individual responses and reasoning processes over multiple rounds to arrive at a common final answer.
Our findings indicate that this approach significantly enhances mathematical and strategic reasoning across a number of tasks.
Our approach may be directly applied to existing black-box models and uses identical procedure and prompts for all tasks we investigate.
arXiv Detail & Related papers (2023-05-23T17:55:11Z) - Instruction-ViT: Multi-Modal Prompts for Instruction Learning in ViT [58.70209492842953]
In this paper, we focus on adapting prompt design based on instruction tuning into a visual transformer model for image classification.
The key idea is to implement multi-modal prompts related to category information to guide the fine-tuning of the model.
Based on the experiments of several image captionining tasks, the performance and domain adaptability were improved.
arXiv Detail & Related papers (2023-04-29T08:59:12Z) - Updater-Extractor Architecture for Inductive World State Representations [0.0]
We propose a transformer-based Updater-Extractor architecture and a training procedure that can work with sequences of arbitrary length.
We explicitly train the model to incorporate incoming information into its world state representation.
Empirically, we investigate the model performance on three different tasks, demonstrating its promise.
arXiv Detail & Related papers (2021-04-12T14:30:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.