TEGEE: Task dEfinition Guided Expert Ensembling for Generalizable and Few-shot Learning
- URL: http://arxiv.org/abs/2403.04233v3
- Date: Sat, 14 Dec 2024 14:39:57 GMT
- Title: TEGEE: Task dEfinition Guided Expert Ensembling for Generalizable and Few-shot Learning
- Authors: Xingwei Qu, Yiming Liang, Yucheng Wang, Tianyu Zheng, Tommy Yue, Xingyuan Bu, Lei Ma, Stephen W. Huang, Jiajun Zhang, Yinan Shi, Chenghua Lin, Jie Fu, Ge Zhang,
- Abstract summary: We propose textbfTEGEE (Task Definition Guided Expert Ensembling), a method that explicitly extracts task definitions.
Our framework employs a dual 3B model approach, with each model assigned a distinct role.
Empirical evaluations show that TEGEE performs comparably to the larger LLaMA2-13B model.
- Score: 37.09785060896196
- License:
- Abstract: Large Language Models (LLMs) exhibit the ability to perform in-context learning (ICL), where they acquire new tasks directly from examples provided in demonstrations. This process is thought to operate through an implicit task selection mechanism that involves extracting and processing task definitions from these demonstrations. However, critical questions remain: Which is more essential -- task extraction or definition? And how can these capabilities be further improved? To address these questions, we propose \textbf{TEGEE} (Task Definition Guided Expert Ensembling), a method that explicitly extracts task definitions and generates responses based on specific tasks. Our framework employs a dual 3B model approach, with each model assigned a distinct role: one focuses on task definition extraction, while the other handles learning from demonstrations. This modular approach supports the hypothesis that extracting task definitions is more vital than processing the task itself. Empirical evaluations show that TEGEE performs comparably to the larger LLaMA2-13B model. By leveraging a modular design, our approach extends traditional ICL from few-shot to many-shot learning, supporting an unlimited number of demonstrations and enhancing continual learning capabilities.
Related papers
- Learning Task Representations from In-Context Learning [73.72066284711462]
Large language models (LLMs) have demonstrated remarkable proficiency in in-context learning.
We introduce an automated formulation for encoding task information in ICL prompts as a function of attention heads.
We show that our method's effectiveness stems from aligning the distribution of the last hidden state with that of an optimally performing in-context-learned model.
arXiv Detail & Related papers (2025-02-08T00:16:44Z) - InsTALL: Context-aware Instructional Task Assistance with Multi-modal Large Language Models [11.913271486031201]
We develop a Context-aware instructional task assistant with multi-modal large language models (InsTALL)
InsTALL responds in real-time to user queries related to the task at hand.
We show InsTALL achieves state-of-the-art performance across proposed sub-tasks considered for multimodal activity understanding.
arXiv Detail & Related papers (2025-01-21T15:55:06Z) - UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model [11.885204227946549]
We propose a comprehensive model designed to represent various tasks using a unified representation.
Our model exhibits strong capabilities in comprehending the implicit intent of user instructions.
Our approach exhibits exceptional scalability and generality.
arXiv Detail & Related papers (2024-08-05T14:27:39Z) - Exploring the Transferability of Visual Prompting for Multimodal Large Language Models [47.162575147632396]
Transferable Visual Prompting (TVP) is a simple and effective approach to generate visual prompts that can transfer to different models and improve their performance on downstream tasks after trained on only one model.
We introduce two strategies to address the issue of cross-model feature corruption of existing visual prompting methods and enhance the transferability of the learned prompts.
arXiv Detail & Related papers (2024-04-17T09:39:07Z) - Towards Unified Task Embeddings Across Multiple Models: Bridging the Gap for Prompt-Based Large Language Models and Beyond [16.913115978881866]
We propose a framework for unified task embeddings (FUTE), task embeddings from various models, including smaller language models and Large Language Models with varied prompts, within a single vector space.
Such uniformity enables comparison and analysis of similarities amongst different models, broadening the scope and utility of existing task embedding methods in multi-model scenarios.
arXiv Detail & Related papers (2024-02-22T13:13:31Z) - Small LLMs Are Weak Tool Learners: A Multi-LLM Agent [73.54562551341454]
Large Language Model (LLM) agents significantly extend the capabilities of standalone LLMs.
We propose a novel approach that decomposes the aforementioned capabilities into a planner, caller, and summarizer.
This modular framework facilitates individual updates and the potential use of smaller LLMs for building each capability.
arXiv Detail & Related papers (2024-01-14T16:17:07Z) - Multitask Multimodal Prompted Training for Interactive Embodied Task
Completion [48.69347134411864]
Embodied MultiModal Agent (EMMA) is a unified encoder-decoder model that reasons over images and trajectories.
By unifying all tasks as text generation, EMMA learns a language of actions which facilitates transfer across tasks.
arXiv Detail & Related papers (2023-11-07T15:27:52Z) - Active Instruction Tuning: Improving Cross-Task Generalization by
Training on Prompt Sensitive Tasks [101.40633115037983]
Instruction tuning (IT) achieves impressive zero-shot generalization results by training large language models (LLMs) on a massive amount of diverse tasks with instructions.
How to select new tasks to improve the performance and generalizability of IT models remains an open question.
We propose active instruction tuning based on prompt uncertainty, a novel framework to identify informative tasks, and then actively tune the models on the selected tasks.
arXiv Detail & Related papers (2023-11-01T04:40:05Z) - Did You Read the Instructions? Rethinking the Effectiveness of Task
Definitions in Instruction Learning [74.70157466822612]
We systematically study the role of task definitions in instruction learning.
We find that model performance drops substantially when removing contents describing the task output.
We propose two strategies to help models better leverage task instructions.
arXiv Detail & Related papers (2023-06-01T21:11:24Z) - CINS: Comprehensive Instruction for Few-shot Learning in Task-oriented
Dialog Systems [56.302581679816775]
This paper proposes Comprehensive Instruction (CINS) that exploits PLMs with task-specific instructions.
We design a schema (definition, constraint, prompt) of instructions and their customized realizations for three important downstream tasks in ToD.
Experiments are conducted on these ToD tasks in realistic few-shot learning scenarios with small validation data.
arXiv Detail & Related papers (2021-09-10T03:23:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.