Multipath agents for modular multitask ML systems
- URL: http://arxiv.org/abs/2302.02721v1
- Date: Mon, 6 Feb 2023 11:57:45 GMT
- Title: Multipath agents for modular multitask ML systems
- Authors: Andrea Gesmundo
- Abstract summary: The presented work introduces a novel methodology that allows multiple methods to be defined as distinct agents.
Agents can collaborate and compete to generate and improve ML models for a given task.
- Score: 2.579908688646812
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A standard ML model is commonly generated by a single method that specifies
aspects such as architecture, initialization, training data and hyperparameter
configuration. The presented work introduces a novel methodology that allows
multiple methods to be defined as distinct agents. Agents can collaborate and compete
to generate and improve ML models for a given task. The proposed methodology
is demonstrated with the generation and extension of a dynamic modular
multitask ML system solving more than one hundred image classification tasks.
Diverse agents can compete to produce the best performing model for a task by
reusing the modules introduced to the system by competing agents. The presented
work focuses on the study of agents capable of: 1) reusing the modules
generated by concurrent agents, 2) activating multiple modules in parallel in a
frozen state by connecting them with trainable modules, 3) conditioning the
activation mixture on each data sample by using a trainable router module. We
demonstrate that this simple per-sample parallel routing method can boost the
quality of the combined solutions while training only a fraction of the activated
parameters.
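The per-sample parallel routing described above can be pictured with a minimal PyTorch sketch. This is an illustrative assumption of the general scheme, not the paper's actual code: the class, shapes and variable names (PerSampleRouter, feature_dim, the 784/128-dimensional toy modules) are hypothetical. Several reused modules are kept frozen and activated in parallel, and their outputs are mixed with weights produced by a small trainable router conditioned on each input sample.

```python
import torch
import torch.nn as nn


class PerSampleRouter(nn.Module):
    """Mixes the outputs of several frozen modules with per-sample weights
    produced by a small trainable router (softmax over modules)."""

    def __init__(self, frozen_modules, feature_dim):
        super().__init__()
        self.frozen = nn.ModuleList(frozen_modules)
        for m in self.frozen:                 # reuse modules in a frozen state
            for p in m.parameters():
                p.requires_grad = False
        self.router = nn.Linear(feature_dim, len(frozen_modules))  # trainable router

    def forward(self, x):
        # Per-sample mixture weights conditioned on the flattened input.
        weights = torch.softmax(self.router(x.flatten(1)), dim=-1)  # (B, M)
        outputs = torch.stack([m(x) for m in self.frozen], dim=1)   # (B, M, D)
        return (weights.unsqueeze(-1) * outputs).sum(dim=1)         # (B, D)


# Usage sketch: two frozen feature extractors combined for a new task; only
# the router and the task head receive gradients.
frozen = [nn.Sequential(nn.Flatten(), nn.Linear(784, 128)) for _ in range(2)]
mix = PerSampleRouter(frozen, feature_dim=784)
head = nn.Linear(128, 10)                        # trainable task head
logits = head(mix(torch.randn(4, 1, 28, 28)))    # -> shape (4, 10)
```

In this sketch only the router and the task head are trainable, so just a fraction of the activated parameters is updated, mirroring the property highlighted in the abstract.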
Related papers
- Efficient and Effective Weight-Ensembling Mixture of Experts for Multi-Task Model Merging [111.8456671452411]
Multi-task learning (MTL) leverages a shared model to accomplish multiple tasks and facilitate knowledge transfer.
We propose a Weight-Ensembling Mixture of Experts (WEMoE) method for multi-task model merging.
We show that WEMoE and E-WEMoE outperform state-of-the-art (SOTA) model merging methods in terms of MTL performance, generalization, and robustness.
arXiv Detail & Related papers (2024-10-29T07:16:31Z) - Towards Modular LLMs by Building and Reusing a Library of LoRAs [64.43376695346538]
We study how to best build a library of adapters given multi-task data.
We introduce model-based clustering, MBC, a method that groups tasks based on the similarity of their adapter parameters.
To re-use the library, we present a novel zero-shot routing mechanism, Arrow, which enables dynamic selection of the most relevant adapters.
arXiv Detail & Related papers (2024-05-18T03:02:23Z) - Merging Multi-Task Models via Weight-Ensembling Mixture of Experts [64.94129594112557]
Merging Transformer-based models trained on different tasks yields a single unified model that can execute all the tasks concurrently.
Previous methods, exemplified by task arithmetic, have been proven to be both effective and scalable.
We propose to merge most of the parameters while upscaling the Transformer layers to a weight-ensembling mixture of experts (MoE) module.
arXiv Detail & Related papers (2024-02-01T08:58:57Z) - Contrastive Modules with Temporal Attention for Multi-Task Reinforcement Learning [29.14234496784581]
We propose Contrastive Modules with Temporal Attention(CMTA) method for multi-task reinforcement learning.
CMTA constrains the modules to be different from each other via contrastive learning and combines shared modules at a finer granularity than the task level.
Experimental results show that, for the first time, CMTA outperforms learning each task individually and achieves substantial performance improvements.
arXiv Detail & Related papers (2023-11-02T08:41:00Z) - Towards Robust Multi-Modal Reasoning via Model Selection [7.6621866737827045]
The LLM serves as the "brain" of the agent, orchestrating multiple tools for collaborative multi-step task solving.
We propose the M3 framework as a plug-in with negligible runtime overhead at test-time.
Our experiments reveal that our framework enables dynamic model selection, considering both user inputs and subtask dependencies.
arXiv Detail & Related papers (2023-10-12T16:06:18Z) - BYOM: Building Your Own Multi-Task Model For Free [69.63765907216442]
BYOM-FFT is for merging fully finetuned models, while BYOM-LoRA is for LoRA-finetuned models.
Experiments on computer vision and natural language processing tasks show that the proposed BYOM methods outperform existing merging methods by a large margin.
arXiv Detail & Related papers (2023-10-03T08:39:33Z) - OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models [72.8156832931841]
Generalist models are capable of performing diverse multi-modal tasks in a task-agnostic way within a single model.
We release a generalist model learning system, OFASys, built on top of a declarative task interface named multi-modal instruction.
arXiv Detail & Related papers (2022-12-08T17:07:09Z) - Compositional Models: Multi-Task Learning and Knowledge Transfer with Modular Networks [13.308477955656592]
We propose a new approach for learning modular networks based on the isometric version of ResNet.
In our method, the modules can be invoked repeatedly and allow knowledge transfer to novel tasks.
We show that our method leads to interpretable self-organization of modules in the case of multi-task learning, transfer learning and domain adaptation.
arXiv Detail & Related papers (2021-07-23T00:05:55Z) - MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning [61.28547338576706]
Population-based multi-agent reinforcement learning (PB-MARL) refers to a family of methods that nest reinforcement learning (RL) algorithms within population-based training.
We present MALib, a scalable and efficient computing framework for PB-MARL.
arXiv Detail & Related papers (2021-06-05T03:27:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.