MING-MOE: Enhancing Medical Multi-Task Learning in Large Language Models with Sparse Mixture of Low-Rank Adapter Experts
- URL: http://arxiv.org/abs/2404.09027v1
- Date: Sat, 13 Apr 2024 15:28:52 GMT
- Title: MING-MOE: Enhancing Medical Multi-Task Learning in Large Language Models with Sparse Mixture of Low-Rank Adapter Experts
- Authors: Yusheng Liao, Shuyang Jiang, Yu Wang, Yanfeng Wang
- Abstract summary: This paper introduces MING-MOE, a novel Mixture-of-Experts (MoE)-based medical large language model.
It is designed to manage diverse and complex medical tasks without requiring task-specific annotations.
It achieves state-of-the-art (SOTA) performance on over 20 medical tasks, illustrating a significant improvement over existing models.
- Score: 22.596827147978598
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models like ChatGPT have shown substantial progress in natural language understanding and generation, proving valuable across various disciplines, including the medical field. Despite advancements, challenges persist due to the complexity and diversity inherent in medical tasks, which often require multi-task learning capabilities. Previous approaches, although beneficial, fall short in real-world applications because they necessitate task-specific annotations at inference time, limiting broader generalization. This paper introduces MING-MOE, a novel Mixture-of-Experts (MoE)-based medical large language model designed to manage diverse and complex medical tasks without requiring task-specific annotations, thus enhancing its usability across extensive datasets. MING-MOE employs a Mixture of Low-Rank Adaptation (MoLoRA) technique, allowing for efficient parameter usage by keeping the base model parameters static while adapting through a minimal set of trainable parameters. We demonstrate that MING-MOE achieves state-of-the-art (SOTA) performance on over 20 medical tasks, illustrating a significant improvement over existing models. This approach not only extends the capabilities of medical language models but also improves inference efficiency.
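The abstract does not include an implementation, but the core MoLoRA idea (a frozen base weight plus several low-rank LoRA experts mixed by a small token-level router) can be sketched roughly as below. This is a minimal illustration under PyTorch, not the authors' code: the module name `MoLoRALinear` and the defaults for `num_experts`, `rank`, `top_k`, and `alpha` are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MoLoRALinear(nn.Module):
    """Minimal sketch of a mixture-of-LoRA-experts linear layer.

    The pretrained weight stays frozen; each expert is a pair of
    low-rank matrices (A, B), and a token-level router sparsely mixes
    the top-k expert outputs. Names and defaults are illustrative only.
    """

    def __init__(self, base: nn.Linear, num_experts: int = 8,
                 rank: int = 16, top_k: int = 2, alpha: float = 32.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                      # keep base model parameters static
        in_f, out_f = base.in_features, base.out_features
        self.lora_A = nn.Parameter(torch.randn(num_experts, in_f, rank) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(num_experts, rank, out_f))
        self.router = nn.Linear(in_f, num_experts, bias=False)
        self.top_k, self.scale = top_k, alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.base(x)                                 # frozen base path
        gate = F.softmax(self.router(x), dim=-1)         # (..., num_experts)
        topv, topi = gate.topk(self.top_k, dim=-1)       # sparse token-level routing
        topv = topv / topv.sum(dim=-1, keepdim=True)
        # expert_out[..., e, :] = x @ A_e @ B_e for every expert
        expert_out = torch.einsum('...i,eir,ero->...eo', x, self.lora_A, self.lora_B)
        idx = topi.unsqueeze(-1).expand(*topi.shape, expert_out.size(-1))
        mixed = (expert_out.gather(-2, idx) * topv.unsqueeze(-1)).sum(dim=-2)
        return y + self.scale * mixed
```

In practice a wrapper like this would replace selected linear layers of the frozen base model (e.g., attention or MLP projections), so only the router and the low-rank expert matrices are trained; how MING-MOE places and sizes its experts is described in the paper itself, not here.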
Related papers
- Mixture-of-Instructions: Comprehensive Alignment of a Large Language Model through the Mixture of Diverse System Prompting Instructions [7.103987978402038]
We introduce a novel technique termed Mixture-of-Instructions (MoI).
MoI employs a strategy of instruction concatenation combined with diverse system prompts to boost the alignment efficiency of language models.
Our methodology was applied to the open-source Qwen-7B-chat model, culminating in the development of Qwen-SFT-MoI.
arXiv Detail & Related papers (2024-04-29T03:58:12Z) - Med-MoE: Mixture of Domain-Specific Experts for Lightweight Medical Vision-Language Models [17.643421997037514]
We propose a novel framework that tackles both discriminative and generative multimodal medical tasks.
The learning of Med-MoE consists of three steps: multimodal medical alignment, instruction tuning and routing, and domain-specific MoE tuning.
Our model can achieve performance superior to or on par with state-of-the-art baselines.
arXiv Detail & Related papers (2024-04-16T02:35:17Z) - Intuition-aware Mixture-of-Rank-1-Experts for Parameter Efficient Finetuning [50.73666458313015]
Large Language Models (LLMs) have demonstrated significant potential in performing multiple tasks in multimedia applications.
MoE has emerged as a promising solution, with its sparse architecture enabling effective task decoupling.
Intuition-MoR1E achieves superior efficiency and a 2.15% overall accuracy improvement across 14 public datasets.
arXiv Detail & Related papers (2024-04-13T12:14:58Z) - Scalable Language Model with Generalized Continual Learning [58.700439919096155]
The Joint Adaptive Re-Parameterization (JARe) is integrated with Dynamic Task-related Knowledge Retrieval (DTKR) to enable adaptive adjustment of language models based on specific downstream tasks.
Our method demonstrates state-of-the-art performance on diverse backbones and benchmarks, achieving effective continual learning in both full-set and few-shot scenarios with minimal forgetting.
arXiv Detail & Related papers (2024-04-11T04:22:15Z) - MedXChat: A Unified Multimodal Large Language Model Framework towards CXRs Understanding and Generation [28.497591315598402]
Multimodal Large Language Models (MLLMs) have shown success in various general image processing tasks.
This study investigates the potential of MLLMs in improving the understanding and generation of Chest X-Rays (CXRs).
arXiv Detail & Related papers (2023-12-04T06:40:12Z) - When MOE Meets LLMs: Parameter Efficient Fine-tuning for Multi-task Medical Applications [57.342772288710044]
We propose a novel parameter efficient fine-tuning framework for multi-task medical applications, dubbed as MOELoRA.
To unify MoE and LoRA, we devise multiple experts as the trainable parameters, where each expert consists of a pair of low-rank matrices so that the overall number of trainable parameters stays small.
We conduct experiments on a multi-task medical dataset, indicating MOELoRA outperforms the existing parameter efficient fine-tuning methods.
arXiv Detail & Related papers (2023-10-21T17:18:09Z) - Task-Based MoE for Multitask Multilingual Machine Translation [58.20896429151824]
The Mixture-of-Experts (MoE) architecture has proven to be a powerful method for training deep models on diverse tasks across many applications.
In this work, we design a novel method that incorporates task information into MoE models at different granular levels with shared dynamic task-based adapters.
arXiv Detail & Related papers (2023-08-30T05:41:29Z) - Towards Medical Artificial General Intelligence via Knowledge-Enhanced Multimodal Pretraining [121.89793208683625]
Medical artificial general intelligence (MAGI) enables one foundation model to solve different medical tasks.
We propose a new paradigm called Medical-knOwledge-enhanced mulTimOdal pretRaining (MOTOR).
arXiv Detail & Related papers (2023-04-26T01:26:19Z) - Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multilingual Models [63.92643612630657]
This paper attempts to peek into the black-box of multilingual optimization through the lens of loss function geometry.
We find that gradient similarity measured along the optimization trajectory is an important signal, which correlates well with language proximity.
We derive a simple and scalable optimization procedure, named Gradient Vaccine, which encourages more geometrically aligned parameter updates for close tasks.
arXiv Detail & Related papers (2020-10-12T17:26:34Z)
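The Gradient Vaccine entry above describes its mechanism only at a high level: track gradient cosine similarity between tasks along the optimization trajectory and encourage more geometrically aligned updates for close tasks. The sketch below illustrates that general idea; the function name and the EMA hyperparameter are made up for illustration, and the mixing coefficient is simply solved from the condition that the adjusted gradient reach a target cosine similarity, which may differ in detail from the paper's exact procedure.

```python
import torch


def align_task_gradients(g_i: torch.Tensor, g_j: torch.Tensor,
                         phi_target: float, beta: float = 0.01):
    """Nudge task i's (flattened) gradient toward better alignment with
    task j's gradient when their cosine similarity drops below a running
    EMA target. Illustrative sketch only; names and defaults are assumptions."""
    phi = torch.dot(g_i, g_j) / (g_i.norm() * g_j.norm() + 1e-12)
    phi_target = (1 - beta) * phi_target + beta * float(phi)   # EMA of observed similarity
    if float(phi) < phi_target:
        # Solve for a so that cos(g_i + a * g_j, g_j) equals phi_target.
        a = (g_i.norm() * (phi_target * (1 - phi ** 2).sqrt()
                           - phi * (1 - phi_target ** 2) ** 0.5)
             / (g_j.norm() * (1 - phi_target ** 2) ** 0.5 + 1e-12))
        g_i = g_i + a * g_j
    return g_i, phi_target
```

In a multi-task training loop, an adjustment like this would be applied to per-task gradients before the optimizer step, with one running similarity target maintained per task pair (or per parameter group).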