AsynDBT: Asynchronous Distributed Bilevel Tuning for efficient In-Context Learning with Large Language Models
- URL: http://arxiv.org/abs/2602.17694v1
- Date: Fri, 06 Feb 2026 13:07:49 GMT
- Title: AsynDBT: Asynchronous Distributed Bilevel Tuning for efficient In-Context Learning with Large Language Models
- Authors: Hui Ma, Shaoyu Dou, Ya Liu, Fei Xing, Li Feng, Feng Pi,
- Abstract summary: In-context learning (ICL) has emerged as a promising paradigm that enables LLMs to adapt to new tasks using examples provided within the input.<n>Previous FL approaches that incorporate ICL have struggled with severe straggler problems and challenges associated with heterogeneous non-identically data.<n>We propose an asynchronous distributed bilevel tuning (AsynDBT) algorithm that optimize both in-context learning samples and prompt fragments based on the feedback from the LLM.
- Score: 4.4866154758274375
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the rapid development of large language models (LLMs), an increasing number of applications leverage cloud-based LLM APIs to reduce usage costs. However, since cloud-based models' parameters and gradients are agnostic, users have to manually or use heuristic algorithms to adjust prompts for intervening LLM outputs, which requiring costly optimization procedures. In-context learning (ICL) has recently emerged as a promising paradigm that enables LLMs to adapt to new tasks using examples provided within the input, eliminating the need for parameter updates. Nevertheless, the advancement of ICL is often hindered by the lack of high-quality data, which is often sensitive and different to share. Federated learning (FL) offers a potential solution by enabling collaborative training of distributed LLMs while preserving data privacy. Despite this issues, previous FL approaches that incorporate ICL have struggled with severe straggler problems and challenges associated with heterogeneous non-identically data. To address these problems, we propose an asynchronous distributed bilevel tuning (AsynDBT) algorithm that optimizes both in-context learning samples and prompt fragments based on the feedback from the LLM, thereby enhancing downstream task performance. Benefiting from its distributed architecture, AsynDBT provides privacy protection and adaptability to heterogeneous computing environments. Furthermore, we present a theoretical analysis establishing the convergence guarantees of the proposed algorithm. Extensive experiments conducted on multiple benchmark datasets demonstrate the effectiveness and efficiency of AsynDBT.
Related papers
- A Comprehensive Study on Visual Token Redundancy for Discrete Diffusion-based Multimodal Large Language Models [85.30893355216486]
We study how visual token redundancy evolves with different dMLLM architectures and tasks.<n>Our study reveals that visual redundancy emerges only in from-scratch dMLLMs while handling long-answer tasks.<n>Layer-skipping is promising for accelerating AR-to-diffusion dMLLMs, whereas progressive or late-step pruning is more effective for from-scratch dMLLMs.
arXiv Detail & Related papers (2025-11-19T04:13:36Z) - Data Efficient Adaptation in Large Language Models via Continuous Low-Rank Fine-Tuning [34.343514432589586]
This paper proposes textbf, a novel framework that integrates Low-Rank Adaptation (LoRA) with a continuous fine-tuning strategy.<n> Experiments on 15 diverse datasets show that DEAL consistently outperforms baseline methods.<n>These findings demonstrate the potential of our approach to advance continual adaptation in Large Language Models.
arXiv Detail & Related papers (2025-09-23T12:55:57Z) - Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints [14.341123057506827]
Large Language Models (LLMs) are indispensable in today's applications, but their inference procedure demands significant computational resources.<n>This paper formulates LLM inference optimization as a multi-stage online scheduling problem.<n>We develop a fluid dynamics approximation to provide a tractable benchmark that guides algorithm design.
arXiv Detail & Related papers (2025-04-15T16:00:21Z) - Split Federated Learning Over Heterogeneous Edge Devices: Algorithm and Optimization [7.013344179232109]
Split Learning (SL) is a promising collaborative machine learning approach, enabling resource-constrained devices to train models without sharing raw data.
Current SL algorithms face limitations in training efficiency and suffer from prolonged latency.
We propose the Heterogeneous Split Federated Learning framework, which allows resource-constrained clients to train their personalized client-side models in parallel.
arXiv Detail & Related papers (2024-11-21T07:46:01Z) - SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe [36.74756622715754]
Large language models (LLMs) undergo instruction tuning to acquire instruction-following capabilities.<n>Efforts to improve instruction tuning often focus on higher-quality supervised fine-tuning datasets.<n>We propose SFTMix, a novel Mixup-based recipe that elevates LLM instruction tuning without relying on well-curated datasets.
arXiv Detail & Related papers (2024-10-07T17:52:21Z) - In-context Demonstration Matters: On Prompt Optimization for Pseudo-Supervision Refinement [71.60563181678323]
Large language models (LLMs) have achieved great success across diverse tasks, and fine-tuning is sometimes needed to further enhance generation quality.<n>To handle these challenges, a direct solution is to generate high-confidence'' data from unsupervised downstream tasks.<n>We propose a novel approach, pseudo-supervised demonstrations aligned prompt optimization (PAPO) algorithm, which jointly refines both the prompt and the overall pseudo-supervision.
arXiv Detail & Related papers (2024-10-04T03:39:28Z) - Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities.
In-Context Learning (ICL) and.
Efficient Fine-Tuning (PEFT) are currently two mainstream methods for augmenting.
LLMs to downstream tasks.
We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z) - FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models [50.331708897857574]
We introduce FactorLLM, a novel approach that decomposes well-trained dense FFNs into sparse sub-networks without requiring any further modifications.
FactorLLM achieves comparable performance to the source model securing up to 85% model performance while obtaining over a 30% increase in inference speed.
arXiv Detail & Related papers (2024-08-15T16:45:16Z) - Characterization of Large Language Model Development in the Datacenter [55.9909258342639]
Large Language Models (LLMs) have presented impressive performance across several transformative tasks.
However, it is non-trivial to efficiently utilize large-scale cluster resources to develop LLMs.
We present an in-depth characterization study of a six-month LLM development workload trace collected from our GPU datacenter Acme.
arXiv Detail & Related papers (2024-03-12T13:31:14Z) - FederatedScope-LLM: A Comprehensive Package for Fine-tuning Large
Language Models in Federated Learning [70.38817963253034]
This paper first discusses these challenges of federated fine-tuning LLMs, and introduces our package FS-LLM as a main contribution.
We provide comprehensive federated parameter-efficient fine-tuning algorithm implementations and versatile programming interfaces for future extension in FL scenarios.
We conduct extensive experiments to validate the effectiveness of FS-LLM and benchmark advanced LLMs with state-of-the-art parameter-efficient fine-tuning algorithms in FL settings.
arXiv Detail & Related papers (2023-09-01T09:40:36Z) - Genetic CFL: Optimization of Hyper-Parameters in Clustered Federated
Learning [4.710427287359642]
Federated learning (FL) is a distributed model for deep learning that integrates client-server architecture, edge computing, and real-time intelligence.
FL has the capability of revolutionizing machine learning (ML) but lacks in the practicality of implementation due to technological limitations, communication overhead, non-IID (independent and identically distributed) data, and privacy concerns.
We propose a novel hybrid algorithm, namely genetic clustered FL (Genetic CFL), that clusters edge devices based on the training hyper- parameters and genetically modifies the parameters cluster-wise.
arXiv Detail & Related papers (2021-07-15T10:16:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.