Fugu-MT Paper Translation (Abstract): Distributed LLMs and Multimodal Large Language Models: A Survey on Advances, Challenges, and Future Directions

Paper overview: Distributed LLMs and Multimodal Large Language Models: A Survey on Advances, Challenges, and Future Directions

arxiv url: http://arxiv.org/abs/2503.16585v1
Date: Thu, 20 Mar 2025 15:18:25 GMT
Status: Translation complete
System update date: 2025-03-24 15:40:10.136613
Title: Distributed LLMs and Multimodal Large Language Models: A Survey on Advances, Challenges, and Future Directions
Title (reference translation): Distributed LLMs and Multimodal Large Language Models: A Survey of Advances, Challenges, and Future Directions
Authors: Hadi Amini, Md Jueal Mia, Yasaman Saadati, Ahmed Imteaj, Seyedsina Nabavirazavi, Urmish Thakker, Md Zarif Hossain, Awal Ahmed Fime, S. S. Iyengar
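Abstract summary: Language models (LMs) are machine learning models that predict linguistic patterns by estimating the probability of word sequences based on large-scale datasets, such as text. Larger datasets typically enhance LM performance, but scalability remains a challenge due to constraints in computational power and resources. Recent research has focused on developing decentralized techniques that enable distributed training and inference.
Reference score (independently calculated attention score): 1.3638337521666275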
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Language models (LMs) are machine learning models designed to predict linguistic patterns by estimating the probability of word sequences based on large-scale datasets, such as text. LMs have a wide range of applications in natural language processing (NLP) tasks, including autocomplete and machine translation. Although larger datasets typically enhance LM performance, scalability remains a challenge due to constraints in computational power and resources. Distributed computing strategies offer essential solutions for improving scalability and managing the growing computational demand. Further, the use of sensitive datasets in training and deployment raises significant privacy concerns. Recent research has focused on developing decentralized techniques to enable distributed training and inference while utilizing diverse computational resources and enabling edge AI. This paper presents a survey on distributed solutions for various LMs, including large language models (LLMs), vision language models (VLMs), multimodal LLMs (MLLMs), and small language models (SLMs). While LLMs focus on processing and generating text, MLLMs are designed to handle multiple modalities of data (e.g., text, images, and audio) and to integrate them for broader applications. To this end, this paper reviews key advancements across the MLLM pipeline, including distributed training, inference, fine-tuning, and deployment, while also identifying the contributions, limitations, and future areas of improvement. Further, it categorizes the literature based on six primary focus areas of decentralization. Our analysis describes gaps in current methodologies for enabling distributed solutions for LMs and outlines future research directions, emphasizing the need for novel solutions to enhance the robustness and applicability of distributed LMs.
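Abstract (reference translation): Language models (LMs) are machine learning models that predict linguistic patterns by estimating the probability of word sequences based on large-scale datasets, such as text. LMs are widely applied in natural language processing (NLP) tasks, including autocomplete and machine translation. Larger datasets typically enhance LM performance, but scalability remains a challenge due to constraints in computational power and resources. Distributed computing strategies offer important solutions for improving scalability and managing growing computational demand. In addition, using sensitive datasets in training and deployment raises privacy concerns. Recent research has focused on developing decentralized techniques that enable distributed training and inference, make use of diverse computational resources, and enable edge AI. This paper surveys distributed solutions for a variety of LMs, including large language models (LLMs), vision language models (VLMs), multimodal LLMs (MLLMs), and small language models (SLMs). While LLMs focus on processing and generating text, MLLMs are designed to handle multiple modalities of data (e.g., text, images, and audio) and to integrate them for broader applications. To this end, the paper reviews key advances across the MLLM pipeline, including distributed training, inference, fine-tuning, and deployment, and identifies contributions, limitations, and future areas of improvement. It further categorizes the literature by six primary focus areas of decentralization. The analysis outlines gaps in current methods for enabling distributed solutions and emphasizes the need for novel solutions to enhance the robustness and applicability of distributed LMs.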
