Fugu-MT Paper Translation (Abstract): Deep Model Fusion: A Survey

Paper overview: Deep Model Fusion: A Survey

arxiv url: http://arxiv.org/abs/2309.15698v1
Date: Wed, 27 Sep 2023 14:40:12 GMT
Status: Translation complete
Last updated in system: 2023-09-28 13:12:12.064955
Title: Deep Model Fusion: A Survey
Title (reference translation): Deep Model Fusion: A Survey
Authors: Weishi Li, Yong Peng, Miao Zhang, Liang Ding, Han Hu, Li Shen
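Abstract summary: Deep model fusion/merging is an emerging technique that merges the parameters or predictions of multiple deep learning models into a single model. It faces several challenges, including high computational cost, a high-dimensional parameter space, and interference between heterogeneous models.
Reference score (independently computed attention score): 37.39100741978586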
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Deep model fusion/merging is an emerging technique that merges the parameters or predictions of multiple deep learning models into a single one. It combines the abilities of different models to make up for the biases and errors of a single model to achieve better performance. However, deep model fusion on large-scale deep learning models (e.g., LLMs and foundation models) faces several challenges, including high computational cost, high-dimensional parameter space, interference between different heterogeneous models, etc. Although model fusion has attracted widespread attention due to its potential to solve complex real-world tasks, there is still a lack of complete and detailed survey research on this technique. Accordingly, in order to understand the model fusion method better and promote its development, we present a comprehensive survey to summarize the recent progress. Specifically, we categorize existing deep model fusion methods as four-fold: (1) "Mode connectivity", which connects the solutions in weight space via a path of non-increasing loss, in order to obtain better initialization for model fusion; (2) "Alignment" matches units between neural networks to create better conditions for fusion; (3) "Weight average", a classical model fusion method, averages the weights of multiple models to obtain more accurate results closer to the optimal solution; (4) "Ensemble learning" combines the outputs of diverse models, which is a foundational technique for improving the accuracy and robustness of the final model. In addition, we analyze the challenges faced by deep model fusion and propose possible research directions for model fusion in the future. Our review is helpful in deeply understanding the correlation between different model fusion methods and practical application methods, which can enlighten the research in the field of deep model fusion.
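Abstract (reference translation): Deep model fusion/merging is an emerging technique that merges the parameters or predictions of multiple deep learning models into a single one. It combines the abilities of different models to compensate for the biases and errors of a single model and achieve better performance. However, deep model fusion on large-scale deep learning models (e.g., LLMs and foundation models) faces several challenges, including high computational cost, a high-dimensional parameter space, and interference between heterogeneous models. Although model fusion has attracted wide attention for its potential to solve complex real-world tasks, complete and detailed survey research on the technique is still lacking. To better understand model fusion methods and promote their development, this work presents a comprehensive survey of recent progress. Specifically, existing deep model fusion methods are grouped into four categories: (1) "Mode connectivity", which connects solutions in weight space via a path of non-increasing loss in order to obtain a better initialization for model fusion; (2) "Alignment", which matches units between neural networks to create better conditions for fusion; (3) "Weight average", a classical model fusion method that averages the weights of multiple models to obtain more accurate results closer to the optimal solution; (4) "Ensemble learning", which combines the outputs of diverse models and is a foundational technique for improving the accuracy and robustness of the final model. The survey also analyzes the challenges facing deep model fusion and proposes possible directions for future research. It is useful for understanding in depth the relationships between different model fusion methods and their practical applications, and for informing research in the field of deep model fusion.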
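
To make the difference between categories (3) and (4) concrete, here is a minimal sketch rather than anything taken from the survey. It assumes a hypothetical one-unit model y = tanh(w * x + b) and two made-up parameter sets. "Weight average" fuses in parameter space and yields a single model, while "ensemble learning" fuses in output space and keeps every model. The `interpolate` helper is also an assumption: it walks the straight line between two solutions, which is only the simplest candidate path for category (1) and is not guaranteed to have non-increasing loss.

```python
import math


def predict(params, x):
    """Hypothetical one-unit network: y = tanh(w * x + b)."""
    w, b = params
    return math.tanh(w * x + b)


def weight_average(models):
    """Fuse in parameter space: average each parameter across models."""
    n = len(models)
    return tuple(sum(p[i] for p in models) / n for i in range(len(models[0])))


def ensemble_predict(models, x):
    """Fuse in output space: average the predictions of all models."""
    return sum(predict(p, x) for p in models) / len(models)


def interpolate(params_a, params_b, t):
    """Point at position t on the straight line between two solutions."""
    return tuple((1 - t) * a + t * b for a, b in zip(params_a, params_b))


# Made-up (w, b) pairs standing in for two independently trained models.
model_a = (1.0, 0.0)
model_b = (3.0, 0.5)
x = 0.4

fused = weight_average([model_a, model_b])           # one model to run
y_fused = predict(fused, x)
y_ens = ensemble_predict([model_a, model_b], x)      # every model is run

# The uniform weight average is the midpoint of the linear path.
midpoint = interpolate(model_a, model_b, 0.5)

print("fused params:", fused)
print("weight-averaged prediction:", y_fused)
print("ensemble prediction:", y_ens)
```

Because the model is nonlinear, the two predictions differ: averaging weights and averaging outputs are not interchangeable. The weight-averaged model costs one forward pass at inference time, whereas the ensemble costs one pass per member.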

Related papers