Fugu-MT Paper Translation (Abstract): Fully Open Source Moxin-7B Technical Report

Paper overview: Fully Open Source Moxin-7B Technical Report

arxiv url: http://arxiv.org/abs/2412.06845v2
Date: Wed, 11 Dec 2024 19:03:58 GMT
Status: Translation complete
Last updated in system: 2024-12-13 13:50:06.470166
Title: Fully Open Source Moxin-7B Technical Report
Title (reference translation): Fully Open-Source Moxin-7B Technical Report
Authors: Pu Zhao, Xuan Shen, Zhenglun Kong, Yixin Shen, Sung-En Chang, Timothy Rupprecht, Lei Lu, Enfu Nan, Changdi Yang, Yumei He, Xingchen Xu, Yu Huang, Wei Wang, Yue Chen, Yong He, Yanzhi Wang
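Abstract summary: Large Language Models (LLMs) have undergone a significant transformation, marked by a rapid rise in their popularity and capabilities. To address the transparency concerns raised by open-source LLMs that withhold training code and data, we introduce Moxin 7B, a fully open-source LLM developed in accordance with the Model Openness Framework (MOF). The model achieves the highest MOF classification level, "open science," through the comprehensive release of pre-training code and configurations.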
Reference score (independently calculated attention score): 38.13392000279939
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Recently, Large Language Models (LLMs) have undergone a significant transformation, marked by a rapid rise in both their popularity and capabilities. Leading this evolution are proprietary LLMs like GPT-4 and GPT-o1, which have captured widespread attention in the AI community due to their remarkable performance and versatility. Simultaneously, open-source LLMs, such as LLaMA and Mistral, have made great contributions to the ever-increasing popularity of LLMs due to the ease to customize and deploy the models across diverse applications. Although open-source LLMs present unprecedented opportunities for innovation and research, the commercialization of LLMs has raised concerns about transparency, reproducibility, and safety. Many open-source LLMs fail to meet fundamental transparency requirements by withholding essential components like training code and data, and some use restrictive licenses whilst claiming to be "open-source," which may hinder further innovations on LLMs. To mitigate this issue, we introduce Moxin 7B, a fully open-source LLM developed in accordance with the Model Openness Framework (MOF), a ranked classification system that evaluates AI models based on model completeness and openness, adhering to principles of open science, open source, open data, and open access. Our model achieves the highest MOF classification level of "open science" through the comprehensive release of pre-training code and configurations, training and fine-tuning datasets, and intermediate and final checkpoints. Experiments show that our model achieves superior performance in zero-shot evaluation compared with popular 7B models and performs competitively in few-shot evaluation.
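Abstract (reference translation): Recently, Large Language Models (LLMs) have undergone a significant transformation, driven by a rapid rise in their popularity and capabilities. Leading this evolution are proprietary LLMs such as GPT-4 and GPT-o1, which have drawn widespread attention in the AI community for their remarkable performance and versatility. At the same time, open-source LLMs such as LLaMA and Mistral have contributed to the growing popularity of LLMs because they are easy to customize and deploy across diverse applications. Although open-source LLMs offer unprecedented opportunities for innovation and research, the commercialization of LLMs has raised concerns about transparency, reproducibility, and safety. Many open-source LLMs fail to meet basic transparency requirements because they withhold essential components such as training code and data, and some use restrictive licenses while claiming to be "open-source," which may hinder further innovation on LLMs. To mitigate this issue, we introduce Moxin 7B, a fully open-source LLM developed in accordance with the Model Openness Framework (MOF). The model achieves the highest MOF classification level, "open science," through the comprehensive release of pre-training code and configurations, training and fine-tuning datasets, and intermediate and final checkpoints. Experiments show that the model outperforms popular 7B models in zero-shot evaluation and performs competitively in few-shot evaluation.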

Related papers