Fugu-MT Paper Translation (Abstract): JoyAI-LLM Flash: Advancing Mid-Scale LLMs with Token Efficiency

Paper overview: JoyAI-LLM Flash: Advancing Mid-Scale LLMs with Token Efficiency

arxiv url: http://arxiv.org/abs/2604.03044v1
Date: Fri, 03 Apr 2026 13:52:38 GMT
Status: Translation complete
Last updated in system: 2026-04-06 17:20:24.486767
Title: JoyAI-LLM Flash: Advancing Mid-Scale LLMs with Token Efficiency
Title (reference translation): JoyAI-LLM Flash: Improving Mid-Scale LLMs with Token Efficiency
Authors: Aichen Cai, Anmeng Zhang, Anyu Li, Bo Zhang, Bohua Cai, Chang Li, Changjian Jiang, Changkai Lu, Chao Xue, Chaocai Liang, Cheng Zhang, Dongkai Liu, Fei Wang, Guoqiang Huang, Haijian Ke, Han Lin, Hao Wang, Ji Miao, Jiacheng Zhang, Jialong Shi, Jifeng Zhu, Jingjing Qian, Junhui Luo, Junwu Xiong, Lam So, Liang Huang, Ming Ke, Mingyang Li, Panfeng Shi, Peng Hao, Qi Wang, Qian Lai, Qiaoqiao Yuan, Qingyu Yin, Qiong Cao, Qixiang Wang, Rongcheng Bian, Rongduo Han, Shaoqiang Zheng, Shi Hu, Shi Suo, Shijie Ren, Shijin Zhang, Shiying Fan, Shuai Xie, Tianyi Zhang, Wei Liu, Wentao Tan, Xianghan Meng, Xiaodong He, Xing Pan, Xiran Wang, Xuyang Peng, Ya Zhang, Yang Liu, Yangyang Duan, Yanxu Chen, Yicheng Gong, Yidan Huang, Yifei Liu, Yinhao Bai, Yongqiang Liu, Yuesong Zhang, Yuqi Zhang, Zerui Xie, Zhenfang Wang, Zhennan Shen, Zheyuan Liu, Zhuwei Zeng
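Abstract summary: JoyAI-LLM Flash is an efficient Mixture-of-Experts (MoE) language model designed to redefine the trade-off between strong performance and token efficiency. It is pretrained on a massive corpus of 20 trillion tokens and further optimized through a rigorous post-training pipeline.
Reference score (independently computed attention metric): 56.46698214215968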
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We introduce JoyAI-LLM Flash, an efficient Mixture-of-Experts (MoE) language model designed to redefine the trade-off between strong performance and token efficiency in the sub-50B parameter regime. JoyAI-LLM Flash is pretrained on a massive corpus of 20 trillion tokens and further optimized through a rigorous post-training pipeline, including supervised fine-tuning (SFT), Direct Preference Optimization (DPO), and large-scale reinforcement learning (RL) across diverse environments. To improve token efficiency, JoyAI-LLM Flash strategically balances "thinking" and "non-thinking" cognitive modes and introduces FiberPO, a novel RL algorithm inspired by fibration theory that decomposes trust-region maintenance into global and local components, providing unified multi-scale stability control for LLM policy optimization. To enhance architectural sparsity, the model comprises 48B total parameters while activating only 2.7B parameters per forward pass, achieving a substantially higher sparsity ratio than contemporary industry-leading models of comparable scale. To further improve inference throughput, we adopt a joint training-inference co-design that incorporates dense Multi-Token Prediction (MTP) and Quantization-Aware Training (QAT). We release the checkpoints for both JoyAI-LLM-48B-A3B Base and its post-trained variants on Hugging Face to support the open-source community.
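Abstract (reference translation): We introduce JoyAI-LLM Flash, an efficient Mixture-of-Experts (MoE) language model designed to redefine the trade-off between strong performance and token efficiency in the sub-50B parameter regime. JoyAI-LLM Flash is pretrained on a massive corpus of 20 trillion tokens and further optimized through a rigorous post-training pipeline that includes supervised fine-tuning (SFT), Direct Preference Optimization (DPO), and large-scale reinforcement learning (RL) across diverse environments. To improve token efficiency, JoyAI-LLM Flash strategically balances "thinking" and "non-thinking" cognitive modes and introduces FiberPO, an RL algorithm inspired by fibration theory that splits trust-region maintenance into global and local components. To increase architectural sparsity, the model has 48B total parameters but activates only 2.7B per forward pass, a substantially higher sparsity ratio than current industry-leading models of comparable scale. To further improve inference throughput, the authors adopt a joint training-inference co-design that combines dense Multi-Token Prediction (MTP) and Quantization-Aware Training (QAT). Checkpoints for both JoyAI-LLM-48B-A3B Base and its post-trained variants are released on Hugging Face to support the open-source community.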
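The sparsity claim in the abstract can be checked with simple arithmetic. The sketch below uses only the two parameter counts stated in the abstract (48B total, 2.7B activated per forward pass). The function names and the choice to express sparsity as "active fraction" and "total-to-active ratio" are assumptions made for illustration; the paper may define its sparsity ratio differently.

```python
# Parameter counts as stated in the abstract, in billions.
TOTAL_PARAMS_B = 48.0
ACTIVE_PARAMS_B = 2.7


def active_fraction(active_b: float, total_b: float) -> float:
    """Fraction of all parameters used in one forward pass (hypothetical helper)."""
    if total_b <= 0 or not 0 <= active_b <= total_b:
        raise ValueError("expected 0 <= active_b <= total_b and total_b > 0")
    return active_b / total_b


def total_to_active_ratio(active_b: float, total_b: float) -> float:
    """How many parameters exist per activated parameter (hypothetical helper)."""
    if active_b <= 0 or active_b > total_b:
        raise ValueError("expected 0 < active_b <= total_b")
    return total_b / active_b


if __name__ == "__main__":
    frac = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    ratio = total_to_active_ratio(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    print(f"active fraction: {frac:.1%}")
    print(f"total-to-active ratio: {ratio:.1f}x")
```

Under these definitions, roughly 5.6% of the parameters are active per forward pass, or about 17.8 total parameters for each activated one.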

Related papers