Fugu-MT Paper Translation (Abstract): QianfanHuijin Technical Report: A Novel Multi-Stage Training Paradigm for Finance Industrial LLMs

Paper overview: QianfanHuijin Technical Report: A Novel Multi-Stage Training Paradigm for Finance Industrial LLMs

arxiv url: http://arxiv.org/abs/2512.24314v1
Date: Tue, 30 Dec 2025 16:10:51 GMT
Status: Translation complete
System update date: 2026-01-01 23:27:28.422717
Title: QianfanHuijin Technical Report: A Novel Multi-Stage Training Paradigm for Finance Industrial LLMs
Title (reference translation): Qianfan Huijin Technical Report: A Novel Multi-Stage Training Paradigm for Finance Industrial LLMs
Authors: Shupeng Li, Weipeng Lu, Linyun Liu, Chen Lin, Shaofei Li, Zhendong Tan, Hanjun Zhong, Yucheng Zeng, Chenghao Zhu, Mengyue Liu, Daxiang Dong, Jianmin Wu, Yunting Xiao, Annan Li, Danyu Liu, Jingnan Zhang, Licen Liu, Dawei Yin, Dou Shen
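Abstract summary: This paper introduces QianfanHuijin, a financial-domain LLM, and proposes a generalizable multi-stage training paradigm for industrial model enhancement. The approach begins with Continual Pre-training (CPT) on financial corpora to consolidate the knowledge base. Post-training then starts with Financial SFT, progresses to Finance Reasoning RL and Finance Agentic RL, and ends with General RL.
Reference score (independently computed attention score): 30.6564068779509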
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Domain-specific enhancement of Large Language Models (LLMs) within the financial context has long been a focal point of industrial application. While previous models such as BloombergGPT and Baichuan-Finance primarily focused on knowledge enhancement, the deepening complexity of financial services has driven a growing demand for models that possess not only domain knowledge but also robust financial reasoning and agentic capabilities. In this paper, we present QianfanHuijin, a financial domain LLM, and propose a generalizable multi-stage training paradigm for industrial model enhancement. Our approach begins with Continual Pre-training (CPT) on financial corpora to consolidate the knowledge base. This is followed by a fine-grained Post-training pipeline designed with increasing specificity: starting with Financial SFT, progressing to Finance Reasoning RL and Finance Agentic RL, and culminating in General RL aligned with real-world business scenarios. Empirical results demonstrate that QianfanHuijin achieves superior performance across various authoritative financial benchmarks. Furthermore, ablation studies confirm that the targeted Reasoning RL and Agentic RL stages yield significant gains in their respective capabilities. These findings validate our motivation and suggest that this fine-grained, progressive post-training methodology is poised to become a mainstream paradigm for various industrial-enhanced LLMs.
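Abstract (reference translation): Domain-specific adaptation of Large Language Models (LLMs) in the financial setting has long been a focus of industrial application. Earlier models such as BloombergGPT and Baichuan-Finance concentrated mainly on knowledge enhancement, but the growing complexity of financial services has increased demand for models that have not only domain knowledge but also robust financial reasoning and agentic capabilities. This paper introduces QianfanHuijin, a financial-domain LLM, and proposes a generalizable multi-stage training paradigm for industrial model enhancement. The approach begins with Continual Pre-training (CPT) on financial corpora to consolidate the knowledge base. Post-training starts with Financial SFT, progresses to Finance Reasoning RL and Finance Agentic RL, and arrives at General RL aligned with real-world business scenarios. Empirical results show that QianfanHuijin achieves superior performance on a range of authoritative financial benchmarks. In addition, ablation studies confirm that the targeted Reasoning RL and Agentic RL stages yield significant gains in their respective capabilities. These findings support the authors' motivation and suggest that this fine-grained, progressive post-training methodology may become a mainstream approach for various industry-enhanced LLMs.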


Related papers