Related papers: LongCat-Flash-Thinking Technical Report

LongCat-Flash-Thinking Technical Report

URL: http://arxiv.org/abs/2509.18883v1
Date: Tue, 23 Sep 2025 10:25:48 GMT
Title: LongCat-Flash-Thinking Technical Report
Authors: Meituan LongCat Team, Anchun Gui, Bei Li, Bingyang Tao, Bole Zhou, Borun Chen, Chao Zhang, Chao Zhang, Chengcheng Han, Chenhui Yang, Chi Zhang, Chong Peng, Chuyu Zhang, Cong Chen, Fengcun Li, Gang Xu, Guoyuan Lin, Hao Jiang, Hao Liang, Haomin Fu, Haoxiang Ma, Hong Liu, Hongyan Hao, Hongyin Tang, Hongyu Zang, Hongzhi Ni, Hui Su, Jiahao Liu, Jiahuan Li, Jialin Liu, Jianfei Zhang, Jianhao Xu, Jianing Wang, Jiaqi Sun, Jiaqi Zhang, Jiarong Shi, Jiawei Yang, Jingang Wang, Jinrui Ding, Jun Kuang, Jun Xu, Ke He, Kefeng Zhang, Keheng Wang, Keqing He, Li Wei, Liang Shi, Lin Qiu, Lingbin Kong, Lingchuan Liu, Linsen Guo, Longfei An, Mai Xia, Meng Zhou, Mengshen Zhu, Peng Pei, Pengcheng Jia, Qi Gu, Qi Guo, Qiong Huang, Quan Chen, Quanchi Weng, Rongxiang Weng, Ruichen Shao, Rumei Li, Shanglin Lei, Shuai Du, Shuaikang Liu, Shuang Zhou, Shuhao Hu, Siyu Xu, Songshan Gong, Tao Liang, Tianhao Hu, Wei He, Wei Shi, Wei Wang, Wei Wu, Wei Zhuo, Weifeng Tang, Wenjie Shi, Wenlong Zhu, Xi Su, Xiangcheng Liu, Xiangyu Xi, Xiangzhou Huang, Xiao Liu, Xiaochen Jiang, Xiaowei Shi, Xiaowen Shi, Xiaoyu Li, Xin Chen, Xinyue Zhao, Xuan Huang, Xuemiao Zhang, Xuezhi Cao, Xunliang Cai, Yajie Zhang, Yang Chen, Yang Liu, Yang Liu, Yang Zheng, Yaoming Wang, Yaqi Huo, Yerui Sun, Yifan Lu, Yiyang Li, Youshao Xiao, Yuanzhe Lei, Yuchen Xie, Yueqing Sun, Yufei Zhang, Yuhuai Wei, Yulei Qian, Yunke Zhao, Yuqing Ding, Yuwei Jiang, Zhaohua Yang, Zhengyu Chen, Zhijian Liu, Zhikang Xia, Zhongda Su, Ziran Li, Ziwen Wang, Ziyuan Zhuang, Zongyu Wang, Zunyuan Yang,
Abstract summary: LongCat-Flash-Thinking is an efficient open-source Mixture-of-Experts (MoE) reasoning model.<n>Its advanced capabilities are cultivated through a meticulously crafted training process.<n>LongCat-Flash-Thinking achieves state-of-the-art performance among open-source models on a suite of complex reasoning tasks.
Score: 116.75498493511026
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present LongCat-Flash-Thinking, an efficient 560-billion-parameter open-source Mixture-of-Experts (MoE) reasoning model. Its advanced capabilities are cultivated through a meticulously crafted training process, beginning with long Chain-of-Thought (CoT) data cold-start and culminating in large-scale Reinforcement Learning (RL). We first employ a well-designed cold-start training strategy, which significantly enhances the reasoning potential and equips the model with specialized skills in both formal and agentic reasoning. Then, a core innovation is our domain-parallel training scheme, which decouples optimization across distinct domains (e.g., STEM, Code, Agentic) and subsequently fuses the resulting expert models into a single, nearly Pareto-optimal model. This entire process is powered by our Dynamic ORchestration for Asynchronous rollout (DORA) system, a large-scale RL framework that delivers a greater than threefold training speedup over synchronous methods on tens of thousands of accelerators. As a result, LongCat-Flash-Thinking achieves state-of-the-art performance among open-source models on a suite of complex reasoning tasks. The model exhibits exceptional efficiency in agentic reasoning, reducing average token consumption by 64.5% (from 19, 653 to 6, 965) on AIME-25, without degrading task accuracy. We release LongCat-Flash-Thinking to promote further advances in reasoning systems and agentic AI research.

Related papers

LongCat-Flash-Thinking-2601 Technical Report [134.89732115690705]
LongCat-Flash-Thinking-2601 is an open-source Mixture-of-Experts (MoE) reasoning model with superior agentic reasoning capability.<n>LongCat-Flash-Thinking-2601 achieves state-of-the-art performance among open-source models on a wide range of agentic benchmarks.
arXiv Detail & Related papers (2026-01-23T13:20:09Z)
LongCat-Flash-Omni Technical Report [131.47284063481922]
LongCat-Flash- Omni is an open-source omni-modal model with 560 billion parameters.<n>LongCat-Flash- Omni attains comprehensive multimodal capabilities while maintaining strong unimodal capability.<n>It achieves low-latency real-time audio-visual interaction.
arXiv Detail & Related papers (2025-10-31T21:58:15Z)
LongCat-Flash Technical Report [165.64670448930875]
LongCat-Flash is a 560-billion- parameter Mixture-of-Experts (MoE) language model.<n>It is designed for both computational efficiency and advanced agentic capabilities.<n>We complete model training on more than 20 trillion tokens within 30 days, while achieving over 100 tokens per second (TPS) for inference at a cost of $0.70 per million output tokens.
arXiv Detail & Related papers (2025-09-01T10:05:45Z)
KAT-V1: Kwai-AutoThink Technical Report [50.84483585850113]
We present Kwaipilot-AutoThink (KAT), an open-source 40B large language model developed to address the overthinking problem in reasoning-intensive tasks.<n>KAT dynamically switches between reasoning and non-reasoning modes based on task complexity.<n>We also propose Step-SRPO, a reinforcement learning algorithm that incorporates intermediate supervision into the GRPO framework.
arXiv Detail & Related papers (2025-07-11T04:07:10Z)
LlamaRL: A Distributed Asynchronous Reinforcement Learning Framework for Efficient Large-scale LLM Training [32.575669924032276]
Reinforcement Learning (RL) has become the most effective post-training approach for improving the capabilities of Large Language Models (LLMs)<n>We present LlamaRL, a fully distributed, asynchronous RL framework optimized for efficient training of large-scale LLMs.
arXiv Detail & Related papers (2025-05-29T22:14:15Z)
Scaling Laws for Native Multimodal Models [53.490942903659565]
We revisit the architectural design of native multimodal models and conduct an extensive scaling laws study.<n>Our investigation reveals no inherent advantage to late-fusion architectures over early-fusion ones.<n>We show that incorporating Mixture of Experts (MoEs) allows models to learn modality-specific weights, significantly benefiting performance.
arXiv Detail & Related papers (2025-04-10T17:57:28Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.