Fugu-MT Paper Translation (Abstract): Towards Efficient CoT Distillation: Self-Guided Rationale Selector for Better Performance with Fewer Rationales

Paper overview: Towards Efficient CoT Distillation: Self-Guided Rationale Selector for Better Performance with Fewer Rationales

arxiv url: http://arxiv.org/abs/2509.23574v1
Date: Sun, 28 Sep 2025 02:09:07 GMT
Status: Translation complete
System update date: 2025-09-30 22:32:19.29946
Title: Towards Efficient CoT Distillation: Self-Guided Rationale Selector for Better Performance with Fewer Rationales
Title (reference translation): Towards Highly Efficient CoT Distillation: Improving the Performance of a Self-Guided Rationale Selector with Fewer Rationales
Authors: Jianzhi Yan, Le Liu, Youcheng Pan, Shiwei Chen, Yang Xiang, Buzhou Tang
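Abstract summary: Chain-of-thought (CoT) distillation aims to enhance the reasoning of small language models (SLMs) by transferring multi-step reasoning capability from larger teacher models. Existing work focuses primarily on data quantity and underestimates rationale quality, which may transfer noisy or incorrect information to the student model. The authors propose Model-Oriented Rationale Selection Distillation (MoRSD).
Reference score (independently computed attention score): 21.91556878201084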
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Chain-of-thought (CoT) distillation aims to enhance small language models' (SLMs) reasoning by transferring multi-step reasoning capability from larger teacher models. However, existing work underestimates rationale quality, focusing primarily on data quantity, which may transfer noisy or incorrect information to the student model. To address these issues, we propose Model-Oriented Rationale Selection Distillation (MoRSD), which can discern and select high-quality rationales for distillation to improve performance further. We further propose a Rationale Difficulty (RD) metric to measure the ability of the student model to generate the correct answer under a given rationale. Compared to the baseline, we achieve a 4.6% average improvement on seven datasets across three tasks, using fewer rationales by controlling their accuracy, diversity, and difficulty. Our results reveal that a small portion of high-quality rationales can enhance the reasoning ability of student models more than the entire dataset can. Our method promises to be a possible solution for efficient CoT distillation. Our code will be released at https://github.com/Leon221220/MoRSD.
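Abstract (reference translation): Chain-of-thought (CoT) distillation aims to strengthen the reasoning of small language models (SLMs) by transferring multi-step reasoning capability from larger teacher models. However, existing research focuses mainly on data quantity and underestimates rationale quality, which can transfer noisy or incorrect information to the student model. To address these problems, the authors propose Model-Oriented Rationale Selection Distillation (MoRSD), which identifies and selects high-quality rationales for distillation in order to improve performance further. They also propose a Rationale Difficulty (RD) metric that measures the student model's ability to generate the correct answer under a given rationale. Compared with the baseline, the average improvement on seven datasets across three tasks was 4.6%. The results show that a small portion of high-quality rationales can enhance the reasoning ability of student models more than the entire dataset can. The proposed method is a promising approach to efficient CoT distillation. The code will be released at https://github.com/Leon221220/MoRSD.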
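
The abstract names three selection criteria (accuracy, diversity, difficulty) but does not give formulas, so the following is only an illustrative sketch and not the MoRSD implementation. Every name here (`Rationale`, `jaccard`, `select_rationales`, `max_similarity`, `per_question`) is hypothetical. The `difficulty` field stands in for an RD-like score supplied by the caller, token-level Jaccard similarity is a swapped-in proxy for diversity, and preferring lower difficulty is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Rationale:
    question_id: str
    text: str
    predicted_answer: str  # answer the teacher reached via this rationale
    difficulty: float      # hypothetical RD-like score supplied by the caller


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def select_rationales(candidates, gold_answers, max_similarity=0.8, per_question=2):
    """Keep a small, accurate, non-redundant subset of rationales per question."""
    # Accuracy: keep only rationales whose answer matches the gold label.
    by_question = {}
    for r in candidates:
        if r.predicted_answer.strip() == gold_answers[r.question_id].strip():
            by_question.setdefault(r.question_id, []).append(r)

    selected = []
    for group in by_question.values():
        kept = []
        # Difficulty: visit rationales from lowest to highest score (assumption).
        for r in sorted(group, key=lambda item: item.difficulty):
            # Diversity: skip rationales too similar to one already kept.
            if all(jaccard(r.text, k.text) <= max_similarity for k in kept):
                kept.append(r)
            if len(kept) == per_question:
                break
        selected.extend(kept)
    return selected
```

Calling `select_rationales(candidates, {"q1": "4"})` on a list of teacher rationales for question `q1` would drop the ones with a wrong final answer, drop near-duplicates, and return at most two rationales for that question.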

Related papers