Fugu-MT paper translation (abstract): Where Does Social Reasoning Come From? Capability Provenance in Language Models

Paper overview: Where Does Social Reasoning Come From? Capability Provenance in Language Models

arxiv url: http://arxiv.org/abs/2606.19625v1
Date: Wed, 17 Jun 2026 22:06:19 GMT
Status: translation complete
Last updated in system: 2026-06-19 18:23:39.55707
Title: Where Does Social Reasoning Come From? Capability Provenance in Language Models
Title (reference translation): Where Does Social Reasoning Come From? Capability Provenance in Language Models
Authors: Glenn Matlin, Chandreyi Chakraborty, Saehee Eom, Mika Okamoto, Rayan Castilla, Louis Jaburi, Alvin Deng, Taywon Min, Lucia Quirke, Stella Biderman, Mark Riedl
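Abstract summary: The paper shows which regions of the pretraining corpus support social reasoning versus STEM reasoning in OLMo3-7B. Training-data attribution measures how strongly each training document influences a model's predictions on a benchmark. The authors compute gradient-based attribution over a working set drawn from the de-duplicated Dolma3 mix.
Reference score (independently computed attention metric): 11.7652444083388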
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: We use training-data attribution as an interpretable tool for capability discovery, mapping which regions of the pretraining corpus support social-reasoning versus STEM-reasoning in OLMo3-7B. Training-data attribution measures how strongly each training document influences a model's predictions on a benchmark, but document-level scores are too noisy to identify which corpus regions support which capabilities, and prior work has emphasized factual knowledge rather than reasoning. We compute gradient-based attribution (TrackStar via Bergson) over a working set drawn from the de-duplicated Dolma3 mix, aggregate influence across WebOrganizer's 24-format x 24-topic taxonomy (576 bins), and contrast benchmark pairs in a 2x2 design that varies domain (social vs. STEM) and capability type (reasoning vs. knowledge): SocialIQA and MMLU Social Sciences against ARC-Challenge and MMLU STEM. Social and STEM reasoning draw on qualitatively distinct corpus regions, and the contrast is sharper at the reasoning level than at the knowledge level. Targeted machine unlearning provides partial causal validation: forgetting high-attribution topic bins (e.g., Literature for SocialIQA) degrades the aligned benchmark more than within-bin random baselines, and we open-source all code, sampling manifests, the bin-level influence matrix, and unlearning checkpoints.
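Abstract (reference translation): We use training-data attribution in OLMo3-7B as an interpretable tool for capability discovery, mapping which regions of the pretraining corpus support social reasoning versus STEM reasoning. Training-data attribution measures how strongly each training document influences a model's predictions on a benchmark, but document-level scores are too noisy to identify which corpus regions support which capabilities. We compute gradient-based attribution (TrackStar via Bergson) over a working set drawn from the de-duplicated Dolma3 mix, aggregate influence across WebOrganizer's 24-format x 24-topic taxonomy (576 bins), and contrast benchmark pairs in a 2x2 design that varies domain (social vs. STEM) and capability type (reasoning vs. knowledge): SocialIQA and MMLU Social Sciences against ARC-Challenge and MMLU STEM. Social and STEM reasoning draw on qualitatively distinct corpus regions, and the contrast is sharper at the reasoning level than at the knowledge level. Forgetting high-attribution topic bins (e.g., Literature for SocialIQA) degrades the aligned benchmark more than within-bin random baselines, and we open-source all code, sampling manifests, the bin-level influence matrix, and unlearning checkpoints.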
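
The bin-level aggregation and benchmark contrast described in the abstract can be illustrated with a small sketch. This is not the authors' released code: the function names `aggregate_by_bin` and `contrast`, the choice of the mean as the aggregate, and the toy scores are all assumptions made for illustration. Only the 24 x 24 (format, topic) grid comes from the abstract.

```python
from collections import defaultdict

N_FORMATS = 24  # WebOrganizer format classes, per the abstract
N_TOPICS = 24   # WebOrganizer topic classes, per the abstract


def aggregate_by_bin(scores, bins):
    """Average document-level attribution scores within each (format, topic) bin.

    Using the mean is an assumption; the abstract only says influence is aggregated.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for doc_id, score in scores.items():
        fmt, topic = bins[doc_id]
        if not (0 <= fmt < N_FORMATS and 0 <= topic < N_TOPICS):
            raise ValueError(f"bin {(fmt, topic)} is outside the 24 x 24 grid")
        sums[(fmt, topic)] += score
        counts[(fmt, topic)] += 1
    return {key: sums[key] / counts[key] for key in sums}


def contrast(bin_a, bin_b):
    """Per-bin difference a - b; a bin missing on one side counts as 0."""
    keys = set(bin_a) | set(bin_b)
    return {key: bin_a.get(key, 0.0) - bin_b.get(key, 0.0) for key in keys}


# Hypothetical toy data: three documents, two bins, two benchmarks.
bins = {"d1": (0, 3), "d2": (0, 3), "d3": (5, 10)}
social_scores = {"d1": 0.75, "d2": 0.25, "d3": 0.125}
stem_scores = {"d1": 0.125, "d2": 0.375, "d3": 1.0}

social_bins = aggregate_by_bin(social_scores, bins)
stem_bins = aggregate_by_bin(stem_scores, bins)
diff = contrast(social_bins, stem_bins)

# The bin with the largest positive difference leans most toward the social benchmark.
most_social_bin = max(diff, key=diff.get)
```

Averaging within bins is one way to damp the document-level noise the abstract mentions; a sum or a rank-based statistic would fit the same structure.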


Related papers