Fugu-MT Paper Translation (Abstract): Unlearning as Ablation: Toward a Falsifiable Benchmark for Generative Scientific Discovery

Paper summary: Unlearning as Ablation: Toward a Falsifiable Benchmark for Generative Scientific Discovery

arxiv url: http://arxiv.org/abs/2508.17681v2
Date: Tue, 26 Aug 2025 05:04:10 GMT
Status: Translation complete
System update date: 2025-08-27 13:17:04.078761
Title: Unlearning as Ablation: Toward a Falsifiable Benchmark for Generative Scientific Discovery
Authors: Robert Yang
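Abstract summary: Do large language models (LLMs) truly generate new knowledge, or do they merely remix memorized fragments? The paper proposes unlearning-as-ablation as a falsifiable probe of constructive scientific discovery.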
Reference score (independently computed attention score): 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Bold claims about AI's role in science, from "AGI will cure all diseases" to promises of radically accelerated discovery, raise a central epistemic question: do large language models (LLMs) truly generate new knowledge, or do they merely remix memorized fragments? We propose unlearning-as-ablation as a falsifiable probe of constructive scientific discovery. The idea is to systematically remove a target result together with its forget-closure (supporting lemmas, paraphrases, and multi-hop entailments) and then evaluate whether the model can re-derive the result from only permitted axioms and tools. Success would indicate generative capability beyond recall; failure would expose current limits. Unlike prevailing motivations for unlearning (privacy, copyright, or safety), our framing repositions it as an epistemic probe for AI-for-Science. We outline a minimal pilot in mathematics and algorithms to illustrate feasibility, and sketch how the same approach could later be extended to domains such as physics or chemistry. This is a position paper: our contribution is conceptual and methodological, not empirical. We aim to stimulate discussion on how principled ablation tests could help distinguish models that reconstruct knowledge from those that merely retrieve it, and how such probes might guide the next generation of AI-for-Science benchmarks.
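
One possible reading of the forget-closure described in the abstract is reachability over a graph of related items (lemmas, paraphrases, entailments). The sketch below illustrates that reading only; it is not the paper's method. The function name `forget_closure`, the `related` mapping, the `permitted` set, and all example item names are hypothetical, and the choice to stop traversal at permitted items is an assumption.

```python
from collections import deque


def forget_closure(target, related, permitted):
    """Collect the target plus everything reachable from it via `related`.

    Assumption: permitted items (e.g. axioms) are never forgotten, and
    traversal does not continue through them.
    """
    closure = set()
    queue = deque([target])
    while queue:
        item = queue.popleft()
        if item in closure or item in permitted:
            continue  # already collected, or explicitly allowed to remain
        closure.add(item)
        # Follow links to supporting lemmas, paraphrases, and entailments.
        queue.extend(related.get(item, ()))
    return closure


# Hypothetical toy graph: each key lists the items linked to it.
related = {
    "target_theorem": ["supporting_lemma", "paraphrase_of_target"],
    "supporting_lemma": ["axiom_a", "corollary_via_lemma"],
}
permitted = {"axiom_a"}

to_forget = forget_closure("target_theorem", related, permitted)
print(sorted(to_forget))
```

In this toy setup, `axiom_a` stays available while the multi-hop item `corollary_via_lemma` is swept into the set to forget. A re-derivation test would then ask the ablated model to recover `target_theorem` from the permitted items alone.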


Related papers