Fugu-MT Paper Translation (Abstract): Towards Understanding Feature Learning in Parameter Transfer

Paper overview: Towards Understanding Feature Learning in Parameter Transfer

arxiv url: http://arxiv.org/abs/2509.22056v1
Date: Fri, 26 Sep 2025 08:37:54 GMT
Status: translation complete
Last updated in system: 2025-09-29 20:57:54.305896
Title: Towards Understanding Feature Learning in Parameter Transfer
Title (reference translation): Toward understanding feature learning in parameter transfer
Authors: Hua Yuan, Xuran Meng, Qiufeng Wang, Shiyu Xia, Ning Xu, Xu Yang, Jing Wang, Xin Geng, Yong Rui
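Abstract summary: We analyze a setting in which both the upstream and downstream models are ReLU convolutional neural networks (CNNs). We characterize how the inherited parameters act as carriers of universal knowledge and identify key factors that amplify their beneficial impact on the target task. Our analysis provides insight into why, in certain cases, transferring parameters can lead to lower test accuracy on the target task than training a new model from scratch.
Reference score (attention score computed independently by the site): 47.063219231351916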
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Parameter transfer is a central paradigm in transfer learning, enabling knowledge reuse across tasks and domains by sharing model parameters between upstream and downstream models. However, when only a subset of parameters from the upstream model is transferred to the downstream model, there remains a lack of theoretical understanding of the conditions under which such partial parameter reuse is beneficial and of the factors that govern its effectiveness. To address this gap, we analyze a setting in which both the upstream and downstream models are ReLU convolutional neural networks (CNNs). Within this theoretical framework, we characterize how the inherited parameters act as carriers of universal knowledge and identify key factors that amplify their beneficial impact on the target task. Furthermore, our analysis provides insight into why, in certain cases, transferring parameters can lead to lower test accuracy on the target task than training a new model from scratch. Numerical experiments and real-world data experiments are conducted to empirically validate our theoretical findings.
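Abstract (reference translation): Parameter transfer is a central paradigm in transfer learning that enables knowledge reuse across tasks and domains by sharing model parameters between upstream and downstream models. However, when only a subset of the upstream model's parameters is transferred to the downstream model, there is no theoretical understanding of the conditions under which such partial parameter reuse is useful or of the factors that govern its effectiveness. To address this gap, we analyze a setting in which both the upstream and downstream models are ReLU convolutional neural networks (CNNs). Within this theoretical framework, we characterize how the inherited parameters behave as carriers of universal knowledge and identify the key factors that amplify their beneficial effect on the target task. Furthermore, we examine why transferring parameters can lower test accuracy on the target task compared with training a new model from scratch. Numerical experiments and experiments on real-world data are conducted to empirically validate the theoretical findings.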
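The partial parameter reuse described in the abstract can be illustrated with a small sketch. This is not the paper's construction: the function names, the choice to copy the first filters, the Gaussian initialization, and the fixed averaging second layer are all assumptions made only to show what "transferring a subset of parameters" between two ReLU CNNs might look like.

```python
import numpy as np


def init_downstream_filters(upstream_filters, transfer_fraction, init_std=0.01, seed=0):
    """Build downstream filters by copying a fraction of the upstream filters.

    Hypothetical helper: the first `transfer_fraction` of the rows are
    inherited from the upstream model, the rest are drawn at random.
    """
    rng = np.random.default_rng(seed)
    num_filters, dim = upstream_filters.shape
    num_transferred = int(round(transfer_fraction * num_filters))
    # Start from a small random initialization for every filter.
    downstream = rng.normal(0.0, init_std, size=(num_filters, dim))
    # Overwrite the inherited subset with the upstream parameters.
    downstream[:num_transferred] = upstream_filters[:num_transferred]
    return downstream, num_transferred


def relu_cnn_output(filters, patches):
    """Toy two-layer ReLU CNN with a fixed second layer (an assumption).

    Each filter is applied to every patch, passed through ReLU, summed
    over patches, and the per-filter responses are averaged.
    """
    activations = np.maximum(patches @ filters.T, 0.0)  # (num_patches, num_filters)
    return activations.sum(axis=0).mean()


if __name__ == "__main__":
    upstream = np.arange(12.0).reshape(4, 3)  # stand-in for trained upstream filters
    downstream, k = init_downstream_filters(upstream, transfer_fraction=0.5)
    print(k, relu_cnn_output(downstream, np.ones((2, 3))))
```

Varying `transfer_fraction` between 0 (training from scratch) and 1 (full transfer) gives a simple knob for comparing initializations before any downstream training is applied.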


Related papers