Fugu-MT Paper Translation (Summary): Decompose, Mix, Adapt: A Unified Framework for Parameter-Efficient Neural Network Recombination and Compression

Paper summary: Decompose, Mix, Adapt: A Unified Framework for Parameter-Efficient Neural Network Recombination and Compression

arxiv url: http://arxiv.org/abs/2603.27383v1
Date: Sat, 28 Mar 2026 19:29:38 GMT
Status: Translation complete
Last updated in system: 2026-03-31 23:18:44.929978
Title: Decompose, Mix, Adapt: A Unified Framework for Parameter-Efficient Neural Network Recombination and Compression
Title (reference translation): Decompose, Mix, Adapt: A Unified Framework for Parameter-Efficient Neural Network Recombination and Compression
Authors: Nazia Tasnim, Shrimai Prabhumoye, Bryan A. Plummer
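Abstract summary: We propose Coefficient-gated weight Recombination by Interpolated Shared basis Projections (CRISP). CRISP seamlessly integrates multiple PR tasks within the same framework. Experiments show that CRISP outperforms prior methods capable of dual-task applications by 4-5%.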
Reference score (independently computed attention score): 30.925082859761215
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Parameter Recombination (PR) methods aim to efficiently compose the weights of a neural network for applications such as Parameter-Efficient Fine-Tuning (PEFT) and Model Compression (MC), among others. Most methods focus on a single application of PR, which can make composing them challenging. For example, when deploying a large model you may wish to compress the model and also quickly adapt it to new settings. However, PEFT methods can often still contain millions of parameters. This may be small compared to the original model size, but it can be problematic in resource-constrained deployments such as edge devices, where the PEFT parameters take up a larger portion of the compressed model's parameters. To address this, we present Coefficient-gated weight Recombination by Interpolated Shared basis Projections (CRISP), a general approach that seamlessly integrates multiple PR tasks within the same framework. CRISP accomplishes this by factorizing pretrained weights into basis matrices and their component mixing projections. Sharing basis matrices across layers and adjusting their size enables us to perform MC, whereas the mixer weights' small size (fewer than 200 in some experiments) enables CRISP to support PEFT. Experiments show CRISP outperforms methods from prior work capable of dual-task applications by 4-5% while also outperforming the state of the art in PEFT by 1.5% and PEFT+MC combinations by 1%. Our code is available on the repository: https://github.com/appledora/CRISP-CVPR26.
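Abstract (reference translation): Parameter Recombination (PR) methods aim to efficiently compose the weights of a neural network for applications such as Parameter-Efficient Fine-Tuning (PEFT) and Model Compression (MC). Most methods focus on one application of PR, which can make them hard to combine. For example, when deploying a large model, you may want to compress the model and also adapt it quickly to new settings. However, PEFT methods can often still contain millions of parameters. This may be small compared to the original model size, but it can be a problem in resource-constrained deployments such as edge devices, where those parameters make up a larger share of the compressed model's parameters. To address this, we propose Coefficient-gated weight Recombination by Interpolated Shared basis Projections (CRISP), a general approach that seamlessly integrates multiple PR tasks. CRISP achieves this by factorizing pretrained weights into basis matrices and their component mixing projections. Sharing basis matrices across layers and adjusting their size enables MC, while the small size of the mixer weights (fewer than 200 in some experiments) enables support for PEFT. Experiments show that CRISP outperforms prior methods capable of dual-task applications by 4-5%, the state of the art in PEFT by 1.5%, and PEFT+MC combinations by 1%. Our code is available in the repository.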
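
The abstract describes factorizing pretrained weights into basis matrices shared across layers plus small per-layer mixing projections. The sketch below is only a toy illustration of that idea, not the CRISP implementation: it assumes each layer's weight is a coefficient-weighted sum of shared basis matrices, and every name, shape, and count in it (`mix_weight`, `param_counts`, 12 layers, 768x768 weights, 4 bases) is hypothetical.

```python
# Illustrative sketch only: NOT the CRISP code from the paper's repository.
# Assumption: a layer's weight is a coefficient-weighted sum of basis
# matrices shared by all layers. All shapes and counts are hypothetical.

def mix_weight(bases, coeffs):
    """Return sum_k coeffs[k] * bases[k] for equally shaped matrices (lists of rows)."""
    rows, cols = len(bases[0]), len(bases[0][0])
    out = [[0.0] * cols for _ in range(rows)]
    for basis, c in zip(bases, coeffs):
        for i in range(rows):
            for j in range(cols):
                out[i][j] += c * basis[i][j]
    return out


def param_counts(num_layers, rows, cols, num_bases):
    """Compare dense per-layer storage with shared bases plus per-layer mixers."""
    return {
        "dense": num_layers * rows * cols,       # one full matrix per layer
        "shared_bases": num_bases * rows * cols,  # stored once for all layers
        "mixers": num_layers * num_bases,         # one coefficient per basis per layer
    }


# Two 2x2 shared bases, mixed with different coefficients for each layer.
bases = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]
layer_coeffs = {"layer0": [0.5, 2.0], "layer1": [1.0, -1.0]}
weights = {name: mix_weight(bases, c) for name, c in layer_coeffs.items()}

# Hypothetical sizes: 12 layers of 768x768 weights sharing 4 bases.
counts = param_counts(num_layers=12, rows=768, cols=768, num_bases=4)
print(weights["layer0"])
print(counts)
```

In this toy setup, compression comes from storing the bases once instead of one dense matrix per layer, and adaptation would only need to update the per-layer mixer coefficients, which are far fewer than the basis entries.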

Related papers