Fugu-MT Paper Translation (Abstract): Adaptable Adapters

Paper abstract: Adaptable Adapters

arxiv url: http://arxiv.org/abs/2205.01549v1
Date: Tue, 3 May 2022 14:59:27 GMT
Status: translation complete
Last updated in system: 2022-05-04 13:46:48.788077
Title: Adaptable Adapters
Title (reference translation): Adaptable Adapters
Authors: Nafise Sadat Moosavi, Quentin Delfosse, Kristian Kersting, Iryna Gurevych
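Abstract summary: State-of-the-art NLP models contain between a hundred million and a trillion parameters. Adaptable adapters learn different activation functions for different layers and different input data. Adaptable adapters achieve performance on par with the standard adapter architecture.
Reference score (attention score computed by this site): 74.65986170056945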
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: State-of-the-art pretrained NLP models contain a hundred million to trillion parameters. Adapters provide a parameter-efficient alternative for the full finetuning in which we can only finetune lightweight neural network layers on top of pretrained weights. Adapter layers are initialized randomly. However, existing work uses the same adapter architecture -- i.e., the same adapter layer on top of each layer of the pretrained model -- for every dataset, regardless of the properties of the dataset or the amount of available training data. In this work, we introduce adaptable adapters that contain (1) learning different activation functions for different layers and different input data, and (2) a learnable switch to select and only use the beneficial adapter layers. We show that adaptable adapters achieve on-par performances with the standard adapter architecture while using a considerably smaller number of adapter layers. In addition, we show that the selected adapter architecture by adaptable adapters transfers well across different data settings and similar tasks. We propose to use adaptable adapters for designing efficient and effective adapter architectures. The resulting adapters (a) contain about 50% of the learning parameters of the standard adapter and are therefore more efficient at training and inference, and require less storage space, and (b) achieve considerably higher performances in low-data settings.
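Abstract (reference translation): State-of-the-art NLP models contain between a hundred million and a trillion parameters. Adapters offer a parameter-efficient alternative to full fine-tuning, in which only lightweight neural network layers on top of the pretrained weights are fine-tuned. Adapter layers are initialized randomly. However, existing work uses the same adapter architecture, i.e., the same adapter layer on top of each layer of the pretrained model, for every dataset, regardless of the dataset's properties or the amount of available training data. In this work, we propose adaptable adapters, which (1) learn different activation functions for different layers and different input data, and (2) include a learnable switch that selects and uses only the beneficial adapter layers. We show that adaptable adapters match the performance of the standard adapter architecture while using considerably fewer adapter layers. We also show that the adapter architecture selected by adaptable adapters transfers well across different data settings and similar tasks. We propose using adaptable adapters to design efficient and effective adapter architectures. The resulting adapters (a) contain about 50% of the learnable parameters of the standard adapter, and are therefore more efficient in training and inference and require less storage space, and (b) achieve considerably higher performance in low-data settings.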
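The two ingredients named in the abstract, a learnable activation function per adapter layer and a learnable switch that keeps only the useful adapter layers, can be illustrated with a small sketch. This is not the authors' implementation. The abstract does not specify the form of the activation or of the switch, so the rational-function activation, the two-logit switch, and the names `RationalActivation`, `SwitchedAdapter`, and `switch_logits` are all assumptions made for illustration. No training loop is included.

```python
import math
import random


def matvec(weights, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


class RationalActivation:
    """Learnable activation f(x) = P(x) / (1 + |Q(x)|).

    Assumed form for illustration; the defaults make f the identity.
    """

    def __init__(self, p=(0.0, 1.0, 0.0), q=(0.0,)):
        self.p = list(p)  # numerator coefficients, lowest degree first
        self.q = list(q)  # denominator coefficients for x^1, x^2, ...

    def __call__(self, x):
        num = sum(c * x ** i for i, c in enumerate(self.p))
        den = 1.0 + abs(sum(c * x ** (i + 1) for i, c in enumerate(self.q)))
        return num / den


class SwitchedAdapter:
    """Bottleneck adapter with a two-way switch (hypothetical design)."""

    def __init__(self, hidden, bottleneck, rng):
        # Random initialization, as the abstract states for adapter layers.
        self.down = [[rng.gauss(0.0, 0.01) for _ in range(hidden)]
                     for _ in range(bottleneck)]
        self.up = [[rng.gauss(0.0, 0.01) for _ in range(bottleneck)]
                   for _ in range(hidden)]
        self.act = RationalActivation()
        self.switch_logits = [0.0, 0.0]  # [use adapter, skip adapter]

    def adapter(self, h):
        """Down-project, apply the activation, up-project, add residual."""
        z = [self.act(v) for v in matvec(self.down, h)]
        return [hi + ui for hi, ui in zip(h, matvec(self.up, z))]

    def forward(self, h, hard=False):
        use, skip = self.switch_logits
        if hard:
            # Discrete choice: a skipped layer costs nothing at inference.
            return self.adapter(h) if use >= skip else list(h)
        # Soft choice: softmax-weighted mix of the two branches.
        m = max(use, skip)
        e_use, e_skip = math.exp(use - m), math.exp(skip - m)
        w = e_use / (e_use + e_skip)
        return [w * o + (1.0 - w) * hi for o, hi in zip(self.adapter(h), h)]

    def num_parameters(self):
        return (len(self.down) * len(self.down[0])
                + len(self.up) * len(self.up[0])
                + len(self.act.p) + len(self.act.q)
                + len(self.switch_logits))


rng = random.Random(0)
layers = [SwitchedAdapter(hidden=8, bottleneck=2, rng=rng) for _ in range(4)]
# Pretend training has switched off two of the four adapter layers.
layers[1].switch_logits = [-2.0, 2.0]
layers[3].switch_logits = [-2.0, 2.0]
kept = [a for a in layers if a.switch_logits[0] >= a.switch_logits[1]]

h = [1.0] * 8
for layer in layers:
    h = layer.forward(h, hard=True)
```

In this toy setup, half of the adapter layers are switched off, so only the parameters of the layers in `kept` would need to be stored. That mirrors the roughly 50% parameter reduction the abstract reports, although which layers are kept, and how many, would in practice be learned per dataset.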

Related papers

Adapters Strike Back [10.490880056507198]
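We study adapters, their internal structure, and various implementation choices in detail. We propose a concrete, improved adapter architecture called Adapter+.
Paper reference translation (metadata) (2024-06-10T22:07:57Z)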
Stylus: Automatic Adapter Selection for Diffusion Models [81.90482700433822]
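We introduce Stylus, which efficiently selects and automatically composes task-specific adapters based on the keywords in a prompt. Stylus follows a three-stage approach: it first summarizes adapters with improved descriptions and embeddings, then retrieves relevant adapters, and finally assembles adapters based on the prompt's keywords.
Paper reference translation (metadata) (2024-04-29T17:59:16Z)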
Hierarchical Recurrent Adapters for Efficient Multi-Task Adaptation of Large Speech Models [12.230087530720652]
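We propose a more efficient adapter module for large-scale multi-task adaptation scenarios. The adapter consists of a single shared controller network and multiple task-level adapter heads.
Paper reference translation (metadata) (2024-03-25T17:21:56Z)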
MerA: Merging Pretrained Adapters For Few-Shot Learning [71.44422347502409]
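We propose Merging Pretrained Adapters (MerA), which efficiently incorporates pretrained adapters into a single model through model fusion. In experiments on two PLMs, MerA yields substantial improvements over both single adapters and AdapterFusion.
Paper reference translation (metadata) (2023-08-30T12:10:17Z)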
AdaMix: Mixture-of-Adapter for Parameter-efficient Tuning of Large Language Models [119.7093605087114]
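Fine-tuning large pretrained language models on downstream tasks requires updating hundreds of millions of parameters. This not only increases the serving cost of storing a full copy of the model weights for every task, but also leads to instability during few-shot task adaptation. We introduce a new mechanism that improves adapter capacity through two key techniques without increasing parameters or computational cost.
Paper reference translation (metadata) (2022-05-24T23:41:22Z)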
AdapterBias: Parameter-efficient Token-dependent Representation Shift for Adapters in NLP Tasks [55.705355299065474]
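Transformer-based pretrained models with millions of parameters require large amounts of storage. Recent approaches address this drawback by training adapters, but these still require a relatively large number of parameters. In this work, we propose AdapterBias, a surprisingly simple yet effective adapter architecture.
Paper reference translation (metadata) (2022-04-30T16:49:41Z)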
AdapterHub: A Framework for Adapting Transformers [148.6877231725939]
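AdapterHub is a framework that allows dynamic "stitching-in" of pretrained adapters for different tasks and languages. The framework makes sharing task-specific models scalable and easily accessible.
Paper reference translation (metadata) (2020-07-15T15:56:05Z)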

The related papers list is generated automatically from the titles and abstracts of papers on this site.

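This page shows information on the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from its use.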