Fugu-MT paper translation (abstract): E2S2: Encoding-Enhanced Sequence-to-Sequence Pretraining for Language Understanding and Generation

Paper summary: E2S2: Encoding-Enhanced Sequence-to-Sequence Pretraining for Language Understanding and Generation

arxiv url: http://arxiv.org/abs/2205.14912v3
Date: Tue, 9 Jan 2024 09:44:10 GMT
Status: translation complete
Last updated in system: 2024-01-10 21:06:11.267600
Title: E2S2: Encoding-Enhanced Sequence-to-Sequence Pretraining for Language Understanding and Generation
Title (reference translation): E2S2: Encoding-Enhanced Sequence-to-Sequence Pretraining for Language Understanding and Generation
Authors: Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du and Dacheng Tao
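Abstract summary: Sequence-to-sequence (seq2seq) learning is a common approach in large-scale pretrained language models. This paper proposes an encoding-enhanced seq2seq pretraining strategy, namely E2S2. E2S2 improves seq2seq models by integrating more efficient self-supervised information into the encoder.
Reference score (attention metric computed by this site): 95.49128988683191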
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Sequence-to-sequence (seq2seq) learning is a popular approach to large-scale language model pretraining. However, prior seq2seq pretraining models generally focus on reconstructive objectives on the decoder side and neglect the effect of encoder-side supervision, which we argue may lead to sub-optimal performance. To verify our hypothesis, we first empirically study the functionalities of the encoder and decoder in seq2seq pretrained language models, and find that the encoder plays a more important but under-exploited role than the decoder with respect to downstream performance and neuron activation. Therefore, we propose an encoding-enhanced seq2seq pretraining strategy, namely E2S2, which improves seq2seq models by integrating more efficient self-supervised information into the encoders. Specifically, E2S2 adopts two self-supervised objectives on the encoder side from two aspects: 1) locally denoising the corrupted sentence (denoising objective); and 2) globally learning better sentence representations (contrastive objective). With the help of both objectives, the encoder can effectively distinguish the noise tokens and capture high-level (i.e., syntactic and semantic) knowledge, thus strengthening the seq2seq model's ability to perform conditional generation accurately. Across a wide range of downstream natural language understanding and generation tasks, E2S2 substantially improves the performance of its powerful backbone models, e.g., BART and T5. For example, with a BART backbone, we achieve a +1.1% average gain on the General Language Understanding Evaluation (GLUE) benchmark and a +1.75% F_0.5 score improvement on the CoNLL2014 dataset. We also provide in-depth analyses to show that the improvement stems from better linguistic representation. We hope that our work will foster future self-supervision research on seq2seq language model pretraining.
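Abstract (reference translation): Sequence-to-sequence (seq2seq) learning is a popular approach for large-scale pretrained language models. However, prior seq2seq pretraining models generally focus on reconstructive objectives on the decoder side and neglect the effect of encoder-side supervision. To verify this hypothesis, we first empirically study the functions of the encoder and decoder in seq2seq pretrained language models, and confirm that, with respect to downstream performance and neuron activation, the encoder plays a more important but under-exploited role than the decoder. We therefore propose an encoding-enhanced seq2seq pretraining strategy called E2S2, which improves seq2seq models by integrating more efficient self-supervised information into the encoder. Specifically, E2S2 adopts two self-supervised objectives on the encoder side, from two aspects: 1) locally denoising the corrupted sentence (denoising objective); and 2) globally learning better sentence representations (contrastive objective). With the help of both objectives, the encoder can effectively distinguish noise tokens and capture high-level (syntactic and semantic) knowledge, strengthening the seq2seq model's ability to perform conditional generation accurately. Across a wide variety of downstream natural language understanding and generation tasks, E2S2 substantially improves the performance of strong backbone models such as BART and T5. For example, with a BART backbone, it achieves a +1.1% gain on the General Language Understanding Evaluation (GLUE) benchmark and a +1.75% F_0.5 score improvement on the CoNLL2014 dataset. We also provide in-depth analyses showing that the improvement stems from better linguistic representation. We hope this work will foster future self-supervision research on seq2seq language model pretraining.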
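The two encoder-side objectives named in the abstract can be illustrated with a small sketch. The abstract does not give the loss formulas, so everything below is an assumption made for illustration: the denoising objective is modeled as token-level cross-entropy over the corrupted input, the contrastive objective as an InfoNCE-style loss with in-batch negatives, and the two are combined by a weighted sum. The function names (`denoising_loss`, `contrastive_loss`, `encoder_side_loss`), the temperature, and the weights are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence


def log_softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax over a list of scores."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def denoising_loss(token_logits: Sequence[Sequence[float]],
                   target_ids: Sequence[int]) -> float:
    """Assumed local objective: mean cross-entropy for recovering the
    original token at each position of a corrupted sentence."""
    total = 0.0
    for logits, target in zip(token_logits, target_ids):
        total -= log_softmax(logits)[target]
    return total / len(target_ids)


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; both vectors are assumed to be non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(view_a: Sequence[Sequence[float]],
                     view_b: Sequence[Sequence[float]],
                     temperature: float = 0.1) -> float:
    """Assumed global objective (InfoNCE): sentence i in view_a should
    match sentence i in view_b; all other rows act as negatives."""
    total = 0.0
    for i, anchor in enumerate(view_a):
        sims = [cosine(anchor, other) / temperature for other in view_b]
        total -= log_softmax(sims)[i]
    return total / len(view_a)


def encoder_side_loss(token_logits, target_ids, view_a, view_b,
                      denoise_weight: float = 1.0,
                      contrast_weight: float = 1.0) -> float:
    """Weighted sum of the two encoder-side terms (weights are assumptions)."""
    return (denoise_weight * denoising_loss(token_logits, target_ids)
            + contrast_weight * contrastive_loss(view_a, view_b))


if __name__ == "__main__":
    # Toy data: 3 token positions over a 4-word vocabulary, 2 sentences.
    logits = [[2.0, 0.1, 0.1, 0.1], [0.1, 2.0, 0.1, 0.1], [0.1, 0.1, 2.0, 0.1]]
    targets = [0, 1, 2]
    sent_a = [[1.0, 0.0], [0.0, 1.0]]
    sent_b = [[0.9, 0.1], [0.2, 0.8]]
    print(encoder_side_loss(logits, targets, sent_a, sent_b))
```

In a real pretraining setup these terms would presumably be added to the usual decoder-side reconstruction loss and computed on encoder outputs with an autodiff framework. The pure-Python version here only shows the shape of the computation.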

Related papers

Code Representation Learning At Scale [75.04686476303436]
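We perform code representation learning on a large amount of code data using a two-stage pretraining scheme. First, we train the encoder by leveraging both the randomness in masked language modeling and the structural aspects of programming languages. We then enhance the representations through contrastive learning with hard negatives and hard positives constructed in an unsupervised manner.
Paper reference translation (metadata) (2024-02-02T22:19:15Z)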
Decoder-Only or Encoder-Decoder? Interpreting Language Model as a Regularized Encoder-Decoder [75.03283861464365]
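seq2seq tasks aim to generate a target sequence from a given input source sequence. Traditionally, most seq2seq tasks have been solved with an encoder-decoder framework, in which an encoder encodes the source sequence and a decoder generates the target text. Recently, many new approaches have emerged that apply decoder-only language models directly to seq2seq tasks.
Paper reference translation (metadata) (2023-04-08T15:44:29Z)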
GanLM: Encoder-Decoder Pre-training with an Auxiliary Discriminator [114.8954615026781]
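This paper proposes a GAN-style model for encoder-decoder pretraining by introducing an auxiliary discriminator. GanLM is trained with two pretraining objectives: replaced token detection and replaced token denoising. Experiments on language generation benchmarks show that GanLM, with its strong language understanding capability, outperforms various strong pretrained language models.
Paper reference translation (metadata) (2022-12-20T12:51:11Z)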
Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages [58.43299730989809]
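This paper introduces Wav2Seq, the first self-supervised approach to pretraining both parts of encoder-decoder models for speech data. We induce a pseudo language as a compact discrete representation and formulate a self-supervised pseudo speech recognition task. This process stands on its own and can also be applied as a low-cost second-stage pretraining.
Paper reference translation (metadata) (2022-05-02T17:59:02Z)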
Improving End-to-End Models for Set Prediction in Spoken Language Understanding [26.781489293420055]
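This paper proposes a novel data augmentation technique together with an implicit attention-based alignment method for inferring the spoken order. F1 scores increase by more than 11% for RNN-T and by more than 2% for attention-based encoder-decoder SLU models, surpassing previously reported results.
Paper reference translation (metadata) (2022-01-28T13:23:17Z)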
Regularized Training of Nearest Neighbor Language Models [10.994336081018043]
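We build on $k$NN-LM (Khandelwal et al., 2020), which achieves state-of-the-art results by combining a pretrained language model with an exhaustive $k$NN search over the training data (the memory bank). We find that L2 regularization improves performance on high-frequency words without degrading performance on low-frequency words.
Paper reference translation (metadata) (2021-09-16T23:20:24Z)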
Enhanced Seq2Seq Autoencoder via Contrastive Learning for Abstractive Text Summarization [15.367455931848252]
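We propose a sequence-to-sequence (seq2seq) autoencoder enhanced via contrastive learning for abstractive text summarization. The model adopts a standard Transformer architecture with a multi-layer bidirectional encoder and an autoregressive decoder. Experiments on two datasets show that our model outperforms existing benchmarks.
Paper reference translation (metadata) (2021-08-26T18:45:13Z)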
Efficiently Fusing Pretrained Acoustic and Linguistic Encoders for Low-resource Speech Recognition [9.732767611907068]
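This work fuses a pretrained acoustic encoder (wav2vec2.0) and a pretrained linguistic encoder (BERT) into an end-to-end ASR model. The model achieves better recognition performance on the CALLHOME corpus (15 hours) than other end-to-end models.
Paper reference translation (metadata) (2021-01-17T16:12:44Z)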
Orthros: Non-autoregressive End-to-end Speech Translation with Dual-decoder [64.55176104620848]
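We propose Orthros, a novel NAR E2E-ST framework in which both NAR and autoregressive (AR) decoders are jointly trained on a shared speech encoder. The latter is used to select the better translation among candidates of various lengths generated by the former, which dramatically improves the effectiveness of a large length beam with negligible overhead. Experiments on four benchmarks show the effectiveness of the method in improving inference speed while maintaining competitive translation quality.
Paper reference translation (metadata) (2020-10-25T06:35:30Z)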

The related papers list is generated automatically from the titles and abstracts of papers on this site.

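This page shows information about the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any outcome arising from the use of this site (including all information and translations).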