Fugu-MT Paper Translation (Summary): Sub-Task Decomposition Enables Learning in Sequence to Sequence Tasks

Paper summary: Sub-Task Decomposition Enables Learning in Sequence to Sequence Tasks

arxiv url: http://arxiv.org/abs/2204.02892v1
Date: Wed, 6 Apr 2022 15:16:27 GMT
Status: Translation complete
Last updated in system: 2022-04-07 13:58:59.471995
Title: Sub-Task Decomposition Enables Learning in Sequence to Sequence Tasks
Title (reference translation): Sub-task decomposition that enables sequence-to-sequence learning
Authors: Noam Wies, Yoav Levine, Amnon Shashua
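Abstract summary: Concatenating intermediate supervision to the input and training a sequence-to-sequence model on this modified input makes an unlearnable composite problem learnable. We prove this for the notoriously unlearnable composite task of bit-subset parity. Our positive theoretical result is the first of its kind in the landscape of results on the benefits of intermediate supervision.
Reference score (attention metric computed independently by this site): 16.182561312622315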
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The field of Natural Language Processing (NLP) has experienced a dramatic leap in capabilities with the recent introduction of huge Language Models (LMs). Despite this success, natural language problems that involve several compounded steps are still practically unlearnable, even by the largest LMs. This complies with experimental failures for end-to-end learning of composite problems that were demonstrated in a variety of domains. A known mitigation is to introduce intermediate supervision for solving sub-tasks of the compounded problem. Recently, several works have demonstrated high gains by taking a straightforward approach for incorporating intermediate supervision in compounded natural language problems: the sequence-to-sequence LM is fed with an augmented input, in which the decomposed tasks' labels are simply concatenated to the original input. In this paper, we prove a positive learning result that motivates these recent efforts. We show that when concatenating intermediate supervision to the input and training a sequence-to-sequence model on this modified input, an unlearnable composite problem becomes learnable. We prove this for the notoriously unlearnable composite task of bit-subset parity, with the intermediate supervision being parity results of increasingly large bit-subsets. Beyond motivating contemporary empirical efforts for incorporating intermediate supervision in sequence-to-sequence language models, our positive theoretical result is the first of its kind in the landscape of results on the benefits of intermediate supervision: Until now, all theoretical results on the subject are negative, i.e., show cases where learning is impossible without intermediate supervision, while our result is positive, showing a case where learning is facilitated in the presence of intermediate supervision.
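Abstract (reference translation): The field of natural language processing (NLP) has made a dramatic leap in capability with the recent introduction of huge language models (LMs). Despite this success, natural language problems that involve several compounded steps remain practically unlearnable, even for the largest LMs. This is consistent with the experimental failures of end-to-end learning on composite problems that have been demonstrated in a variety of domains. A known mitigation is to introduce intermediate supervision for solving the sub-tasks of the compounded problem. Recently, several works have reported large gains from a straightforward way of incorporating intermediate supervision into compounded natural language problems: the sequence-to-sequence LM is fed an augmented input in which the labels of the decomposed tasks are simply concatenated to the original input. In this paper, we prove a positive learning result that motivates these recent efforts. We show that when intermediate supervision is concatenated to the input and a sequence-to-sequence model is trained on this modified input, an unlearnable composite problem becomes learnable. We prove this for the notoriously unlearnable composite task of bit-subset parity, where the intermediate supervision consists of the parity results of increasingly large bit-subsets. Beyond motivating contemporary empirical efforts to incorporate intermediate supervision in sequence-to-sequence language models, our positive theoretical result is the first of its kind in the landscape of results on the benefits of intermediate supervision: until now, all theoretical results on the subject have been negative, i.e., they show cases where learning is impossible without intermediate supervision, whereas our result is positive, showing a case where learning is facilitated in the presence of intermediate supervision.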
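The construction described in the abstract can be sketched in code. The following is a minimal Python sketch, not the authors' implementation: it assumes the intermediate labels are running parities over nested prefixes of the hidden subset, which is one simple reading of "parity results of increasingly large bit-subsets"; the paper's exact decomposition may differ. The function names (subset_parity, intermediate_parities, seq2seq_pairs) and the example bits and subset are hypothetical.

```python
from typing import List, Sequence, Tuple


def subset_parity(bits: Sequence[int], subset: Sequence[int]) -> int:
    """Parity (XOR) of the input bits at the given indices: the end-to-end label."""
    parity = 0
    for i in subset:
        parity ^= bits[i]
    return parity


def intermediate_parities(bits: Sequence[int], subset: Sequence[int]) -> List[int]:
    """Parities of the nested subsets subset[:1], subset[:2], ..., subset.

    Assumption: a running-prefix decomposition stands in for the paper's
    "increasingly large bit-subsets". The last entry is the full parity.
    """
    running = 0
    steps = []
    for i in subset:
        running ^= bits[i]
        steps.append(running)
    return steps


def seq2seq_pairs(
    bits: Sequence[int], subset: Sequence[int]
) -> List[Tuple[List[int], int]]:
    """Teacher-forced training pairs for the augmented input.

    For step t, the source is the original input with the first t intermediate
    labels concatenated to it, and the target is the next intermediate label.
    """
    steps = intermediate_parities(bits, subset)
    return [(list(bits) + steps[:t], steps[t]) for t in range(len(steps))]


if __name__ == "__main__":
    bits = [1, 0, 1, 1, 0, 1]   # hypothetical input
    subset = [0, 2, 3, 5]       # hypothetical hidden subset
    for source, target in seq2seq_pairs(bits, subset):
        print(source, "->", target)
```

Each pair asks the model for only one additional XOR on top of labels it has already been given, whereas the end-to-end task asks for the parity of the whole subset in a single step.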

Related papers

Towards Understanding Multi-Round Large Language Model Reasoning: Approximability, Learnability and Generalizability [18.54202114336492]
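We study the approximation, learnability, and generalization properties of multi-round autoregressive models. We show that Transformers with a finite context window are universal approximators for steps of Turing-computable functions. We extend PAC learning to sequence generation and show that multi-round generation is learnable even when the sequence length exceeds the model's context window.
Paper reference translation (metadata) (2025-03-05T02:50:55Z)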
Mitigating Interference in the Knowledge Continuum through Attention-Guided Incremental Learning [17.236861687708096]
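'Attention-Guided Incremental Learning' (AGILE) is a rehearsal-based CL approach that incorporates compact task attention to effectively reduce interference between tasks. AGILE significantly improves generalization performance by mitigating task interference and outperforms rehearsal-based approaches in multiple CL scenarios.
Paper reference translation (metadata) (2024-05-22T20:29:15Z)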
Language Model Cascades: Token-level uncertainty and beyond [65.38515344964647]
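Recent advances in language models (LMs) have substantially improved quality on complex NLP tasks. Cascading offers a simple strategy for achieving a more favorable cost-quality trade-off. We show that incorporating token-level uncertainty into a learned post-hoc deferral rule significantly outperforms simple aggregation strategies.
Paper reference translation (metadata) (2024-04-15T21:02:48Z)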
Disentangled Latent Spaces Facilitate Data-Driven Auxiliary Learning [14.677411619418319]
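Auxiliary tasks facilitate learning in situations where data is scarce or the principal task of focus is extremely complex. We propose a novel framework, called Detaux, in which a weakly supervised disentanglement procedure is used to discover a new, unrelated auxiliary classification task. We generate the auxiliary classification task through a clustering procedure on the most disentangled subspace, obtaining a discrete set of labels.
Paper reference translation (metadata) (2023-10-13T17:40:39Z)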
Narrowing the Gap between Supervised and Unsupervised Sentence Representation Learning with Large Language Model [44.77515147970206]
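Sentence representation learning (SRL) is a fundamental task in natural language processing (NLP). Contrastive learning of Sentence Embeddings (CSE) is the mainstream technique owing to its superior performance. Previous work attributed the performance gap between supervised and unsupervised methods to differences in two representation properties (alignment and uniformity).
Paper reference translation (metadata) (2023-09-12T08:16:58Z)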
Topic-driven Distant Supervision Framework for Macro-level Discourse Parsing [72.14449502499535]
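Parsing the internal rhetorical structure of a text is a challenging problem in natural language processing. Despite recent advances in neural models, the lack of large-scale, high-quality corpora for training remains a major obstacle. Recent studies have attempted to overcome this limitation using distant supervision.
Paper reference translation (metadata) (2023-05-23T07:13:51Z)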
Enhancing Chain-of-Thoughts Prompting with Iterative Bootstrapping in Large Language Models [81.01397924280612]
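Large language models (LLMs) can achieve highly effective performance on various reasoning tasks by incorporating step-by-step chain-of-thought (CoT) prompting as demonstrations. In this paper, we introduce Iter-CoT (Iterative bootstrapping in Chain-of-Thoughts Prompting).
Paper reference translation (metadata) (2023-04-23T13:54:39Z)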
HaLP: Hallucinating Latent Positives for Skeleton-based Self-Supervised Learning of Actions [69.14257241250046]
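We propose a new contrastive learning approach for training models for skeleton-based action recognition without labels. Our key contribution is a simple module, HaLP, which hallucinates latent positives for contrastive learning. Through experiments, we show that using these generated positives within a standard contrastive learning framework leads to consistent improvements.
Paper reference translation (metadata) (2023-04-01T21:09:43Z)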
Beyond Distributional Hypothesis: Let Language Models Learn Meaning-Text Correspondence [45.9949173746044]
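We show that large-scale pre-trained language models (PLMs) do not satisfy the logical negation property (LNP). We therefore propose "meaning-matching", a novel intermediate training task designed to directly learn meaning-text correspondence. This task enables PLMs to learn lexical semantic information.
Paper reference translation (metadata) (2022-05-08T08:37:36Z)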
BLISS: Robust Sequence-to-Sequence Learning via Self-Supervised Input Representation [92.75908003533736]
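We propose BLISS, a framework-level robust sequence-to-sequence learning approach that uses self-supervised input representation. We conduct comprehensive experiments to validate the effectiveness of BLISS on various tasks, including machine translation, grammatical error correction, and text summarization.
Paper reference translation (metadata) (2022-04-16T16:19:47Z)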
Is Supervised Syntactic Parsing Beneficial for Language Understanding? An Empirical Investigation [71.70562795158625]
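Traditional NLP has long held (supervised) syntactic parsing to be necessary for successful higher-level semantic language understanding (LU). The recent advent of end-to-end neural networks, self-supervision via language modeling (LM), and their success on a wide range of LU tasks call this belief into question. In this work, we empirically investigate the usefulness of supervised syntactic parsing for semantic LU in the context of LM-pretrained Transformer networks.
Paper reference translation (metadata) (2020-08-15T21:03:36Z)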

The related papers list is generated automatically from the titles and abstracts of papers on this site.

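This page presents information on the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from its use.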