Fugu-MT Paper Translation (Summary): Neural Semi-supervised Learning for Text Classification Under Large-Scale Pretraining

Paper summary: Neural Semi-supervised Learning for Text Classification Under Large-Scale Pretraining

arxiv url: http://arxiv.org/abs/2011.08626v2
Date: Thu, 19 Nov 2020 12:43:58 GMT
Status: Translation complete
Last updated in system: 2022-09-24 16:39:39.658178
Title: Neural Semi-supervised Learning for Text Classification Under Large-Scale Pretraining
Title (reference translation): Neural Semi-supervised Learning for Text Classification Under Large-Scale Pretraining
Authors: Zijun Sun, Chun Fan, Xiaofei Sun, Yuxian Meng, Fei Wu and Jiwei Li
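Abstract summary: We study semi-supervised learning for text classification in the context of large-scale LM pretraining. Our work is an initial step toward understanding the behavior of semi-supervised learning models in the context of large-scale pretraining.
Reference score (independently computed attention score): 51.19885385587916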
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The goal of semi-supervised learning is to utilize the unlabeled, in-domain dataset U to improve models trained on the labeled dataset D. Under the context of large-scale language-model (LM) pretraining, how we can make the best use of U is poorly understood: is semi-supervised learning still beneficial with the presence of large-scale pretraining? should U be used for in-domain LM pretraining or pseudo-label generation? how should the pseudo-label based semi-supervised model be actually implemented? how different semi-supervised strategies affect performances regarding D of different sizes, U of different sizes, etc. In this paper, we conduct comprehensive studies on semi-supervised learning in the task of text classification under the context of large-scale LM pretraining. Our studies shed important lights on the behavior of semi-supervised learning methods: (1) with the presence of in-domain pretraining LM on U, open-domain LM pretraining is unnecessary; (2) both the in-domain pretraining strategy and the pseudo-label based strategy introduce significant performance boosts, with the former performing better with larger U, the latter performing better with smaller U, and the combination leading to the largest performance boost; (3) self-training (pretraining first on pseudo labels D' and then fine-tuning on D) yields better performances when D is small, while joint training on the combination of pseudo labels D' and the original dataset D yields better performances when D is large. Using semi-supervised learning strategies, we are able to achieve a performance of around 93.8% accuracy with only 50 training data points on the IMDB dataset, and a competitive performance of 96.6% with the full IMDB dataset. Our work marks an initial step in understanding the behavior of semi-supervised learning models under the context of large-scale pretraining.
Abstract (reference translation): In the context of large-scale language model (LM) pretraining, how best to make use of U is not well understood. Should it be used for in-domain LM pretraining or for pseudo-label generation? How should a pseudo-label based semi-supervised model actually be implemented? How do different semi-supervised strategies affect performance for D of different sizes, U of different sizes, and so on? In this paper, we conduct a comprehensive study of semi-supervised learning for text classification under large-scale LM pretraining. Our studies shed important lights on the behavior of semi-supervised learning methods: (1) with the presence of in-domain pretraining LM on U, open-domain LM pretraining is unnecessary; (2) both the in-domain pretraining strategy and the pseudo-label based strategy introduce significant performance boosts, with the former performing better with larger U, the latter performing better with smaller U, and the combination leading to the largest performance boost; (3) self-training (pretraining first on pseudo labels D' and then fine-tuning on D) yields better performances when D is small, while joint training on the combination of pseudo labels D' and the original dataset D yields better performances when D is large. Using semi-supervised learning strategies, we achieve around 93.8% accuracy with only 50 training data points on the IMDB dataset, and a competitive 96.6% with the full IMDB dataset. Our work is an initial step toward understanding the behavior of semi-supervised learning models in the context of large-scale pretraining.
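
As a rough illustration of the two pseudo-label strategies contrasted in point (3) of the abstract, the Python sketch below implements self-training (train on pseudo labels D' first, then fine-tune on D) and joint training (train once on D combined with D'). It is not the paper's implementation: the bag-of-words logistic regression is a stand-in assumed here in place of a large pretrained LM, and every function name, hyperparameter, and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def featurize(text):
    """Bag-of-words counts (assumed stand-in for a pretrained LM encoder)."""
    feats = defaultdict(float)
    for tok in text.lower().split():
        feats[tok] += 1.0
    return feats


def predict_proba(weights, text):
    """Probability of the positive class under a logistic model."""
    z = sum(weights.get(t, 0.0) * v for t, v in featurize(text).items())
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, weights=None, epochs=20, lr=0.5):
    """SGD logistic regression. Passing `weights` continues from an
    existing model, which is how fine-tuning is modeled in this sketch."""
    weights = dict(weights or {})
    for _ in range(epochs):
        for text, label in examples:
            p = predict_proba(weights, text)
            for t, v in featurize(text).items():
                weights[t] = weights.get(t, 0.0) + lr * (label - p) * v
    return weights


def pseudo_label(teacher, unlabeled, threshold=0.5):
    """Build D' by labeling U with the teacher; `threshold` is a
    hypothetical confidence filter (0.5 keeps every prediction)."""
    labeled = []
    for text in unlabeled:
        p = predict_proba(teacher, text)
        if max(p, 1.0 - p) >= threshold:
            labeled.append((text, 1 if p >= 0.5 else 0))
    return labeled


def self_training(D, U):
    """Train on pseudo labels D' first, then fine-tune on D."""
    d_prime = pseudo_label(train(D), U)
    student = train(d_prime)
    return train(D, weights=student)


def joint_training(D, U):
    """Train a single model on the combination of D and D'."""
    d_prime = pseudo_label(train(D), U)
    return train(D + d_prime)


# Toy data (hypothetical): two labeled reviews and four unlabeled ones.
D_toy = [("great movie", 1), ("terrible movie", 0)]
U_toy = [
    "great acting and wonderful plot",
    "terrible pacing and boring plot",
    "wonderful great film",
    "boring terrible film",
]

if __name__ == "__main__":
    teacher = train(D_toy)
    for name, model in [
        ("teacher", teacher),
        ("self-training", self_training(D_toy, U_toy)),
        ("joint training", joint_training(D_toy, U_toy)),
    ]:
        print(name, round(predict_proba(model, "wonderful"), 3))
```

In this toy setting the teacher has never seen the word "wonderful" and so cannot lean either way on it, while both student variants pick up a positive weight for it from the pseudo-labeled examples. The sketch says nothing about which strategy is better at a given size of D; that comparison is the paper's empirical finding.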

Related papers

Reasoning to Learn from Latent Thoughts [45.59740535714148]
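We show that explicitly modeling and inferring the latent thoughts underlying the text generation process can substantially improve pretraining data efficiency. We show that a 1B LM bootstraps its performance over at least three iterations and significantly outperforms baselines trained on raw data. The gains from inference scaling and EM iterations suggest new opportunities for scaling data-constrained pretraining.
Paper reference translation (metadata) (2025-03-24T16:41:23Z)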
Learning with Noisy Foundation Models [95.50968225050012]
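This paper is the first study to comprehensively understand and analyze the nature of noise in pretraining datasets. To mitigate the adverse effects of noise and improve generalization, we propose a tuning method (NMTune) that adapts the feature space.
Paper reference translation (metadata) (2024-03-11T16:22:41Z)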
A Large-scale Evaluation of Pretraining Paradigms for the Detection of Defects in Electroluminescence Solar Cell Images [3.729242965449096]
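This work is a large-scale evaluation and benchmark of various pretraining methods for solar cell defect detection. It covers supervised training with semantic segmentation, semi-supervised learning, and two self-supervised techniques. We achieve a new state of the art for SCDD and show that certain pretraining schemes yield superior performance on underrepresented classes.
Paper reference translation (metadata) (2024-02-27T15:37:15Z)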
Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks [91.15120211190519]
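This paper aims to understand the nature of noise in pretraining datasets and to mitigate its impact on downstream tasks. To reduce the adverse effects of noise, we propose a lightweight black-box tuning method (NMTune) that adapts the feature space.
Paper reference translation (metadata) (2023-09-29T06:18:15Z)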
Generation-driven Contrastive Self-training for Zero-shot Text Classification with Instruction-following LLM [31.25193238045053]
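We introduce GenCo, a novel method that leverages the strong generative power of large language models to assist in training a smaller language model. In our method, the LLM plays an important role in the self-training loop of the smaller model in two key ways. It helps develop high-quality training pairs by rewriting the input text conditioned on the predicted label.
Paper reference translation (metadata) (2023-04-24T07:35:38Z)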
An Efficient Active Learning Pipeline for Legal Text Classification [2.462514989381979]
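We propose a pipeline for effectively leveraging active learning with pretrained language models in the legal domain. We use knowledge distillation to guide the model's embeddings into a semantically meaningful space. Experiments on the Contract-NLI and LEDGAR benchmarks, adapted to the classification task, show that our method outperforms standard AL strategies.
Paper reference translation (metadata) (2022-11-15T13:07:02Z)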
A semi-supervised Teacher-Student framework for surgical tool detection and localization [2.41710192205034]
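We introduce a semi-supervised learning (SSL) framework in the surgical tool detection paradigm. In the proposed approach, a model is first trained on labeled data to initialize teacher-student joint learning. Results on the m2cai16-tool-locations dataset show the superiority of our approach across different supervised data settings.
Paper reference translation (metadata) (2022-08-21T17:21:31Z)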
Self-Supervised Pre-Training for Transformer-Based Person Re-Identification [54.55281692768765]
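Transformer-based supervised pretraining achieves strong performance in person re-identification (ReID). Because of the domain gap between ImageNet and ReID datasets, a larger pretraining dataset is usually needed to boost performance. This work aims to mitigate the gap between pretraining datasets and ReID datasets from the perspectives of data and model structure.
Paper reference translation (metadata) (2021-11-23T18:59:08Z)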
On the Transferability of Pre-trained Language Models: A Study from Artificial Datasets [74.11825654535895]
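Pretraining language models (LMs) on large-scale unlabeled text data makes strong downstream performance much easier to achieve. We investigate which specific features of the pretraining data, other than semantics, make a pretrained LM superior to a model trained from scratch on downstream tasks.
Paper reference translation (metadata) (2021-09-08T10:39:57Z)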
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks [81.99843216550306]
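We study four domains (biomedical and computer science publications, news, and reviews) and eight classification tasks. A second phase of in-domain pretraining (domain-adaptive pretraining) improves performance. Adapting to the task's unlabeled data (task-adaptive pretraining) improves performance further, even after domain-adaptive pretraining.
Paper reference translation (metadata) (2020-04-23T04:21:19Z)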

The related papers list is generated automatically from the titles and abstracts of papers on this site.
