Fugu-MT Paper Translation (Abstract): Few-Shot Learning via Learning the Representation, Provably

Paper overview: Few-Shot Learning via Learning the Representation, Provably

arxiv url: http://arxiv.org/abs/2002.09434v2
Date: Tue, 30 Mar 2021 04:06:04 GMT
Status: Translation complete
Last updated in system: 2022-12-30 01:11:48.554231
Title: Few-Shot Learning via Learning the Representation, Provably
Title (reference translation): Few-shot learning by learning the representation, provably
Authors: Simon S. Du, Wei Hu, Sham M. Kakade, Jason D. Lee, Qi Lei
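Abstract summary: This paper studies few-shot learning via representation learning. A representation is learned from $T$ source tasks with $n_1$ data per task in order to reduce the sample complexity of the target task.
Reference score (independently computed attention score): 115.7367053639605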
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper studies few-shot learning via representation learning, where one uses $T$ source tasks with $n_1$ data per task to learn a representation in order to reduce the sample complexity of a target task for which there is only $n_2 (\ll n_1)$ data. Specifically, we focus on the setting where there exists a good \emph{common representation} between source and target, and our goal is to understand how much of a sample size reduction is possible. First, we study the setting where this common representation is low-dimensional and provide a fast rate of $O\left(\frac{\mathcal{C}\left(\Phi\right)}{n_1T} + \frac{k}{n_2}\right)$; here, $\Phi$ is the representation function class, $\mathcal{C}\left(\Phi\right)$ is its complexity measure, and $k$ is the dimension of the representation. When specialized to linear representation functions, this rate becomes $O\left(\frac{dk}{n_1T} + \frac{k}{n_2}\right)$ where $d (\gg k)$ is the ambient input dimension, which is a substantial improvement over the rate without using representation learning, i.e. over the rate of $O\left(\frac{d}{n_2}\right)$. This result bypasses the $\Omega(\frac{1}{T})$ barrier under the i.i.d. task assumption, and can capture the desired property that all $n_1T$ samples from source tasks can be \emph{pooled} together for representation learning. Next, we consider the setting where the common representation may be high-dimensional but is capacity-constrained (say in norm); here, we again demonstrate the advantage of representation learning in both high-dimensional linear regression and neural network learning. Our results demonstrate representation learning can fully utilize all $n_1T$ samples from source tasks.
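Abstract (reference translation): This paper studies few-shot learning via representation learning, in which $T$ source tasks with $n_1$ data per task are used to learn a representation that reduces the sample complexity of a target task for which only $n_2 (\ll n_1)$ data are available. Specifically, we focus on the setting where a good \emph{common representation} exists between source and target, and aim to understand how much sample size reduction is possible. First, we study the setting where this common representation is low-dimensional and provide a fast rate of $O\left(\frac{\mathcal{C}\left(\Phi\right)}{n_1T} + \frac{k}{n_2}\right)$; here, $\Phi$ is the representation function class, $\mathcal{C}\left(\Phi\right)$ is its complexity measure, and $k$ is the dimension of the representation. When specialized to linear representation functions, this rate becomes $O\left(\frac{dk}{n_1T} + \frac{k}{n_2}\right)$. This result bypasses the $\Omega(\frac{1}{T})$ barrier under the i.i.d. task assumption and captures the property that all $n_1T$ samples from the source tasks can be \emph{pooled} together for representation learning. Next, we consider the setting where the common representation may be high-dimensional but is capacity-constrained (for example, in norm); here, we again demonstrate the advantage of representation learning in both high-dimensional linear regression and neural network learning. The results show that representation learning can fully utilize all $n_1T$ samples from the source tasks.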
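As a rough illustration of the linear-representation rates quoted in the abstract, the Python sketch below evaluates $\frac{dk}{n_1T} + \frac{k}{n_2}$ against the baseline $\frac{d}{n_2}$ with all constants and logarithmic factors dropped. The function names and the problem sizes are hypothetical and chosen only for illustration; they do not come from the paper.

```python
def rate_with_representation(d, k, n1, T, n2):
    """Order of the bound dk/(n1*T) + k/n2 from the abstract, constants dropped."""
    return d * k / (n1 * T) + k / n2


def rate_without_representation(d, n2):
    """Order of the baseline bound d/n2 from the abstract, constants dropped."""
    return d / n2


# Hypothetical problem sizes (assumptions for illustration only):
# ambient dimension d, representation dimension k << d,
# n1 samples for each of T source tasks, n2 << n1 target samples.
d, k, n1, T, n2 = 1000, 10, 5000, 100, 200

with_rep = rate_with_representation(d, k, n1, T, n2)
without_rep = rate_without_representation(d, n2)

print(f"with representation learning:    {with_rep:.4f}")
print(f"without representation learning: {without_rep:.4f}")
print(f"ratio (without / with):          {without_rep / with_rep:.1f}x")
```

With these sizes the source-task term $\frac{dk}{n_1T}$ is already smaller than the target term $\frac{k}{n_2}$, so the target task effectively pays for $k$ dimensions rather than $d$. This is only an order-of-magnitude comparison, not a statement about actual errors.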

Related papers

Guarantees for Nonlinear Representation Learning: Non-identical Covariates, Dependent Data, Fewer Samples [24.45016514352055]
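We study the sample complexity of learning $T+1$ functions $f_\star^{(t)} \circ g_\star$ from a function class $\mathcal{F} \times \mathcal{G}$. We show that as the number of tasks $T$ grows, both the sample requirement and the risk bound converge to those of $r$-dimensional regression.
Paper reference translation (metadata) (2024-10-15T03:20:19Z)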
Learning Orthogonal Multi-Index Models: A Fine-Grained Information Exponent Analysis [45.05072391903122]
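The information exponent plays an important role in predicting the sample complexity of online gradient descent. In multi-index models, focusing only on the lowest degree can miss key structural details. By considering both the second-order and higher-order terms, we show that the relevant subspace can first be learned from the second-order terms.
Paper reference translation (metadata) (2024-10-13T00:14:08Z)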
Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit [75.4661041626338]
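We study the problem of gradient descent learning of a single-index target function $f_*(\boldsymbol{x}) = \sigma_*\left(\langle \boldsymbol{x}, \boldsymbol{\theta} \rangle\right)$. We show that a two-layer neural network optimized by an SGD-based algorithm learns $f_*$ with a complexity that is not governed by the information exponent.
Paper reference translation (metadata) (2024-06-03T17:56:58Z)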
Metalearning with Very Few Samples Per Task [19.78398372660794]
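We study binary classification in which tasks are related by a shared representation. Here, the amount of data is measured by the number of tasks $t$ that need to be seen and the number of samples $n$ per task. Our work yields a characterization of distribution-free multitask learning and reductions between meta-learning and multitask learning.
Paper reference translation (metadata) (2023-12-21T16:06:44Z)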
Learning Hierarchical Polynomials with Three-Layer Neural Networks [56.71223169861528]
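We study the problem of learning hierarchical functions over the standard Gaussian distribution with three-layer neural networks. For a large subclass of degree-$k$ polynomials $p$, a three-layer neural network trained via layerwise gradient descent on the square loss learns the target $h$ up to vanishing test error. This work demonstrates the ability of three-layer neural networks to learn complex features and, as a result, to learn a broad class of hierarchical functions.
Paper reference translation (metadata) (2023-11-23T02:19:32Z)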
Bottleneck Structure in Learned Features: Low-Dimension vs Regularity Tradeoff [12.351756386062291]
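We formalize a balance between learning low-dimensional representations and minimizing the complexity/irregularity of the feature maps. For large depths, almost all hidden representations are approximately $R^{(0)}(f)$-dimensional, and almost all weight matrices $W_\ell$ have $R^{(0)}(f)$ singular values. Interestingly, the use of large learning rates is required to guarantee an order $O(L)$ NTK, which in turn guarantees infinite-depth convergence of the representations of almost all layers.
Paper reference translation (metadata) (2023-05-30T13:06:26Z)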
Multi-Task Imitation Learning for Linear Dynamical Systems [50.124394757116605]
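We study representation learning for efficient imitation learning over linear systems. The imitation gap over trajectories generated by the learned target policy is bounded by $\tilde{O}\left(\frac{k n_x}{H N_{\mathrm{shared}}} + \frac{k n_u}{N_{\mathrm{target}}}\right)$.
Paper reference translation (metadata) (2022-12-01T00:14:35Z)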
Neural Networks can Learn Representations with Gradient Descent [68.95262816363288]
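In certain regimes, neural networks trained by gradient descent behave like kernel methods. In practice, however, neural networks are known to strongly outperform their associated kernels.
Paper reference translation (metadata) (2022-06-30T09:24:02Z)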
High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation [89.21686761957383]
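We study the first gradient descent step on the first-layer parameters $\boldsymbol{W}$ of a two-layer network. Our results show that even a single step can yield a considerable advantage over random features.
Paper reference translation (metadata) (2022-05-03T12:09:59Z)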
On the Power of Multitask Representation Learning in Linear MDP [61.58929164172968]
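This paper analyzes the statistical benefit of multitask representation learning in linear Markov decision processes (MDPs). We prove that a simple least-squares algorithm learns a policy that is $\tilde{O}\left(H^2\sqrt{\frac{\kappa \mathcal{C}(\Phi)^2 \kappa d}{NT}+\frac{\kappa d}{n}}\right)$-suboptimal.
Paper reference translation (metadata) (2021-06-15T11:21:06Z)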
Categorical Representation Learning: Morphism is All You Need [0.0]
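We present a construction for categorical representation learning and introduce the foundations of the "$\textit{categorifier}$". Every object in a dataset $\mathcal{S}$ can be represented as a vector in $\mathbb{R}^n$ by an $\textit{encoding map}$ $E: \mathcal{O}bj(\mathcal{S}) \to \mathbb{R}^n$. As a proof of concept, we give an example of a text translator equipped with our technique and show that the categorical learning model outperforms current deep learning models.
Paper reference translation (metadata) (2021-03-26T23:47:15Z)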

The related papers list is generated automatically from the titles and abstracts of papers on this site.

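This page shows information for the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from its use (including all information and translations).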