Fugu-MT Paper Translation (Abstract): Understanding Deep Representation Learning via Layerwise Feature Compression and Discrimination

Paper overview: Understanding Deep Representation Learning via Layerwise Feature Compression and Discrimination

arxiv url: http://arxiv.org/abs/2311.02960v2
Date: Tue, 9 Jan 2024 16:16:34 GMT
Status: Translation complete
Last updated in system: 2024-01-10 19:47:32.730593
Title: Understanding Deep Representation Learning via Layerwise Feature Compression and Discrimination
Title (reference translation): Understanding deep representation learning through layerwise feature compression and discrimination
Authors: Peng Wang, Xiao Li, Can Yaras, Zhihui Zhu, Laura Balzano, Wei Hu, and Qing Qu
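Abstract summary: We show that each layer of a deep linear network progressively compresses within-class features at a geometric rate and discriminates between-class features at a linear rate. This is the first quantitative characterization of feature evolution in the hierarchical representations of deep linear networks.
Reference score (independently computed attention score): 33.273226655730326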
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Over the past decade, deep learning has proven to be a highly effective tool for learning meaningful features from raw data. However, it remains an open question how deep networks perform hierarchical feature learning across layers. In this work, we attempt to unveil this mystery by investigating the structures of intermediate features. Motivated by our empirical findings that linear layers mimic the roles of deep layers in nonlinear networks for feature learning, we explore how deep linear networks transform input data into output by investigating the output (i.e., features) of each layer after training in the context of multi-class classification problems. Toward this goal, we first define metrics to measure within-class compression and between-class discrimination of intermediate features, respectively. Through theoretical analysis of these two metrics, we show that the evolution of features follows a simple and quantitative pattern from shallow to deep layers when the input data is nearly orthogonal and the network weights are minimum-norm, balanced, and approximate low-rank: Each layer of the linear network progressively compresses within-class features at a geometric rate and discriminates between-class features at a linear rate with respect to the number of layers that data have passed through. To the best of our knowledge, this is the first quantitative characterization of feature evolution in hierarchical representations of deep linear networks. Empirically, our extensive experiments not only validate our theoretical results numerically but also reveal a similar pattern in deep nonlinear networks which aligns well with recent empirical studies. Moreover, we demonstrate the practical implications of our results in transfer learning. Our code is available at https://github.com/Heimine/PNC_DLN.
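Abstract (reference translation): Over the past decade, deep learning has proven to be a highly effective tool for learning meaningful features from raw data. However, how deep networks perform hierarchical feature learning across layers remains an open question. In this work, we attempt to resolve this question by investigating the structures of intermediate features. Motivated by the empirical finding that linear layers mimic the roles of deep layers in nonlinear networks for feature learning, we examine how deep linear networks transform input data into output by investigating the output (i.e., the features) of each layer after training on multi-class classification problems. Toward this goal, we first define metrics that measure the within-class compression and the between-class discrimination of intermediate features, respectively. Through theoretical analysis of these two metrics, we show that the evolution of features follows a simple and quantitative pattern from shallow to deep layers when the input data is nearly orthogonal and the network weights are minimum-norm, balanced, and approximately low-rank: each layer of the linear network progressively compresses within-class features at a geometric rate and discriminates between-class features at a linear rate with respect to the number of layers the data have passed through. To the best of our knowledge, this is the first quantitative characterization of feature evolution in the hierarchical representations of deep linear networks. Empirically, our extensive experiments not only validate the theoretical results numerically but also reveal a similar pattern in deep nonlinear networks, which aligns well with recent empirical studies. Moreover, we demonstrate the practical implications of our results in transfer learning. Our code is available at https://github.com/Heimine/PNC_DLN.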
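The abstract refers to metrics for within-class compression and between-class discrimination without stating their formulas. The following is a minimal sketch of how such layerwise metrics might be computed, under loudly stated assumptions: the compression measure is taken to be the ratio of within-class to between-class scatter, and the discrimination measure is taken to be one minus the largest cosine similarity between class means. These definitions, the function names, and the synthetic "layers" are illustrative choices, not the paper's confirmed definitions or code. The toy data simply shrinks within-class noise by a factor of 0.5 per layer, so it demonstrates what a geometric decay of the compression measure looks like; it does not train or simulate a deep linear network.

```python
import numpy as np


def class_means(features, labels):
    """Per-class mean vectors as the columns of a (d, K) matrix."""
    classes = np.unique(labels)
    return np.stack(
        [features[:, labels == k].mean(axis=1) for k in classes], axis=1
    )


def within_class_compression(features, labels):
    """Assumed metric: within-class scatter / between-class scatter.

    Smaller values mean that features of each class sit closer to their mean.
    """
    classes = np.unique(labels)
    means = class_means(features, labels)
    global_mean = features.mean(axis=1, keepdims=True)
    within = sum(
        np.sum((features[:, labels == k] - means[:, [i]]) ** 2)
        for i, k in enumerate(classes)
    ) / features.shape[1]
    between = np.sum((means - global_mean) ** 2) / len(classes)
    return within / between


def between_class_discrimination(features, labels):
    """Assumed metric: 1 - max cosine similarity between distinct class means."""
    means = class_means(features, labels)
    unit = means / np.linalg.norm(means, axis=0, keepdims=True)
    cos = unit.T @ unit
    np.fill_diagonal(cos, -np.inf)  # ignore each mean's similarity to itself
    return 1.0 - cos.max()


# Synthetic stand-in for layerwise features (hypothetical, not a trained network).
rng = np.random.default_rng(0)
num_classes, per_class, dim, depth = 3, 20, 8, 5
labels = np.repeat(np.arange(num_classes), per_class)
centers = np.eye(dim)[:, :num_classes]  # orthogonal class centers
noise = rng.standard_normal((dim, labels.size))
# Center the noise within each class so the class means stay fixed across layers.
for k in range(num_classes):
    noise[:, labels == k] -= noise[:, labels == k].mean(axis=1, keepdims=True)

compression = []
for layer in range(depth):
    feats = centers[:, labels] + (0.5 ** layer) * noise
    compression.append(within_class_compression(feats, labels))

# Layer-to-layer ratio of the compression measure; constant under geometric decay.
ratios = np.array(compression[1:]) / np.array(compression[:-1])
discrimination = between_class_discrimination(feats, labels)
print("compression per layer:", compression)
print("layer-to-layer ratios:", ratios)
print("discrimination at last layer:", discrimination)
```

Because the scatter is measured with squared distances, halving the noise per layer makes the compression measure fall by a constant factor of 0.25 from one layer to the next. To apply the same functions to a real model, one would replace `feats` with the output of each trained layer, arranged as a (dimension, samples) matrix.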

Related papers

Global Convergence and Rich Feature Learning in $L$-Layer Infinite-Width Neural Networks under $μ$P Parametrization [66.03821840425539]
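This paper investigates the training dynamics of $L$-layer neural networks using the tensor program framework. We show that, with SGD, these networks learn linearly independent features that deviate substantially from their initial values. This rich feature space captures relevant data information and ensures that any convergent point of the training process is a global minimum.
Paper reference translation (metadata) (2025-03-12T17:33:13Z)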
Approximating Latent Manifolds in Neural Networks via Vanishing Ideals [20.464009622419766]
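We establish a connection between manifold learning and computational algebra by showing how vanishing ideals characterize the latent manifolds of deep networks. We propose a neural architecture that truncates a pretrained network at an intermediate layer and approximates each class manifold via generators of the vanishing ideal. The resulting models have significantly fewer layers than their pretrained baselines while maintaining comparable accuracy, achieving higher throughput, and using fewer parameters.
Paper reference translation (metadata) (2025-02-20T21:23:02Z)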
Feature Averaging: An Implicit Bias of Gradient Descent Leading to Non-Robustness in Neural Networks [13.983863226803336]
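We argue that "feature averaging" is one of the principal factors contributing to the non-robustness of deep neural networks. We provide a detailed theoretical analysis of the training dynamics of gradient descent in a two-layer ReLU network for a binary classification task. We demonstrate that, given more fine-grained supervised information, a two-layer multi-class neural network can learn individual features.
Paper reference translation (metadata) (2024-10-14T09:28:32Z)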
Neural Collapse in the Intermediate Hidden Layers of Classification Neural Networks [0.0]
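Neural Collapse (NC) gives a precise description of the representations of classes in the final hidden layer of classification neural networks. This paper presents a comprehensive analysis of the emergence of NC in the intermediate hidden layers.
Paper reference translation (metadata) (2023-08-05T01:19:38Z)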
Understanding Deep Neural Networks via Linear Separability of Hidden Layers [68.23950220548417]
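We first propose Minkowski difference based linear separability measures (MD-LSMs) to evaluate the degree of linear separability of two point sets. We demonstrate that the linear separability degree of hidden-layer outputs is synchronized with the network training performance.
Paper reference translation (metadata) (2023-07-26T05:29:29Z)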
Hidden Classification Layers: Enhancing linear separability between classes in neural networks layers [0.0]
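We investigate the impact of a training approach on deep network performance. We propose a neural network architecture that induces an error function involving the outputs of all network layers.
Paper reference translation (metadata) (2023-06-09T10:52:49Z)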
Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks [49.808194368781095]
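We show that three-layer neural networks have richer feature learning capabilities than two-layer networks. This work is a step toward understanding the provable benefits of three-layer neural networks over two-layer networks in the feature learning regime.
Paper reference translation (metadata) (2023-05-11T17:19:30Z)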
A Law of Data Separation in Deep Learning [41.58856318262069]
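We study the fundamental question of how neural networks process data in their intermediate layers. Our finding is a simple and quantitative law that governs how deep neural networks separate data according to class membership.
Paper reference translation (metadata) (2022-10-31T02:25:38Z)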
Rank Diminishing in Deep Neural Networks [71.03777954670323]
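The rank of a neural network measures the information flowing across its layers. It is an instance of a key structural condition that applies across broad domains of machine learning. In neural networks, however, the intrinsic mechanism that produces low-rank structures remains unclear.
Paper reference translation (metadata) (2022-06-13T12:03:32Z)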
A neural anisotropic view of underspecification in deep learning [60.119023683371736]
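We show that the way neural networks handle the underspecification of problems depends heavily on the data representation. Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to addressing the fairness, robustness, and generalization of these systems.
Paper reference translation (metadata) (2021-04-29T14:31:09Z)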
Dual-constrained Deep Semi-Supervised Coupled Factorization Network with Enriched Prior [80.5637175255349]
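This paper proposes an enriched-prior-guided framework called the Dual-constrained Deep Semi-Supervised Coupled Factorization Network (DS2CF-Net). To extract hidden deep features, DS2CF-Net is modeled as a neural network constrained by both deep structure and geometrical structure. Our network can obtain state-of-the-art performance for representation learning and clustering.
Paper reference translation (metadata) (2020-09-08T13:10:21Z)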
ReMarNet: Conjoint Relation and Margin Learning for Small-Sample Image Classification [49.87503122462432]
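We introduce a neural network called the Relation-and-Margin learning Network (ReMarNet). Our method assembles two networks with different backbones to learn features that perform well under both of the classification mechanisms considered. Experiments on four image datasets show that the method is effective at learning discriminative features from a small set of labeled samples.
Paper reference translation (metadata) (2020-06-27T13:50:20Z)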

The related papers list is generated automatically from the titles and abstracts of the papers on this site.
