Fugu-MT paper translation (abstract): Cross Entropy in Deep Learning of Classifiers Is Unnecessary -- ISBE Error is All You Need

Paper overview: Cross Entropy in Deep Learning of Classifiers Is Unnecessary -- ISBE Error is All You Need

arxiv url: http://arxiv.org/abs/2311.16357v1
Date: Mon, 27 Nov 2023 22:40:02 GMT
Status: translation complete
Last updated in system: 2023-11-29 20:49:35.771738
Title: Cross Entropy in Deep Learning of Classifiers Is Unnecessary -- ISBE Error is All You Need
Authors: Wladyslaw Skarbek
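Abstract summary: In deep learning classifiers, the cost function usually takes the form of a combination of SoftMax and CrossEntropy functions. This work introduces the ISBE functionality, justifying the thesis about the redundancy of cross entropy computation.
Reference score (attention score computed independently by this site): 0.0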
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In deep learning classifiers, the cost function usually takes the form of a combination of SoftMax and CrossEntropy functions. The SoftMax unit transforms the scores predicted by the model network into assessments of the degree (probabilities) of an object's membership to a given class. On the other hand, CrossEntropy measures the divergence of this prediction from the distribution of target scores. This work introduces the ISBE functionality, justifying the thesis about the redundancy of cross entropy computation in deep learning of classifiers. Not only can we omit the calculation of entropy, but also, during back-propagation, there is no need to direct the error to the normalization unit for its backward transformation. Instead, the error is sent directly to the model's network. Using examples of perceptron and convolutional networks as classifiers of images from the MNIST collection, it is observed for ISBE that results are not degraded, not only with SoftMax but also with other activation functions such as Sigmoid, Tanh, or their hard variants HardSigmoid and HardTanh. Moreover, up to three percent of time is saved within the total time of forward and backward stages. The article is addressed mainly to programmers and students interested in deep model learning. For example, it illustrates in code snippets possible ways to implement ISBE units, but also formally proves that the softmax trick only applies to the class of softmax functions with relocations.
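The identity behind the abstract's "softmax trick" can be checked numerically. The sketch below is not the paper's code: it uses only the standard result that the gradient of CrossEntropy(SoftMax(z), t) with respect to the scores z equals y - t, where y = SoftMax(z) and t is a one-hot target. The helper names (`isbe_error`, `numeric_grad`) and the sample values are assumptions chosen for illustration.

```python
import math

def softmax(z):
    # Shift by the maximum score for numerical stability.
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def cross_entropy(y, t):
    # Cross entropy between predicted probabilities y and target distribution t.
    return -sum(ti * math.log(yi) for yi, ti in zip(y, t))

def isbe_error(z, t):
    # Hypothetical helper: the error y - t sent straight back to the scores,
    # without evaluating the loss or back-propagating through softmax.
    return [yi - ti for yi, ti in zip(softmax(z), t)]

def numeric_grad(z, t, eps=1e-6):
    # Central finite differences of cross_entropy(softmax(z), t) w.r.t. z.
    grad = []
    for i in range(len(z)):
        zp, zm = list(z), list(z)
        zp[i] += eps
        zm[i] -= eps
        diff = cross_entropy(softmax(zp), t) - cross_entropy(softmax(zm), t)
        grad.append(diff / (2 * eps))
    return grad

z = [2.0, -1.0, 0.5]   # example scores for three classes (illustrative values)
t = [0.0, 0.0, 1.0]    # one-hot target
max_gap = max(abs(a - b) for a, b in zip(isbe_error(z, t), numeric_grad(z, t)))
print(max_gap < 1e-6)
```

In this sketch the direct error and the numerically differentiated loss agree to within finite-difference accuracy, which is why the loss value itself is not needed to drive back-propagation. Whether the same shortcut is reasonable for Sigmoid, Tanh, or their hard variants is the empirical question the abstract reports on; the sketch does not test that.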

Related papers

Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
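We propose a new gradient-free algorithm for solving convex optimization problems. Such problems arise in medicine, physics, and machine learning. We provide convergence guarantees for the proposed algorithm under both types of noise.
Paper reference translation (metadata) (2024-11-21T10:26:17Z)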
Neural Collapse Inspired Feature-Classifier Alignment for Few-Shot Class Incremental Learning [120.53458753007851]
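Few-shot class-incremental learning (FSCIL) is a challenging problem because only a few training samples are accessible for each new class in the new sessions. We address this misalignment dilemma in FSCIL, inspired by the recently discovered phenomenon of neural collapse. We propose a neural-collapse-inspired framework for FSCIL. Experiments on the MiniImageNet, CUB-200, and CIFAR-100 datasets show that the proposed framework outperforms the state of the art.
Paper reference translation (metadata) (2023-02-06T18:39:40Z)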
Maximally Compact and Separated Features with Regular Polytope Networks [22.376196701232388]
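This paper describes how to extract features from CNNs that have the properties of inter-class separability and intra-class compactness. We obtain features similar to those produced by the well-known approach [wen2016discriminative] and other similar approaches.
Paper reference translation (metadata) (2023-01-15T15:20:57Z)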
Distinction Maximization Loss: Efficiently Improving Classification Accuracy, Uncertainty Estimation, and Out-of-Distribution Detection Simply Replacing the Loss and Calibrating [2.262407399039118]
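We propose training deterministic deep neural networks using the DisMax loss. DisMax usually outperforms all current approaches simultaneously in classification accuracy, uncertainty estimation, inference efficiency, and out-of-distribution detection.
Paper reference translation (metadata) (2022-05-12T04:37:35Z)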
Do We Really Need a Learnable Classifier at the End of Deep Neural Network? [118.18554882199676]
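This work studies the potential of training a neural network for classification with the classifier randomly initialized as an ETF (equiangular tight frame) and kept fixed during training. Experimental results show that similar performance can be obtained for image classification on balanced datasets.
Paper reference translation (metadata) (2022-03-17T04:34:28Z)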
X-model: Improving Data Efficiency in Deep Learning with A Minimax Model [78.55482897452417]
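We aim to improve data efficiency for both classification and regression settings in deep learning. To take the power of both worlds, we propose a novel X-model. The X-model plays a minimax game between the feature extractor and the task-specific heads.
Paper reference translation (metadata) (2021-10-09T13:56:48Z)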
Robust Implicit Networks via Non-Euclidean Contractions [63.91638306025768]
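Implicit neural networks show improved accuracy and a significant reduction in memory consumption. However, they can suffer from ill-posedness and convergence instability. This paper provides a new framework for designing well-posed and robust implicit neural networks.
Paper reference translation (metadata) (2021-06-06T18:05:02Z)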
Query Training: Learning a Worse Model to Infer Better Marginals in Undirected Graphical Models with Hidden Variables [11.985433487639403]
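Probabilistic graphical models (PGMs) provide a compact representation of knowledge that can be queried in a flexible way. We introduce query training (QT), a mechanism for learning a PGM that is optimized for the approximate inference algorithm it will be paired with. Experiments demonstrate that QT can be used to learn an 8-connected grid Markov random field with hidden variables.
Paper reference translation (metadata) (2020-06-11T20:34:32Z)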
Aligned Cross Entropy for Non-Autoregressive Machine Translation [120.15069387374717]
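We propose aligned cross entropy (AXE) as an alternative loss function for training non-autoregressive models. Training conditional masked language models (CMLMs) with AXE substantially improves performance on major WMT benchmarks.
Paper reference translation (metadata) (2020-04-03T16:24:47Z)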

The related papers list is generated automatically from the titles and abstracts of papers on this site.

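The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from its use.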