Fugu-MT Paper Translation (Abstract): Efficient Beam Tree Recursion

Paper summary: Efficient Beam Tree Recursion

arxiv url: http://arxiv.org/abs/2307.10779v1
Date: Thu, 20 Jul 2023 11:29:17 GMT
Status: Translation complete
Last updated in system: 2023-07-21 13:29:46.318531
Title: Efficient Beam Tree Recursion
Title (reference translation): Efficient Beam Tree Recursion
Authors: Jishnu Ray Chowdhury, Cornelia Caragea
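Abstract summary: Beam Tree Recursive Neural Network (BT-RvNN) was proposed as a simple extension of Gumbel Tree RvNN. We propose strategies that reduce the memory usage of BT-RvNN by 10-16 times.
Reference score (attention score computed independently by this site): 54.958581892688095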
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Beam Tree Recursive Neural Network (BT-RvNN) was recently proposed as a simple extension of Gumbel Tree RvNN and it was shown to achieve state-of-the-art length generalization performance in ListOps while maintaining comparable performance on other tasks. However, although not the worst in its kind, BT-RvNN can be still exorbitantly expensive in memory usage. In this paper, we identify the main bottleneck in BT-RvNN's memory usage to be the entanglement of the scorer function and the recursive cell function. We propose strategies to remove this bottleneck and further simplify its memory usage. Overall, our strategies not only reduce the memory usage of BT-RvNN by $10$-$16$ times but also create a new state-of-the-art in ListOps while maintaining similar performance in other tasks. In addition, we also propose a strategy to utilize the induced latent-tree node representations produced by BT-RvNN to turn BT-RvNN from a sentence encoder of the form $f:\mathbb{R}^{n \times d} \rightarrow \mathbb{R}^{d}$ into a sequence contextualizer of the form $f:\mathbb{R}^{n \times d} \rightarrow \mathbb{R}^{n \times d}$. Thus, our proposals not only open up a path for further scalability of RvNNs but also standardize a way to use BT-RvNNs as another building block in the deep learning toolkit that can be easily stacked or interfaced with other popular models such as Transformers and Structured State Space models.
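Abstract (reference translation): Beam Tree Recursive Neural Network (BT-RvNN) was recently proposed as a simple extension of Gumbel Tree RvNN, and it has been shown to achieve state-of-the-art length generalization performance on ListOps while maintaining comparable performance on other tasks. However, although it is not the worst of its kind, BT-RvNN is extremely expensive in memory usage. In this paper, we show that the main bottleneck in BT-RvNN's memory usage is the entanglement of the scorer function and the recursive cell function. We propose strategies to remove this bottleneck and further simplify memory usage. Overall, our strategies not only reduce the memory usage of BT-RvNN by $10$-$16$ times but also set a new state of the art on ListOps while maintaining similar performance on other tasks. In addition, we propose a way to use the latent-tree node representations produced by BT-RvNN to turn BT-RvNN from a sentence encoder of the form $f:\mathbb{R}^{n \times d} \rightarrow \mathbb{R}^{d}$ into a sequence contextualizer of the form $f:\mathbb{R}^{n \times d} \rightarrow \mathbb{R}^{n \times d}$. Thus, our proposals not only open a path toward further scaling of RvNNs but also standardize a way to use BT-RvNN as another building block in the deep learning toolkit that can easily be stacked or interfaced with other popular models such as Transformers and Structured State Space models.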
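
To make the two function shapes in the abstract concrete, here is a minimal Python sketch contrasting a sentence encoder of shape (n x d) -> (d) with a sequence contextualizer of shape (n x d) -> (n x d). It does not implement BT-RvNN or any method from the paper: the mean pooling and the averaging step are stand-ins chosen only to show the input and output shapes, and the names `encode_sentence` and `contextualize_sequence` are hypothetical.

```python
from typing import List

Vector = List[float]
Sequence = List[Vector]  # n rows, each of dimension d


def encode_sentence(x: Sequence) -> Vector:
    """Hypothetical sentence encoder, (n x d) -> (d).

    Mean pooling over positions is a stand-in, not the paper's method.
    """
    n = len(x)
    d = len(x[0])
    return [sum(row[j] for row in x) / n for j in range(d)]


def contextualize_sequence(x: Sequence) -> Sequence:
    """Hypothetical sequence contextualizer, (n x d) -> (n x d).

    Each position is averaged with the pooled summary; a stand-in only.
    """
    summary = encode_sentence(x)
    return [[(v + s) / 2 for v, s in zip(row, summary)] for row in x]


x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # n = 3, d = 2
pooled = encode_sentence(x)               # one vector of length d
contextual = contextualize_sequence(x)    # n vectors of length d
print(len(pooled))
print(len(contextual), len(contextual[0]))
```

The point of the contrast is that the second form returns one vector per input position, so its output can be passed to another layer that expects a sequence.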

Related papers

Optimal Gradient Checkpointing for Sparse and Recurrent Architectures using Off-Chip Memory [0.8321953606016751]
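This paper proposes memory-efficient gradient checkpointing strategies suited to a general class of sparse RNNs and spiking neural networks. Double Checkpointing was found to be the most effective method, optimizing the use of local memory resources while minimizing recomputation overhead.
Paper reference translation (metadata) (2024-12-16T14:23:31Z)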
GhostRNN: Reducing State Redundancy in RNN with Cheap Operations [66.14054138609355]
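This paper proposes GhostRNN, an efficient RNN architecture. Experiments on KWS and SE tasks show that the proposed GhostRNN significantly reduces memory usage (40%) and computation cost while keeping performance similar.
Paper reference translation (metadata) (2024-11-20T11:37:14Z)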
Topology-aware Embedding Memory for Continual Learning on Expanding Networks [63.35819388164267]
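This paper proposes a framework that uses memory replay techniques to address the memory explosion problem. PDGNNs with Topology-aware Embedding Memory (TEM) outperform state-of-the-art techniques.
Paper reference translation (metadata) (2024-01-24T03:03:17Z)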
Recursion in Recursion: Two-Level Nested Recursion for Length Generalization with Scalability [76.62673276574668]
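Binary Balanced Tree RvNNs (BBT-RvNNs) perform sequence composition according to a balanced binary tree structure. BBT-RvNNs are efficient and scalable on long-sequence tasks such as Long Range Arena (LRA). RvNNs that succeed on ListOps (e.g., Beam Tree RvNN) are generally several times more expensive than RNNs.
Paper reference translation (metadata) (2023-11-08T04:20:56Z)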
Towards Zero Memory Footprint Spiking Neural Network Training [7.4331790419913455]
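Spiking neural networks (SNNs) process information using discrete-time events called spikes rather than continuous values. This paper proposes an innovative framework characterized by a remarkably low memory footprint. Our design achieves a $\mathbf{58.65\times}$ reduction in memory usage compared with the current SNN node.
Paper reference translation (metadata) (2023-08-16T19:49:24Z)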
Beam Tree Recursive Cells [54.958581892688095]
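This paper proposes the Beam Tree Recursive Cell (BT-Cell), which extends recursive neural networks (RvNNs) with beam search for latent structure induction. The proposed models are evaluated on different out-of-distribution splits in both synthetic and realistic data.
Paper reference translation (metadata) (2023-05-31T16:20:04Z)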
Pruned RNN-T for fast, memory-efficient ASR training [20.646465940322763]
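The RNN-Transducer (RNN-T) framework for speech recognition has been growing in popularity. One drawback of RNN-T is that its loss function is relatively slow to compute and can use a lot of memory. This paper proposes a faster and more memory-efficient method for computing the RNN-T loss.
Paper reference translation (metadata) (2022-06-23T12:18:03Z)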
RNN Training along Locally Optimal Trajectories via Frank-Wolfe Algorithm [50.76576946099215]
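We propose a novel and efficient training method for RNNs that iteratively seeks a local minimum on the loss surface within a small region. We develop the new RNN training method and empirically observe that, even with the additional cost, the overall training cost is lower than that of back-propagation.
Paper reference translation (metadata) (2020-10-12T01:59:18Z)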

The related papers list is generated automatically from the titles and abstracts of papers on this site.