Fugu-MT Paper Translation (Abstract): Dual-Flattening Transformers through Decomposed Row and Column Queries for Semantic Segmentation

Paper abstract: Dual-Flattening Transformers through Decomposed Row and Column Queries for Semantic Segmentation

arxiv url: http://arxiv.org/abs/2201.09139v1
Date: Sat, 22 Jan 2022 22:38:15 GMT
Status: Translation complete
Last updated in system: 2022-01-25 14:34:47.303906
Title: Dual-Flattening Transformers through Decomposed Row and Column Queries for Semantic Segmentation
Title (reference translation): Dual-Flattening Transformers with Decomposed Row and Column Queries for Semantic Segmentation
Authors: Ying Wang, Chiuman Ho, Wenju Xu, Ziwei Xuan, Xudong Liu and Guo-Jun Qi
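Abstract summary: This paper proposes the Dual-Flattening Transformer (DFlatFormer) to enable high-resolution output. Experiments on the ADE20K and Cityscapes datasets demonstrate the superiority of the proposed dual-flattening transformer architecture.
Reference score (independently computed attention score): 50.321277476317974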
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: It is critical to obtain high resolution features with long range dependency for dense prediction tasks such as semantic segmentation. To generate high-resolution output of size $H\times W$ from a low-resolution feature map of size $h\times w$ ($hw\ll HW$), a naive dense transformer incurs an intractable complexity of $\mathcal{O}(hwHW)$, limiting its application on high-resolution dense prediction. We propose a Dual-Flattening Transformer (DFlatFormer) to enable high-resolution output by reducing complexity to $\mathcal{O}(hw(H+W))$ that is multiple orders of magnitude smaller than the naive dense transformer. Decomposed queries are presented to retrieve row and column attentions tractably through separate transformers, and their outputs are combined to form a dense feature map at high resolution. To this end, the input sequence fed from an encoder is row-wise and column-wise flattened to align with decomposed queries by preserving their row and column structures, respectively. Row and column transformers also interact with each other to capture their mutual attentions with the spatial crossings between rows and columns. We also propose to perform attentions through efficient grouping and pooling to further reduce the model complexity. Extensive experiments on ADE20K and Cityscapes datasets demonstrate the superiority of the proposed dual-flattening transformer architecture with higher mIoUs.
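Abstract (reference translation): For dense prediction tasks such as semantic segmentation, it is important to obtain high-resolution features with long-range dependencies. To generate a high-resolution output of size $H\times W$ from a low-resolution feature map of size $h\times w$ ($hw\ll HW$), a naive dense transformer incurs an intractable complexity of $\mathcal{O}(hwHW)$, which limits its application to high-resolution dense prediction. This work proposes the Dual-Flattening Transformer (DFlatFormer), which enables high-resolution output by reducing the complexity to $\mathcal{O}(hw(H+W))$. Decomposed queries retrieve row and column attentions through separate transformers, and their outputs are combined to form a dense feature map at high resolution. To this end, the input sequence fed from the encoder is flattened row-wise and column-wise, preserving the row and column structures respectively, so that it aligns with the decomposed queries. The row and column transformers also interact with each other to capture their mutual attentions at the spatial crossings between rows and columns. Attention is additionally performed through efficient grouping and pooling to further reduce model complexity. Extensive experiments on the ADE20K and Cityscapes datasets demonstrate the superiority of the dual-flattening transformer architecture, which achieves higher mIoU.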
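The complexity argument in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`cross_attention`, `attention_pairs`), the randomly initialized queries, the single-head attention, and the broadcast-add used to combine the row and column outputs are all assumptions made for illustration, since the abstract does not specify them. The sketch also omits positional encodings, the interaction between the row and column transformers, and the grouping and pooling steps.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product attention: (n_q, d), (n_k, d) -> (n_q, d)."""
    scores = queries @ keys.T / np.sqrt(queries.shape[-1])
    return softmax(scores, axis=-1) @ values


def attention_pairs(h, w, H, W):
    """Count query-key pairs for a dense decoder and a row/column decomposed one."""
    dense = (H * W) * (h * w)        # one query per output pixel
    decomposed = (H + W) * (h * w)   # H row queries plus W column queries
    return dense, decomposed


rng = np.random.default_rng(0)
h, w, H, W, d = 4, 6, 16, 24, 8      # toy sizes, chosen only for the example

features = rng.standard_normal((h, w, d))                 # low-resolution encoder output
row_flat = features.reshape(h * w, d)                     # row-wise (row-major) flattening
col_flat = features.transpose(1, 0, 2).reshape(h * w, d)  # column-wise flattening

row_queries = rng.standard_normal((H, d))  # hypothetical stand-in for learned row queries
col_queries = rng.standard_normal((W, d))  # hypothetical stand-in for learned column queries

row_out = cross_attention(row_queries, row_flat, row_flat)  # shape (H, d)
col_out = cross_attention(col_queries, col_flat, col_flat)  # shape (W, d)

# Assumed combination rule: broadcast-add row and column outputs into an (H, W, d) map.
dense_map = row_out[:, None, :] + col_out[None, :, :]

dense_cost, decomposed_cost = attention_pairs(h, w, H, W)
print(dense_map.shape)              # (16, 24, 8)
print(dense_cost, decomposed_cost)  # 9216 960
```

The saving factor is $HW/(H+W)$; for example, with $H=W=512$ the decomposed form scores 256 times fewer query-key pairs than the dense form. Because this sketch has no positional encodings, the two flattening orders give the same attention result here and are shown only to make the two input layouts concrete.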

Related papers

HeterRec: Heterogeneous Information Transformer for Scalable Sequential Recommendation [21.435064492654494]
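HeterRec is a sequential recommendation model that integrates item-side heterogeneous features. HeterRec incorporates HTFL and a Hierarchical Causal Transformer layer (HCT). Extensive experiments on both offline and online datasets show that the HeterRec model achieves superior performance.
Paper reference translation (metadata) (2025-03-03T12:23:54Z)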
MemoryFormer: Minimize Transformer Computation by Removing Fully-Connected Layers [43.39466934693055]
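We propose MemoryFormer, a novel transformer architecture that significantly reduces computational complexity (FLOPs) from a new perspective. This is achieved by using an alternative method of feature transformation to replace the linear projections of fully-connected layers. Experiments on various benchmarks demonstrate the effectiveness of the proposed method.
Paper reference translation (metadata) (2024-11-20T02:41:53Z)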
Separations in the Representational Capabilities of Transformers and Recurrent Architectures [27.783705012503237]
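We analyze the differences in the representational capabilities of Transformers and RNNs across several tasks of practical relevance. We show that a one-layer Transformer of logarithmic width can perform index lookup, whereas an RNN requires a hidden state of linear size. We also show that a two-layer Transformer of logarithmic size can implement the nearest neighbor algorithm in its forward pass.
Paper reference translation (metadata) (2024-06-13T17:31:30Z)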
RegFormer: An Efficient Projection-Aware Transformer Network for Large-Scale Point Cloud Registration [73.69415797389195]
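We propose an end-to-end transformer network (RegFormer) for large-scale point cloud registration. Specifically, a projection-aware hierarchical transformer is proposed to capture long-range dependencies and filter outliers. The transformer has linear complexity, which guarantees high efficiency even for large-scale scenes.
Paper reference translation (metadata) (2023-03-22T08:47:37Z)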
DBA: Efficient Transformer with Dynamic Bilinear Low-Rank Attention [53.02648818164273]
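We propose an efficient and effective attention mechanism called Dynamic Bilinear Low-Rank Attention (DBA). DBA compresses the sequence length with input-sensitive dynamic projection matrices, achieving linear time and space complexity. Experiments on tasks with various sequence-length conditions show that DBA achieves state-of-the-art performance.
Paper reference translation (metadata) (2022-11-24T03:06:36Z)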
Time-rEversed diffusioN tEnsor Transformer: A new TENET of Few-Shot Object Detection [35.54153749138406]
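We propose the Time-rEversed diffusioN tEnsor Transformer (TENET). We also propose a Transformer Relation Head (TRH) equipped with higher-order representations, which encodes correlations between query regions and the entire support set. Our model achieves state-of-the-art results on PASCAL VOC, FSOD, and COCO.
Paper reference translation (metadata) (2022-10-30T17:40:12Z)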
Diffuser: Efficient Transformers with Multi-hop Attention Diffusion for Long Sequences [16.066338004414092]
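Diffuser is a new efficient Transformer for sequence-to-sequence modeling. It incorporates all token interactions within a single attention layer while maintaining low computation and memory costs. By analyzing its graph expander property from a spectral perspective, we show its ability to approximate full attention.
Paper reference translation (metadata) (2022-10-21T08:13:34Z)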
Sketching as a Tool for Understanding and Accelerating Self-attention for Long Sequences [52.6022911513076]
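Transformer-based models are inefficient at processing long sequences because of the quadratic space and time complexity of the self-attention module. Linformer and Informer were proposed to reduce the quadratic complexity to linear (modulo logarithmic factors) through low-dimensional projection and row selection, respectively. Based on a theoretical analysis, we propose Skeinformer to accelerate self-attention and further improve the accuracy of matrix approximation to self-attention.
Paper reference translation (metadata) (2021-12-10T06:58:05Z)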
Combiner: Full Attention Transformer with Sparse Computation Cost [142.10203598824964]
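We propose Combiner, which provides full attention capability in each attention head while keeping computational complexity low. We show that most sparse attention patterns used in existing sparse transformers can inspire the design of such a factorization for full attention. Experimental evaluation on both autoregressive and bidirectional sequence tasks demonstrates the effectiveness of this approach.
Paper reference translation (metadata) (2021-07-12T22:43:11Z)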
Cluster-Former: Clustering-based Sparse Transformer for Long-Range Dependency Encoding [90.77031668988661]
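Cluster-Former is a novel clustering-based sparse Transformer that performs attention across chunked sequences. The proposed framework is pivoted on two unique types of Transformer layer: the Sliding-Window Layer and the Cluster-Former Layer. Experiments show that Cluster-Former achieves state-of-the-art performance on several major QA benchmarks.
Paper reference translation (metadata) (2020-09-13T22:09:30Z)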

The related papers list is generated automatically from the titles and abstracts of papers on this site.

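This page shows information about the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and assumes no responsibility for any results arising from the use of this site (including all information and translations).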