Fugu-MT Paper Translation (Abstract): Visual Attention Methods in Deep Learning: An In-Depth Survey

Paper abstract: Visual Attention Methods in Deep Learning: An In-Depth Survey

arxiv url: http://arxiv.org/abs/2204.07756v3
Date: Sun, 5 May 2024 18:44:14 GMT
Status: Translation complete
System updated: 2024-05-08 03:49:02.128527
Title: Visual Attention Methods in Deep Learning: An In-Depth Survey
Authors: Mohammed Hassanin, Saeed Anwar, Ibrahim Radwan, Fahad S Khan, Ajmal Mian
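Abstract summary: Inspired by the human cognitive system, attention is a mechanism that imitates human cognitive awareness of specific information. Deep learning has employed attention to boost performance in many applications. However, the literature lacks a comprehensive survey on attention techniques to guide researchers in employing attention in their deep models.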
Reference score (independently computed attention score): 37.18104595529633
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Inspired by the human cognitive system, attention is a mechanism that imitates the human cognitive awareness about specific information, amplifying critical details to focus more on the essential aspects of data. Deep learning has employed attention to boost performance for many applications. Interestingly, the same attention design can suit processing different data modalities and can easily be incorporated into large networks. Furthermore, multiple complementary attention mechanisms can be incorporated into one network. Hence, attention techniques have become extremely attractive. However, the literature lacks a comprehensive survey on attention techniques to guide researchers in employing attention in their deep models. Note that, besides being demanding in terms of training data and computational resources, transformers only cover a single category in self-attention out of the many categories available. We fill this gap and provide an in-depth survey of 50 attention techniques, categorizing them by their most prominent features. We initiate our discussion by introducing the fundamental concepts behind the success of the attention mechanism. Next, we furnish some essentials such as the strengths and limitations of each attention category, describe their fundamental building blocks, basic formulations with primary usage, and applications specifically for computer vision. We also discuss the challenges and general open questions related to attention mechanisms. Finally, we recommend possible future research directions for deep attention. All the information about visual attention methods in deep learning is provided at https://github.com/saeed-anwar/VisualAttention.


Related papers