Attention mechanisms and deep learning for machine vision: A survey of
the state of the art
- URL: http://arxiv.org/abs/2106.07550v1
- Date: Thu, 3 Jun 2021 10:23:32 GMT
- Title: Attention mechanisms and deep learning for machine vision: A survey of
the state of the art
- Authors: Abdul Mueed Hafiz, Shabir Ahmad Parah, Rouf Ul Alam Bhat
- Abstract summary: Vision transformers (ViTs) pose a serious challenge to established deep-learning-based machine vision techniques.
Some recent works suggest that combining these two fields can yield systems with the advantages of both.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the advent of state-of-the-art, nature-inspired, purely
attention-based models, i.e. transformers, and their success in natural
language processing (NLP), their extension to machine vision (MV) tasks was
inevitable and widely anticipated. Subsequently, vision transformers (ViTs)
were introduced, and they pose a serious challenge to established deep
learning based machine vision techniques. However, purely attention-based
models/architectures such as transformers require huge amounts of data, long
training times and large computational resources. Some recent works suggest
that combining these two fields can yield systems with the advantages of
both. Accordingly, this state-of-the-art survey is presented in the hope
that it will help readers obtain useful information about this interesting
and promising research area. A gentle introduction to attention mechanisms
is given, followed by a discussion of the popular attention-based deep
architectures. Subsequently, the major categories of the intersection of
attention mechanisms and deep learning for machine vision (MV) are
discussed. Finally, the major algorithms, issues and trends within the scope
of the paper are discussed.
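The transformer models the survey covers are built on scaled dot-product attention. As a minimal illustration (with random toy matrices; the dimensions and data are placeholders, not from the paper), the core computation softmax(QKᵀ/√d_k)V can be sketched as:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # (n_q, n_k) similarity scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # numerically stable row softmax
    return weights @ V                               # attention-weighted sum of values

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))   # 4 query tokens, head dimension 8
K = rng.standard_normal((6, 8))   # 6 key tokens
V = rng.standard_normal((6, 8))   # one value vector per key
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8): one output vector per query
```

Each output row is a convex combination of the value vectors, with weights given by query-key similarity; this is the building block that ViTs apply over image patches.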
Related papers
- Enhancing Efficiency in Vision Transformer Networks: Design Techniques and Insights [5.798431829723857]
This paper provides a comprehensive exploration of techniques and insights for designing attention mechanisms in Vision Transformer (ViT) networks.
We present a systematic taxonomy of various attention mechanisms within ViTs, employing redesigned approaches.
The analysis includes an exploration of the novelty, strengths, weaknesses, and an in-depth evaluation of the different proposed strategies.
arXiv Detail & Related papers (2024-03-28T23:31:59Z)
- Integration and Performance Analysis of Artificial Intelligence and Computer Vision Based on Deep Learning Algorithms [5.734290974917728]
This paper focuses on the analysis of the application effectiveness of the integration of deep learning and computer vision technologies.
Deep learning achieves a historic breakthrough by constructing hierarchical neural networks, enabling end-to-end feature learning and semantic understanding of images.
Successful experience in the field of computer vision provides strong support for training deep learning algorithms.
arXiv Detail & Related papers (2023-12-20T09:37:06Z)
- Neural architecture impact on identifying temporally extended Reinforcement Learning tasks [0.0]
We present attention-based architectures in the reinforcement learning (RL) domain that perform well on the OpenAI Gym Atari 2600 game suite.
In attention-based models, extracting the attention map and overlaying it onto the input images allows direct observation of the information the agent uses to select actions.
In addition, motivated by recent developments in attention-based video-classification models using the Vision Transformer, we also develop a Vision Transformer-based architecture for the image-based RL domain.
arXiv Detail & Related papers (2023-10-04T21:09:19Z)
- Review of Large Vision Models and Visual Prompt Engineering [50.63394642549947]
This review aims to summarize the methods employed in the computer vision domain for large vision models and visual prompt engineering.
We present influential large models in the visual domain and a range of prompt engineering methods employed on these models.
arXiv Detail & Related papers (2023-07-03T08:48:49Z)
- AttentionViz: A Global View of Transformer Attention [60.82904477362676]
We present a new visualization technique designed to help researchers understand the self-attention mechanism in transformers.
The main idea behind our method is to visualize a joint embedding of the query and key vectors used by transformer models to compute attention.
We create an interactive visualization tool, AttentionViz, based on these joint query-key embeddings.
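The joint query-key embedding idea above can be sketched in miniature. AttentionViz itself uses nonlinear projections such as t-SNE/UMAP on real transformer activations; here, as a stand-in, random vectors for one hypothetical attention head are projected with PCA into a single shared 2-D space:

```python
import numpy as np

# Hypothetical per-head query/key vectors (random placeholders, not real
# transformer activations): 32 tokens, head dimension 64.
rng = np.random.default_rng(1)
queries = rng.standard_normal((32, 64))
keys = rng.standard_normal((32, 64))

# Joint embedding: place queries and keys in ONE shared low-dimensional
# space, so nearby query-key pairs correspond to high attention scores.
points = np.vstack([queries, keys])            # (64, 64) stacked Q and K
centered = points - points.mean(axis=0)
# PCA via SVD as a simple stand-in for the t-SNE/UMAP used in the paper.
_, _, Vt = np.linalg.svd(centered, full_matrices=False)
coords = centered @ Vt[:2].T                   # (64, 2) joint 2-D coordinates

q2d, k2d = coords[:32], coords[32:]            # split back into queries/keys
print(q2d.shape, k2d.shape)  # (32, 2) (32, 2)
```

Plotting `q2d` and `k2d` in one scatter plot (colored by role) is the kind of view the tool builds interactively per head.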
arXiv Detail & Related papers (2023-05-04T23:46:49Z)
- Deep Learning to See: Towards New Foundations of Computer Vision [88.69805848302266]
This book criticizes the supposed scientific progress in the field of computer vision.
It proposes the investigation of vision within the framework of information-based laws of nature.
arXiv Detail & Related papers (2022-06-30T15:20:36Z)
- Attention Mechanisms in Computer Vision: A Survey [75.6074182122423]
We provide a comprehensive review of various attention mechanisms in computer vision.
We categorize them according to approach, such as channel attention, spatial attention, temporal attention and branch attention.
We suggest future directions for attention mechanism research.
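Channel attention, one of the categories named above, can be illustrated with a Squeeze-and-Excitation-style sketch (a common instance of that category; the weights and feature map here are random placeholders): squeeze each channel to a scalar by global average pooling, pass the result through a small bottleneck MLP, and use sigmoid gates to rescale the channels.

```python
import numpy as np

def channel_attention(feature_map, w1, w2):
    """SE-style channel attention on a (C, H, W) feature map."""
    squeeze = feature_map.mean(axis=(1, 2))        # (C,) global average pool
    hidden = np.maximum(w1 @ squeeze, 0.0)         # ReLU bottleneck, (C/r,)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))    # sigmoid gates in (0, 1), (C,)
    return feature_map * gate[:, None, None]       # rescale each channel

rng = np.random.default_rng(2)
C, H, W, r = 16, 8, 8, 4                           # r = bottleneck reduction ratio
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1        # placeholder MLP weights
w2 = rng.standard_normal((C, C // r)) * 0.1
y = channel_attention(x, w1, w2)
print(y.shape)  # (16, 8, 8)
```

Spatial attention follows the same pattern with the roles swapped: pool over channels and gate each spatial location instead of each channel.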
arXiv Detail & Related papers (2021-11-15T09:18:40Z)
- Can machines learn to see without visual databases? [93.73109506642112]
This paper focuses on developing machines that learn to see without needing to handle visual databases.
This might open the doors to a truly competitive track concerning deep learning technologies for vision.
arXiv Detail & Related papers (2021-10-12T13:03:54Z)
- Threat of Adversarial Attacks on Deep Learning in Computer Vision: Survey II [86.51135909513047]
Deep Learning is vulnerable to adversarial attacks that can manipulate its predictions.
This article reviews the contributions made by the computer vision community in adversarial attacks on deep learning.
It provides definitions of technical terminologies for non-experts in this domain.
arXiv Detail & Related papers (2021-08-01T08:54:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.