Related papers: Deep Reinforcement Learning in Computer Vision: A Comprehensive Survey

Deep Reinforcement Learning in Computer Vision: A Comprehensive Survey

URL: http://arxiv.org/abs/2108.11510v1
Date: Wed, 25 Aug 2021 23:01:48 GMT
Title: Deep Reinforcement Learning in Computer Vision: A Comprehensive Survey
Authors: Ngan Le, Vidhiwar Singh Rathour, Kashu Yamazaki, Khoa Luu, Marios Savvides
Abstract summary: Deep reinforcement learning augments the reinforcement learning framework and utilizes the powerful representation of deep neural networks. Recent works have demonstrated the remarkable successes of deep reinforcement learning in various domains including finance, medicine, healthcare, video games, robotics, and computer vision.
Score: 29.309914600633032
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Deep reinforcement learning augments the reinforcement learning framework and utilizes the powerful representation of deep neural networks. Recent works have demonstrated the remarkable successes of deep reinforcement learning in various domains including finance, medicine, healthcare, video games, robotics, and computer vision. In this work, we provide a detailed review of recent and state-of-the-art research advances of deep reinforcement learning in computer vision. We start with comprehending the theories of deep learning, reinforcement learning, and deep reinforcement learning. We then propose a categorization of deep reinforcement learning methodologies and discuss their advantages and limitations. In particular, we divide deep reinforcement learning into seven main categories according to their applications in computer vision, i.e. (i)landmark localization (ii) object detection; (iii) object tracking; (iv) registration on both 2D image and 3D image volumetric data (v) image segmentation; (vi) videos analysis; and (vii) other applications. Each of these categories is further analyzed with reinforcement learning techniques, network design, and performance. Moreover, we provide a comprehensive analysis of the existing publicly available datasets and examine source code availability. Finally, we present some open issues and discuss future research directions on deep reinforcement learning in computer vision

Related papers

A Decade of Deep Learning: A Survey on The Magnificent Seven [19.444198085817543]
Deep learning has fundamentally reshaped the landscape of artificial intelligence over the past decade. We present a comprehensive overview of the most influential deep learning algorithms selected through a broad-based survey of the field. Our discussion centers on pivotal architectures, including Residual Networks, Transformers, Generative Adversarial Networks, Variational Autoencoders, Graph Neural Networks, Contrastive Language-Image Pre-training, and Diffusion models.
arXiv Detail & Related papers (2024-12-13T17:55:39Z)
A Comprehensive Survey on Underwater Image Enhancement Based on Deep Learning [51.7818820745221]
Underwater image enhancement (UIE) presents a significant challenge within computer vision research. Despite the development of numerous UIE algorithms, a thorough and systematic review is still absent.
arXiv Detail & Related papers (2024-05-30T04:46:40Z)
Integration and Performance Analysis of Artificial Intelligence and Computer Vision Based on Deep Learning Algorithms [5.734290974917728]
This paper focuses on the analysis of the application effectiveness of the integration of deep learning and computer vision technologies. Deep learning achieves a historic breakthrough by constructing hierarchical neural networks, enabling end-to-end feature learning and semantic understanding of images. The successful experiences in the field of computer vision provide strong support for training deep learning algorithms.
arXiv Detail & Related papers (2023-12-20T09:37:06Z)
Deep Neural Networks in Video Human Action Recognition: A Review [21.00217656391331]
Video behavior recognition is one of the most foundational tasks of computer vision. Deep neural networks are built for recognizing pixel-level information such as images with RGB, RGB-D, or optical flow formats. In our article, the performance of deep neural networks surpassed most of the techniques in the feature learning and extraction tasks.
arXiv Detail & Related papers (2023-05-25T03:54:41Z)
Hyperbolic Deep Learning in Computer Vision: A Survey [20.811974050049365]
hyperbolic space has gained rapid traction for learning in computer vision. We provide a categorization and in-depth overview of current literature on hyperbolic learning for computer vision. We outline how hyperbolic learning is performed in all themes and discuss the main research problems that benefit from current advances in hyperbolic learning for computer vision.
arXiv Detail & Related papers (2023-05-11T07:14:23Z)
Deep Learning to See: Towards New Foundations of Computer Vision [88.69805848302266]
This book criticizes the supposed scientific progress in the field of computer vision. It proposes the investigation of vision within the framework of information-based laws of nature.
arXiv Detail & Related papers (2022-06-30T15:20:36Z)
Deep Learning for Visual Speech Analysis: A Survey [54.53032361204449]
This paper presents a review of recent progress in deep learning methods on visual speech analysis. We cover different aspects of visual speech, including fundamental problems, challenges, benchmark datasets, a taxonomy of existing methods, and state-of-the-art performance.
arXiv Detail & Related papers (2022-05-22T14:44:53Z)
Tensor Methods in Computer Vision and Deep Learning [120.3881619902096]
tensors, or multidimensional arrays, are data structures that can naturally represent visual data of multiple dimensions. With the advent of the deep learning paradigm shift in computer vision, tensors have become even more fundamental. This article provides an in-depth and practical review of tensors and tensor methods in the context of representation learning and deep learning.
arXiv Detail & Related papers (2021-07-07T18:42:45Z)
D2RL: Deep Dense Architectures in Reinforcement Learning [47.67475810050311]
We take inspiration from successful architectural choices in computer vision and generative modelling. We investigate the use of deeper networks and dense connections for reinforcement learning on a variety of simulated robotic learning benchmark environments.
arXiv Detail & Related papers (2020-10-19T01:27:07Z)
Distilled Semantics for Comprehensive Scene Understanding from Videos [53.49501208503774]
In this paper, we take an additional step toward holistic scene understanding with monocular cameras by learning depth and motion alongside with semantics. We address the three tasks jointly by a novel training protocol based on knowledge distillation and self-supervision. We show that it yields state-of-the-art results for monocular depth estimation, optical flow and motion segmentation.
arXiv Detail & Related papers (2020-03-31T08:52:13Z)

This list is automatically generated from the titles and abstracts of the papers in this site.