Deep Learning to See: Towards New Foundations of Computer Vision
- URL: http://arxiv.org/abs/2206.15351v1
- Date: Thu, 30 Jun 2022 15:20:36 GMT
- Title: Deep Learning to See: Towards New Foundations of Computer Vision
- Authors: Alessandro Betti, Marco Gori, Stefano Melacci
- Abstract summary: This book criticizes the supposed scientific progress in the field of computer vision.
It proposes the investigation of vision within the framework of information-based laws of nature.
- Score: 88.69805848302266
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The remarkable progress in computer vision over the last few years is, by and
large, attributed to deep learning, fueled by the availability of huge sets of
labeled data, and paired with the explosive growth of the GPU paradigm. While
subscribing to this view, this book criticizes the supposed scientific progress
in the field and proposes the investigation of vision within the framework of
information-based laws of nature. Specifically, the present work poses
fundamental questions about vision that remain far from understood, leading the
reader on a journey populated by novel challenges resonating with the
foundations of machine learning. The central thesis is that for a deeper
understanding of visual computational processes, it is necessary to look beyond
the applications of general purpose machine learning algorithms and focus
instead on appropriate learning theories that take into account the
spatiotemporal nature of the visual signal.
Related papers
- Visual Knowledge in the Big Model Era: Retrospect and Prospect [63.282425615863]
Visual knowledge is a new form of knowledge representation that can encapsulate visual concepts and their relations in a succinct, comprehensive, and interpretable manner.
As the knowledge about the visual world has been identified as an indispensable component of human cognition and intelligence, visual knowledge is poised to have a pivotal role in establishing machine intelligence.
arXiv Detail & Related papers (2024-04-05T07:31:24Z)
- Integration and Performance Analysis of Artificial Intelligence and Computer Vision Based on Deep Learning Algorithms [5.734290974917728]
This paper focuses on the analysis of the application effectiveness of the integration of deep learning and computer vision technologies.
Deep learning achieves a historic breakthrough by constructing hierarchical neural networks, enabling end-to-end feature learning and semantic understanding of images.
The successful experiences in the field of computer vision provide strong support for training deep learning algorithms.
arXiv Detail & Related papers (2023-12-20T09:37:06Z)
- Physics-Informed Computer Vision: A Review and Perspectives [22.71741766133866]
Incorporation of physical information in machine learning frameworks is opening and transforming many application domains.
We present a systematic literature review of more than 250 papers on formulation and approaches to computer vision tasks guided by physical laws.
arXiv Detail & Related papers (2023-05-29T11:55:11Z)
- Hyperbolic Deep Learning in Computer Vision: A Survey [20.811974050049365]
Hyperbolic space has gained rapid traction for learning in computer vision.
We provide a categorization and in-depth overview of current literature on hyperbolic learning for computer vision.
We outline how hyperbolic learning is performed in all themes and discuss the main research problems that benefit from current advances in hyperbolic learning for computer vision.
arXiv Detail & Related papers (2023-05-11T07:14:23Z)
- VQA and Visual Reasoning: An Overview of Recent Datasets, Methods and Challenges [1.565870461096057]
The integration of vision and language has attracted considerable attention.
The tasks have been created in such a way that they properly exemplify the concepts of deep learning.
arXiv Detail & Related papers (2022-12-26T20:56:01Z)
- Causal Reasoning Meets Visual Representation Learning: A Prospective Study [117.08431221482638]
Lack of interpretability, robustness, and out-of-distribution generalization are becoming the challenges of the existing visual models.
Inspired by the strong inference ability of human-level agents, recent years have witnessed great effort in developing causal reasoning paradigms.
This paper aims to provide a comprehensive overview of this emerging field, attract attention, encourage discussion, and bring to the forefront the urgency of developing novel causal reasoning methods.
arXiv Detail & Related papers (2022-04-26T02:22:28Z)
- Visual Sensation and Perception Computational Models for Deep Learning: State of the art, Challenges and Prospects [7.949330621850412]
Visual sensation and perception refer to the process of sensing, organizing, identifying, and interpreting visual information in environmental awareness and understanding.
Computational models inspired by visual perception are complex and diverse, as they draw on many fields such as cognitive science, information science, and artificial intelligence.
arXiv Detail & Related papers (2021-09-08T01:51:24Z)
- Threat of Adversarial Attacks on Deep Learning in Computer Vision: Survey II [86.51135909513047]
Deep Learning is vulnerable to adversarial attacks that can manipulate its predictions.
This article reviews the contributions made by the computer vision community in adversarial attacks on deep learning.
It provides definitions of technical terminology for non-experts in this domain.
arXiv Detail & Related papers (2021-08-01T08:54:47Z)
- Tensor Methods in Computer Vision and Deep Learning [120.3881619902096]
Tensors, or multidimensional arrays, are data structures that can naturally represent visual data of multiple dimensions.
With the advent of the deep learning paradigm shift in computer vision, tensors have become even more fundamental.
This article provides an in-depth and practical review of tensors and tensor methods in the context of representation learning and deep learning.
arXiv Detail & Related papers (2021-07-07T18:42:45Z)
- Deep Learning for Embodied Vision Navigation: A Survey [108.13766213265069]
The "embodied visual navigation" problem requires an agent to navigate in a 3D environment relying mainly on its first-person observations.
This paper attempts to establish an outline of the current works in the field of embodied visual navigation by providing a comprehensive literature survey.
arXiv Detail & Related papers (2021-07-07T12:09:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.