Efficient Visual Recognition with Deep Neural Networks: A Survey on
Recent Advances and New Directions
- URL: http://arxiv.org/abs/2108.13055v1
- Date: Mon, 30 Aug 2021 08:19:34 GMT
- Title: Efficient Visual Recognition with Deep Neural Networks: A Survey on
Recent Advances and New Directions
- Authors: Yang Wu, Dingheng Wang, Xiaotong Lu, Fan Yang, Guoqi Li, Weisheng
Dong, Jianbo Shi
- Abstract summary: Deep neural networks (DNNs) have greatly boosted visual recognition performance on many concrete tasks.
This paper reviews recent advances and suggests possible new directions.
- Score: 37.914102870280324
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual recognition is currently one of the most important and active research
areas in computer vision, pattern recognition, and even the general field of
artificial intelligence. It is of great fundamental importance and in strong
industrial demand. Deep neural networks (DNNs) have greatly boosted its
performance on many concrete tasks, with the help of large amounts of training
data and powerful new computation resources. Though recognition accuracy is
usually the first concern for new advances, efficiency is also
important and sometimes critical for both academic research and industrial
applications. Moreover, insightful views on the opportunities and challenges of
efficiency are much needed by the entire community. While general
surveys on the efficiency issue of DNNs have been done from various
perspectives, as far as we are aware, scarcely any of them focused on visual
recognition systematically, and thus it is unclear which advances are
applicable to it and what else should be considered. In this paper, we present
a review of the recent advances, with our suggestions on possible new
directions towards improving the efficiency of DNN-related visual recognition
approaches. We investigate not only the model but also the data point of
view (which is not the case in existing surveys), and focus on the three most
studied data types (images, videos and points). This paper attempts to provide
a systematic summary via a comprehensive survey which can serve as a valuable
reference and inspire both researchers and practitioners who work on visual
recognition problems.
Related papers
- A Critical Analysis on Machine Learning Techniques for Video-based Human Activity Recognition of Surveillance Systems: A Review [1.3693860189056777]
The upsurge in abnormal activities in crowded locations makes an intelligent surveillance system necessary.
Video-based human activity recognition has intrigued many researchers with its pressing issues.
This paper provides a critical survey of video-based Human Activity Recognition (HAR) techniques.
arXiv Detail & Related papers (2024-09-01T14:43:57Z)
- RNNs, CNNs and Transformers in Human Action Recognition: A Survey and a Hybrid Model [0.0]
Human Action Recognition (HAR) encompasses the task of monitoring human activities across various domains.
Over the past decade, the field of HAR has witnessed substantial progress by leveraging Convolutional Neural Networks (CNNs).
Recently, the domain of computer vision has witnessed the emergence of Vision Transformers (ViTs) as a potent solution.
arXiv Detail & Related papers (2024-06-02T17:09:59Z)
- Effectiveness Assessment of Recent Large Vision-Language Models [78.69439393646554]
This paper endeavors to evaluate the competency of popular large vision-language models (LVLMs) in specialized and general tasks.
We employ six challenging tasks in three different application scenarios: natural, healthcare, and industrial.
We examine the performance of three recent open-source LVLMs, including MiniGPT-v2, LLaVA-1.5, and Shikra, on both visual recognition and localization in these tasks.
arXiv Detail & Related papers (2024-03-07T08:25:27Z)
- Machine Unlearning: A Survey [56.79152190680552]
A special need has arisen where, due to privacy, usability, and/or the right to be forgotten, information about some specific samples needs to be removed from a model, called machine unlearning.
This emerging technology has drawn significant interest from both academics and industry due to its innovation and practicality.
No study has analyzed this complex topic or compared the feasibility of existing unlearning solutions in different kinds of scenarios.
The survey concludes by highlighting some of the outstanding issues with unlearning techniques, along with some feasible directions for new research opportunities.
arXiv Detail & Related papers (2023-06-06T10:18:36Z)
- Survey: Exploiting Data Redundancy for Optimization of Deep Learning [42.1585031880029]
Data redundancy is ubiquitous in the inputs and intermediate results of Deep Neural Networks (DNNs).
This article surveys hundreds of recent papers on the topic.
It introduces a novel taxonomy to put the various techniques into a single categorization framework.
arXiv Detail & Related papers (2022-08-29T04:31:18Z)
- Causal Reasoning Meets Visual Representation Learning: A Prospective
Study [117.08431221482638]
Lack of interpretability, robustness, and out-of-distribution generalization are becoming the challenges of the existing visual models.
Inspired by the strong inference ability of human-level agents, recent years have witnessed great effort in developing causal reasoning paradigms.
This paper aims to provide a comprehensive overview of this emerging field, attract attention, encourage discussion, and bring to the forefront the urgency of developing novel causal reasoning methods.
arXiv Detail & Related papers (2022-04-26T02:22:28Z)
- Human Activity Recognition Using Tools of Convolutional Neural Networks:
A State of the Art Review, Data Sets, Challenges and Future Prospects [7.275302131211702]
This review summarizes recent work based on a wide range of deep neural network architectures, namely convolutional neural networks (CNNs), for human activity recognition.
The reviewed systems are clustered into four categories depending on the use of input devices like multimodal sensing devices, smartphones, radar, and vision devices.
arXiv Detail & Related papers (2022-02-02T18:52:13Z)
- Affect Analysis in-the-wild: Valence-Arousal, Expressions, Action Units
and a Unified Framework [83.21732533130846]
The paper focuses on large in-the-wild databases, i.e., Aff-Wild and Aff-Wild2.
It presents the design of two classes of deep neural networks trained with these databases.
A novel multi-task and holistic framework is presented which is able to jointly learn and effectively generalize and perform affect recognition.
arXiv Detail & Related papers (2021-03-29T17:36:20Z)
- Survey on Reliable Deep Learning-Based Person Re-Identification Models:
Are We There Yet? [19.23187114221822]
Person re-identification (PReID) is one of the most critical problems in intelligent video-surveillance (IVS).
Deep neural networks (DNNs) given their compelling performance on similar vision problems and fast execution at test time.
We present descriptions of each model along with their evaluation on a set of benchmark datasets.
arXiv Detail & Related papers (2020-04-30T16:09:16Z)
- Deep Learning for Sensor-based Human Activity Recognition: Overview,
Challenges and Opportunities [52.59080024266596]
We present a survey of the state-of-the-art deep learning methods for sensor-based human activity recognition.
We first introduce the multi-modality of the sensory data and provide information for public datasets.
We then propose a new taxonomy to structure the deep methods by challenges.
arXiv Detail & Related papers (2020-01-21T09:55:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.