Backbones-Review: Feature Extraction Networks for Deep Learning and Deep
Reinforcement Learning Approaches
- URL: http://arxiv.org/abs/2206.08016v1
- Date: Thu, 16 Jun 2022 09:18:34 GMT
- Title: Backbones-Review: Feature Extraction Networks for Deep Learning and Deep
Reinforcement Learning Approaches
- Authors: Omar Elharrouss, Younes Akbari, Noor Almaadeed, Somaya Al-Maadeed
- Abstract summary: CNNs make it possible to work with large-scale data and to cover different scenarios for a specific task.
Many networks have been proposed and have become the standard architectures used in DL models across AI tasks.
A backbone is a known network that has been trained on other tasks beforehand and has demonstrated its effectiveness.
- Score: 3.255610188565679
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Artificial Intelligence (AI) is today the most widely used approach for
understanding the real world from various types of data, and its central task is
finding patterns within the analyzed data. This is done in a feature extraction step,
traditionally carried out with statistical algorithms or hand-designed filters.
However, selecting useful features from large-scale data has remained a crucial
challenge. With the development of convolutional neural networks (CNNs), feature
extraction has become more automatic and easier. CNNs can operate on large-scale data
and cover different scenarios for a specific task. In computer vision, convolutional
networks are used both to extract features and to build the remaining parts of a deep
learning (DL) model. Choosing a suitable network for feature extraction, or for the
other parts of a DL model, is not arbitrary: the choice depends on the target task as
well as on the computational complexity of the network. Many networks have been
proposed and have become the standard architectures used in DL models across AI tasks.
These networks, employed for feature extraction or as the first stage of a DL model,
are called backbones. A backbone is a known network that has already been trained on
other tasks and has demonstrated its effectiveness. In this paper, an overview of the
existing backbones, e.g. VGGs, ResNets, DenseNet, etc., is given with a detailed
description. Several computer vision tasks are also discussed by reviewing each task
with respect to the backbones used. In addition, a comparison in terms of performance
is provided, based on the backbone used for each task.
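
As a minimal illustration of the backbone idea described in the abstract (a sketch, not code from the paper), the following PyTorch/torchvision snippet reuses an ImageNet-pretrained ResNet-50 as a frozen feature extractor and trains only a small task-specific head on top. It assumes a recent torchvision (>= 0.13); the head size and class count are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a ResNet-50 pretrained on ImageNet and drop its classification layer,
# keeping the convolutional trunk as the backbone.
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = nn.Identity()          # backbone now outputs 2048-d feature vectors

# Freeze the backbone so only the new head is updated during training.
for p in backbone.parameters():
    p.requires_grad = False

# Hypothetical task-specific head (e.g., a 10-class classifier).
head = nn.Linear(2048, 10)

x = torch.randn(4, 3, 224, 224)      # a dummy batch of images
with torch.no_grad():
    feats = backbone(x)              # feature extraction by the backbone
logits = head(feats)                 # task-specific prediction
print(feats.shape, logits.shape)     # torch.Size([4, 2048]) torch.Size([4, 10])
```

The same pattern applies to the other backbones surveyed in the paper (VGGs, DenseNets, etc.); only the feature dimension and the name of the final layer to replace differ per architecture.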
Related papers
- Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
- Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks [69.38572074372392]
We present the first results proving that feature learning occurs during training with a nonlinear model on multiple tasks.
Our key insight is that multi-task pretraining induces a pseudo-contrastive loss that favors representations that align points that typically have the same label across tasks.
arXiv Detail & Related papers (2023-07-13T16:39:08Z)
- Deep networks for system identification: a Survey [56.34005280792013]
System identification learns mathematical descriptions of dynamic systems from input-output data.
The main aim of the identified model is to predict new data from previous observations.
We discuss architectures commonly adopted in the literature, like feedforward, convolutional, and recurrent networks.
arXiv Detail & Related papers (2023-01-30T12:38:31Z)
- ALSO: Automotive Lidar Self-supervision by Occupancy estimation [70.70557577874155]
We propose a new self-supervised method for pre-training the backbone of deep perception models operating on point clouds.
The core idea is to train the model on a pretext task which is the reconstruction of the surface on which the 3D points are sampled.
The intuition is that if the network is able to reconstruct the scene surface, given only sparse input points, then it probably also captures some fragments of semantic information.
arXiv Detail & Related papers (2022-12-12T13:10:19Z)
- X-Learner: Learning Cross Sources and Tasks for Universal Visual Representation [71.51719469058666]
We propose a representation learning framework called X-Learner.
X-Learner learns the universal feature of multiple vision tasks supervised by various sources.
X-Learner achieves strong performance on different tasks without extra annotations, modalities, or computational cost.
arXiv Detail & Related papers (2022-03-16T17:23:26Z)
- Learning Purified Feature Representations from Task-irrelevant Labels [18.967445416679624]
We propose a novel learning framework called PurifiedLearning to exploit task-irrelevant features extracted from task-irrelevant labels.
Our work is built on solid theoretical analysis and extensive experiments, which demonstrate the effectiveness of PurifiedLearning.
arXiv Detail & Related papers (2021-02-22T12:50:49Z)
- Self-Supervision based Task-Specific Image Collection Summarization [3.115375810642661]
We propose a novel approach to task-specific image corpus summarization using semantic information and self-supervision.
Our method uses a classification-based Wasserstein generative adversarial network (WGAN) as a feature generating network.
The model then generates a summary at inference time by using K-means clustering in the semantic embedding space.
arXiv Detail & Related papers (2020-12-19T10:58:04Z)
- Graph-Based Neural Network Models with Multiple Self-Supervised Auxiliary Tasks [79.28094304325116]
Graph Convolutional Networks are among the most promising approaches for capturing relationships among structured data points.
We propose three novel self-supervised auxiliary tasks to train graph-based neural network models in a multi-task fashion.
arXiv Detail & Related papers (2020-11-14T11:09:51Z)
- Emotion Recognition on large video dataset based on Convolutional Feature Extractor and Recurrent Neural Network [0.2855485723554975]
Our model combines convolutional neural network (CNN) with recurrent neural network (RNN) to predict dimensional emotions on video data.
Experiments are performed on publicly available datasets including the largest modern Aff-Wild2 database.
arXiv Detail & Related papers (2020-06-19T14:54:13Z)
- Deep Multi-Task Augmented Feature Learning via Hierarchical Graph Neural Network [4.121467410954028]
We propose a Hierarchical Graph Neural Network to learn augmented features for deep multi-task learning.
Experiments on real-world datasets show a significant performance improvement when using this strategy.
arXiv Detail & Related papers (2020-02-12T06:02:20Z)
- NeurAll: Towards a Unified Visual Perception Model for Automated Driving [8.49826472556323]
We propose a joint multi-task network design for learning several tasks simultaneously.
The main bottleneck in automated driving systems is the limited processing power available on deployment hardware.
arXiv Detail & Related papers (2019-02-10T12:45:49Z)