Deep Learning Model with GA based Feature Selection and Context
Integration
- URL: http://arxiv.org/abs/2204.06189v1
- Date: Wed, 13 Apr 2022 06:28:41 GMT
- Title: Deep Learning Model with GA based Feature Selection and Context
Integration
- Authors: Ranju Mandal, Basim Azam, Brijesh Verma, Mengjie Zhang
- Abstract summary: We propose a novel three-layered deep learning model that assimilates or independently learns global and local contextual information alongside visual features.
The novelty of the proposed model is that One-vs-All binary class-based learners are introduced to learn Genetic Algorithm (GA) optimized features in the visual layer.
Optimized visual features combined with global and local contextual information play a significant role in improving accuracy and producing stable predictions comparable to state-of-the-art deep CNN models.
- Score: 2.3472688456025756
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning models have been very successful in computer vision
and image processing applications. Since the inception of deep learning, many
top-performing image segmentation methods have been based on deep CNN models.
However, deep CNN models fail to integrate global and local context alongside
visual features despite having complex multi-layer architectures. We propose a
novel three-layered deep learning model that assimilates or independently
learns global and local contextual information alongside visual features. The
novelty of the proposed model is that One-vs-All binary class-based learners
are introduced to learn Genetic Algorithm (GA) optimized features in the
visual layer, followed by a contextual layer that learns the global and local
contexts of an image; finally, a third layer integrates all the information
optimally to obtain the final class label. The Stanford Background and CamVid
benchmark image parsing datasets were used to evaluate our model, and it shows
promising results. The empirical analysis reveals that optimized visual
features combined with global and local contextual information play a
significant role in improving accuracy and producing stable predictions
comparable to state-of-the-art deep CNN models.
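The three-layer pipeline described above is compact enough to sketch. The snippet below is a toy illustration under stated assumptions, not the authors' implementation: the GA uses selection and mutation only (no crossover), the fitness function (training accuracy of a logistic learner on the masked features) and the fixed fusion weights are guesses, and scikit-learn logistic regressions stand in for the One-vs-All learners.

```python
# Toy sketch of the abstract's pipeline; names, dimensions, fitness, and
# fusion weights are illustrative assumptions, not the authors' code.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def ga_select(X, y, n_gen=20, pop=16, mut=0.05):
    """Evolve a binary mask over feature columns (GA-based feature selection)."""
    d = X.shape[1]
    masks = rng.random((pop, d)) < 0.5
    def fitness(m):
        if not m.any():
            return 0.0
        clf = LogisticRegression(max_iter=200).fit(X[:, m], y)
        return clf.score(X[:, m], y)           # assumed fitness: training accuracy
    for _ in range(n_gen):
        scores = np.array([fitness(m) for m in masks])
        top = masks[np.argsort(scores)[-pop // 2:]]           # selection
        kids = top[rng.integers(0, len(top), pop - len(top))].copy()
        kids ^= rng.random(kids.shape) < mut                  # bit-flip mutation
        masks = np.vstack([top, kids])
    return masks[np.argmax([fitness(m) for m in masks])]

# Visual layer: One-vs-All learners on GA-selected features (toy data).
X = rng.normal(size=(200, 40))                 # stand-in visual features
y = rng.integers(0, 3, 200)                    # stand-in class labels
mask = ga_select(X, y)
ova = [LogisticRegression(max_iter=200).fit(X[:, mask], (y == c).astype(int))
       for c in range(3)]
visual = np.stack([c.predict_proba(X[:, mask])[:, 1] for c in ova], axis=1)

# Contextual + integration layers: a weighted-fusion placeholder.
global_ctx = np.tile(np.bincount(y, minlength=3) / len(y), (len(y), 1))
pred = np.argmax(0.8 * visual + 0.2 * global_ctx, axis=1)     # assumed weights
print("toy accuracy:", (pred == y).mean())
```

In the full model the contextual layer supplies learned global and local context per image region; the class-prior placeholder above only marks where that information would enter the integration layer.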
Related papers
- Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities [88.398085358514]
Contrastive Deepfake Embeddings (CoDE) is a novel embedding space specifically designed for deepfake detection.
CoDE is trained via contrastive learning while additionally enforcing global-local similarities.
arXiv Detail & Related papers (2024-07-29T18:00:10Z)
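A minimal sketch of a contrastive objective with an added global-local similarity term, in the spirit of the CoDE summary above; the loss composition, temperature, and weighting are illustrative assumptions rather than the paper's exact formulation.

```python
# Hedged sketch: InfoNCE on global embeddings plus a local-patch agreement
# term; all weights and shapes are assumptions, not the paper's recipe.
import torch
import torch.nn.functional as F

def info_nce(a, b, tau=0.07):
    """Standard InfoNCE between two batches of L2-normalized embeddings."""
    logits = a @ b.t() / tau                    # (B, B) similarity matrix
    targets = torch.arange(a.size(0))           # positives on the diagonal
    return F.cross_entropy(logits, targets)

def code_style_loss(glob_a, glob_b, loc_a, loc_b, w_local=0.5):
    """Global contrastive term plus a local-patch agreement term."""
    glob_a, glob_b = F.normalize(glob_a, dim=-1), F.normalize(glob_b, dim=-1)
    loc_a, loc_b = F.normalize(loc_a, dim=-1), F.normalize(loc_b, dim=-1)
    global_term = info_nce(glob_a, glob_b)
    # Local term: corresponding patches of the two views should agree.
    local_term = 1 - F.cosine_similarity(loc_a, loc_b, dim=-1).mean()
    return global_term + w_local * local_term

B, P, D = 8, 16, 128                            # batch, patches, dim (toy)
loss = code_style_loss(torch.randn(B, D), torch.randn(B, D),
                       torch.randn(B, P, D), torch.randn(B, P, D))
print(loss.item())
```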
- DeepSeek-VL: Towards Real-World Vision-Language Understanding [24.57011093316788]
We present DeepSeek-VL, an open-source Vision-Language (VL) model for real-world vision and language understanding applications.
Our approach is structured around three key dimensions: we strive to ensure our data is diverse and scalable and that it extensively covers real-world scenarios.
We create a use case taxonomy from real user scenarios and construct an instruction tuning dataset.
arXiv Detail & Related papers (2024-03-08T18:46:00Z)
- Multi-network Contrastive Learning Based on Global and Local Representations [4.190134425277768]
This paper proposes a multi-network contrastive learning framework based on global and local representations.
We introduce global and local feature information for self-supervised contrastive learning through multiple networks.
The framework also expands the number of samples used for contrast and improves the training efficiency of the model.
arXiv Detail & Related papers (2023-06-28T05:30:57Z)
- Pre-training Contextualized World Models with In-the-wild Videos for Reinforcement Learning [54.67880602409801]
In this paper, we study the problem of pre-training world models with abundant in-the-wild videos for efficient learning of visual control tasks.
We introduce Contextualized World Models (ContextWM) that explicitly separate context and dynamics modeling.
Our experiments show that in-the-wild video pre-training equipped with ContextWM can significantly improve the sample efficiency of model-based reinforcement learning.
arXiv Detail & Related papers (2023-05-29T14:29:12Z)
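The "separate context and dynamics modeling" idea from the ContextWM entry above might look roughly like this; the module shapes, the GRU dynamics model, and the way context re-enters at decoding are all assumptions for illustration.

```python
# Minimal sketch: one branch encodes a static context from a reference frame
# while a recurrent model captures dynamics; shapes are toy assumptions.
import torch
import torch.nn as nn

class ContextualWorldModel(nn.Module):
    def __init__(self, obs_dim=64, act_dim=4, ctx_dim=32, hid=64):
        super().__init__()
        self.context_enc = nn.Sequential(nn.Linear(obs_dim, ctx_dim), nn.Tanh())
        self.dynamics = nn.GRU(obs_dim + act_dim, hid, batch_first=True)
        self.decoder = nn.Linear(hid + ctx_dim, obs_dim)   # context re-enters here

    def forward(self, obs_seq, act_seq):
        ctx = self.context_enc(obs_seq[:, 0])              # context from frame 0
        h, _ = self.dynamics(torch.cat([obs_seq, act_seq], -1))
        ctx = ctx.unsqueeze(1).expand(-1, h.size(1), -1)
        return self.decoder(torch.cat([h, ctx], -1))       # predicted next obs

wm = ContextualWorldModel()
pred = wm(torch.randn(2, 10, 64), torch.randn(2, 10, 4))
print(pred.shape)  # torch.Size([2, 10, 64])
```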
- Learning Customized Visual Models with Retrieval-Augmented Knowledge [104.05456849611895]
We propose REACT, a framework to acquire the relevant web knowledge to build customized visual models for target domains.
We retrieve the most relevant image-text pairs from a web-scale database as external knowledge, and propose to customize the model by training only new modularized blocks while freezing all the original weights.
The effectiveness of REACT is demonstrated via extensive experiments on classification, retrieval, detection, and segmentation tasks, including zero-shot, few-shot, and full-shot settings.
arXiv Detail & Related papers (2023-01-17T18:59:06Z)
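The REACT entry above describes training only new modularized blocks while freezing the original weights; a generic sketch of that recipe follows, with an assumed adapter-style block standing in for the paper's actual modules.

```python
# Generic freeze-and-adapt sketch; the adapter design is an assumption.
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 128))
for p in backbone.parameters():
    p.requires_grad = False                     # freeze all original weights

adapter = nn.Sequential(nn.Linear(128, 32), nn.ReLU(), nn.Linear(32, 128))

def forward(x):
    h = backbone(x)
    return h + adapter(h)                       # only the new block is trained

opt = torch.optim.Adam(adapter.parameters(), lr=1e-4)  # optimizer sees adapter only
out = forward(torch.randn(4, 128))
print(out.shape, sum(p.requires_grad for p in backbone.parameters()))
```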
- Global-and-Local Collaborative Learning for Co-Salient Object Detection [162.62642867056385]
The goal of co-salient object detection (CoSOD) is to discover salient objects that commonly appear in a query group containing two or more relevant images.
We propose a global-and-local collaborative learning architecture, which includes a global correspondence modeling (GCM) module and a local correspondence modeling (LCM) module.
The proposed GLNet is evaluated on three prevailing CoSOD benchmark datasets, demonstrating that our model trained on a small dataset (about 3k images) still outperforms eleven state-of-the-art competitors trained on much larger datasets (about 8k-200k images).
arXiv Detail & Related papers (2022-04-19T14:32:41Z)
- Context-based Deep Learning Architecture with Optimal Integration Layer for Image Parsing [0.0]
The proposed three-layer context-based deep architecture is capable of integrating context explicitly with visual information.
The experimental outcomes on benchmark datasets are promising.
arXiv Detail & Related papers (2022-04-13T07:35:39Z)
- GraphFormers: GNN-nested Transformers for Representation Learning on Textual Graph [53.70520466556453]
We propose GraphFormers, where layerwise GNN components are nested alongside the transformer blocks of language models.
With the proposed architecture, the text encoding and the graph aggregation are fused into an iterative workflow.
In addition, a progressive learning strategy is introduced, where the model is successively trained on manipulated and original data to reinforce its capability of integrating information on the graph.
arXiv Detail & Related papers (2021-05-06T12:20:41Z)
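A rough sketch of the GraphFormers idea above, where a graph aggregation step is nested between transformer blocks; the neighbor aggregation and fusion layer here are assumptions, not the paper's exact GNN component.

```python
# One nested layer: transformer encoding per node, then graph aggregation of
# per-node summary vectors; the aggregation scheme is an assumed stand-in.
import torch
import torch.nn as nn

class GraphFormerLayer(nn.Module):
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.block = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.gnn = nn.Linear(2 * dim, dim)      # fuses node state with neighbors

    def forward(self, tokens, adj):
        # tokens: (nodes, seq, dim); adj: (nodes, nodes) row-normalized
        tokens = self.block(tokens)             # text encoding step
        cls = tokens[:, 0]                      # per-node summary vector
        neigh = adj @ cls                       # graph aggregation step
        fused = self.gnn(torch.cat([cls, neigh], -1)).unsqueeze(1)
        return torch.cat([fused, tokens[:, 1:]], dim=1)

layer = GraphFormerLayer()
adj = torch.full((3, 3), 1 / 3)                 # toy fully connected graph
print(layer(torch.randn(3, 8, 64), adj).shape)  # torch.Size([3, 8, 64])
```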
- Multi-Level Graph Convolutional Network with Automatic Graph Learning for Hyperspectral Image Classification [63.56018768401328]
We propose a Multi-level Graph Convolutional Network (GCN) with Automatic Graph Learning method (MGCN-AGL) for HSI classification.
By employing an attention mechanism to characterize the importance of spatially neighboring regions, the most relevant information can be adaptively incorporated into decisions.
Our MGCN-AGL encodes the long-range dependencies among image regions based on the expressive representations produced at the local level.
arXiv Detail & Related papers (2020-09-19T09:26:20Z)
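The attention step described in the MGCN-AGL entry above, which weights spatially neighboring regions before aggregation, might be sketched as follows; the pairwise scoring function is an assumed stand-in.

```python
# Toy region-attention sketch: score each (region, neighbor) pair, softmax
# over neighbors, and take the weighted mix; the scorer is an assumption.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RegionAttention(nn.Module):
    def __init__(self, dim=32):
        super().__init__()
        self.score = nn.Linear(2 * dim, 1)      # scores a (region, neighbor) pair

    def forward(self, regions, neighbours):
        # regions: (N, dim); neighbours: (N, K, dim)
        pairs = torch.cat(
            [regions.unsqueeze(1).expand_as(neighbours), neighbours], dim=-1)
        att = F.softmax(self.score(pairs).squeeze(-1), dim=-1)   # (N, K)
        return (att.unsqueeze(-1) * neighbours).sum(dim=1)       # weighted mix

ra = RegionAttention()
print(ra(torch.randn(5, 32), torch.randn(5, 7, 32)).shape)  # torch.Size([5, 32])
```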
- Eigen-CAM: Class Activation Map using Principal Components [1.2691047660244335]
This paper builds on previous ideas to cope with the increasing demand for interpretable, robust, and transparent models.
The proposed Eigen-CAM computes and visualizes the principal components of the learned features/representations from the convolutional layers.
arXiv Detail & Related papers (2020-08-01T17:14:13Z)
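Eigen-CAM as summarized above is compact enough to sketch directly: project the activations of a chosen convolutional layer onto their first principal component. The centering and absolute-value steps here are choices of this sketch and may differ from the paper's exact procedure.

```python
# Eigen-CAM-style map via SVD of the activation matrix.
import numpy as np

def eigen_cam(feature_maps):
    """feature_maps: (C, H, W) activations from a chosen conv layer."""
    C, H, W = feature_maps.shape
    A = feature_maps.reshape(C, H * W).T          # rows = spatial positions
    A = A - A.mean(axis=0)                        # centering (a sketch choice)
    _, _, vt = np.linalg.svd(A, full_matrices=False)
    cam = np.abs(A @ vt[0]).reshape(H, W)         # sign of an SVD component is arbitrary
    return cam / (cam.max() + 1e-8)               # normalize to [0, 1]

print(eigen_cam(np.random.rand(64, 7, 7)).shape)  # (7, 7)
```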
- Unifying Deep Local and Global Features for Image Search [9.614694312155798]
We unify global and local image features into a single deep model, enabling accurate retrieval with efficient feature extraction.
Our model achieves state-of-the-art image retrieval on the Revisited Oxford and Paris datasets, and state-of-the-art single-model instance-level recognition on the Google Landmarks dataset v2.
arXiv Detail & Related papers (2020-01-14T19:59:51Z)
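A sketch of the unified global/local design from the last entry above: one backbone yields a pooled global descriptor for retrieval plus attention-selected local descriptors for matching; the specific module choices are assumptions.

```python
# Single-backbone sketch producing both feature types; modules are toy
# stand-ins, not the paper's architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class UnifiedFeatures(nn.Module):
    def __init__(self, ch=128, local_dim=40):
        super().__init__()
        self.backbone = nn.Conv2d(3, ch, 3, padding=1)     # stand-in CNN
        self.attn = nn.Conv2d(ch, 1, 1)                    # local keypoint scores
        self.reduce = nn.Conv2d(ch, local_dim, 1)          # small local descriptors

    def forward(self, img, top_k=50):
        fmap = F.relu(self.backbone(img))
        global_desc = F.normalize(fmap.mean(dim=(2, 3)), dim=-1)   # pooled (B, ch)
        scores = self.attn(fmap).flatten(1)                        # (B, H*W)
        locals_ = self.reduce(fmap).flatten(2).transpose(1, 2)     # (B, H*W, d)
        idx = scores.topk(top_k, dim=1).indices                    # best positions
        picked = torch.gather(
            locals_, 1, idx.unsqueeze(-1).expand(-1, -1, locals_.size(-1)))
        return global_desc, picked                                 # global + local

m = UnifiedFeatures()
g, l = m(torch.randn(2, 3, 32, 32))
print(g.shape, l.shape)  # torch.Size([2, 128]) torch.Size([2, 50, 40])
```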
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.