An evaluation of CNN models and data augmentation techniques in hierarchical localization of mobile robots
- URL: http://arxiv.org/abs/2407.10596v1
- Date: Mon, 15 Jul 2024 10:20:00 GMT
- Title: An evaluation of CNN models and data augmentation techniques in hierarchical localization of mobile robots
- Authors: J. J. Cabrera, O. J. Céspedes, S. Cebollada, O. Reinoso, L. Payá
- Abstract summary: This work presents an evaluation of CNN models and data augmentation to carry out the hierarchical localization of a mobile robot.
In this sense, an ablation study of different state-of-the-art CNN models used as backbone is presented.
A variety of data augmentation visual effects are proposed for addressing the visual localization of the robot.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work presents an evaluation of CNN models and data augmentation to carry out the hierarchical localization of a mobile robot by using omnidirectional images. To this end, an ablation study of different state-of-the-art CNN models used as backbone is presented and a variety of data augmentation visual effects are proposed for addressing the visual localization of the robot. The proposed method is based on the adaptation and re-training of a CNN with a dual purpose: (1) to perform a rough localization step in which the model is used to predict the room from which an image was captured, and (2) to address the fine localization step, which consists of retrieving the most similar image of the visual map among those contained in the previously predicted room by means of a pairwise comparison between descriptors obtained from an intermediate layer of the CNN. We evaluate the impact of different state-of-the-art CNN models, such as ConvNeXt, on the proposed localization. Finally, a variety of data augmentation visual effects are separately employed for training the model and their impact is assessed. The performance of the resulting CNNs is evaluated under real operation conditions, including changes in the lighting conditions. Our code is publicly available on the project website https://github.com/juanjo-cabrera/IndoorLocalizationSingleCNN.git
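The two-step scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `hierarchical_localize`, the array layout, and the use of Euclidean distance for the pairwise descriptor comparison are all assumptions made for illustration, and the room scores and descriptors are toy values standing in for the outputs of a re-trained CNN.

```python
import numpy as np


def hierarchical_localize(room_scores, query_descriptor, map_descriptors, map_rooms):
    """Hypothetical two-step localization: predict a room, then retrieve
    the most similar map image among those captured in that room."""
    # Rough step: take the room with the highest classifier score.
    predicted_room = int(np.argmax(room_scores))

    # Restrict the search to map images labeled with the predicted room.
    candidates = np.flatnonzero(map_rooms == predicted_room)
    if candidates.size == 0:
        raise ValueError("no map images stored for the predicted room")

    # Fine step: compare the query descriptor with each candidate descriptor.
    # Euclidean distance is an assumption; the abstract does not name a metric.
    distances = np.linalg.norm(map_descriptors[candidates] - query_descriptor, axis=1)
    best_image = int(candidates[np.argmin(distances)])
    return predicted_room, best_image


# Toy visual map: four images with 2-D descriptors, two per room.
map_descriptors = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [6.0, 5.0]])
map_rooms = np.array([0, 0, 1, 1])

# Toy query: classifier scores favor room 1, descriptor lies near image 3.
room_scores = np.array([0.2, 0.8])
query_descriptor = np.array([5.9, 5.1])

room, image = hierarchical_localize(room_scores, query_descriptor, map_descriptors, map_rooms)
print(room, image)
```

Restricting the comparison to the predicted room keeps the fine step cheap, at the cost that a wrong room prediction cannot be recovered by the retrieval step.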
Related papers
- Model Parallel Training and Transfer Learning for Convolutional Neural Networks by Domain Decomposition [0.0]
Deep convolutional neural networks (CNNs) have been shown to be very successful in a wide range of image processing applications.
Due to their increasing number of model parameters and an increasing availability of large amounts of training data, parallelization strategies to efficiently train complex CNNs are necessary.
arXiv Detail & Related papers (2024-08-26T17:35:01Z)
- An Explainable Model-Agnostic Algorithm for CNN-based Biometrics Verification [55.28171619580959]
This paper describes an adaptation of the Local Interpretable Model-Agnostic Explanations (LIME) AI method to operate under a biometric verification setting.
arXiv Detail & Related papers (2023-07-25T11:51:14Z)
- Decoupled Mixup for Generalized Visual Recognition [71.13734761715472]
We propose a novel "Decoupled-Mixup" method to train CNN models for visual recognition.
Our method decouples each image into discriminative and noise-prone regions, and then heterogeneously combines these regions to train CNN models.
Experiment results show the high generalization performance of our method on testing data that are composed of unseen contexts.
arXiv Detail & Related papers (2022-10-26T15:21:39Z)
- Prune and distill: similar reformatting of image information along rat visual cortex and deep neural networks [61.60177890353585]
Deep convolutional neural networks (CNNs) have been shown to provide excellent models for their functional analogue in the brain, the ventral stream in visual cortex.
Here we consider some prominent statistical patterns that are known to exist in the internal representations of either CNNs or the visual cortex.
We show that CNNs and visual cortex share a similarly tight relationship between dimensionality expansion/reduction of object representations and reformatting of image information.
arXiv Detail & Related papers (2022-05-27T08:06:40Z)
- A Novel Hand Gesture Detection and Recognition system based on ensemble-based Convolutional Neural Network [3.5665681694253903]
Detection of hand portion has become a challenging task in computer vision and pattern recognition communities.
Deep learning algorithms such as convolutional neural network (CNN) architectures have become a very popular choice for classification tasks.
In this paper, an ensemble of CNN-based approaches is presented to overcome problems such as high prediction variance, overfitting, and prediction errors.
arXiv Detail & Related papers (2022-02-25T06:46:58Z)
- Keypoint Message Passing for Video-based Person Re-Identification [106.41022426556776]
Video-based person re-identification (re-ID) is an important technique in visual surveillance systems which aims to match video snippets of people captured by different cameras.
Existing methods are mostly based on convolutional neural networks (CNNs), whose building blocks either process local neighbor pixels at a time, or, when 3D convolutions are used to model temporal information, suffer from the misalignment problem caused by person movement.
In this paper, we propose to overcome the limitations of normal convolutions with a human-oriented graph method. Specifically, features located at person joint keypoints are extracted and connected as a spatial-temporal graph.
arXiv Detail & Related papers (2021-11-16T08:01:16Z)
- The Mind's Eye: Visualizing Class-Agnostic Features of CNNs [92.39082696657874]
We propose an approach to visually interpret CNN features given a set of images by creating corresponding images that depict the most informative features of a specific layer.
Our method uses a dual-objective activation and distance loss, without requiring a generator network or modifications to the original model.
arXiv Detail & Related papers (2021-01-29T07:46:39Z)
- Video-based Facial Expression Recognition using Graph Convolutional Networks [57.980827038988735]
We introduce a Graph Convolutional Network (GCN) layer into a common CNN-RNN based model for video-based facial expression recognition.
We evaluate our method on three widely-used datasets, CK+, Oulu-CASIA and MMI, and also one challenging wild dataset AFEW8.0.
arXiv Detail & Related papers (2020-10-26T07:31:51Z)
- Exploring the Interchangeability of CNN Embedding Spaces [0.5735035463793008]
We map between 10 image-classification CNNs and between 4 facial-recognition CNNs.
For CNNs trained on the same classes and sharing a common backend-logit architecture, a linear mapping may always be calculated directly from the backend layer weights.
The implications are far-reaching, suggesting an underlying commonality between representations learned by networks designed and trained for a common task.
arXiv Detail & Related papers (2020-10-05T20:32:40Z)
- Homography Estimation with Convolutional Neural Networks Under Conditions of Variance [0.0]
We analyze the performance of two recently published methods using Convolutional Neural Networks (CNNs).
CNNs can be trained to be more robust against noise, but at a small cost to accuracy in the noiseless case.
We show that training a CNN to a specific magnitude of noise leads to a "Goldilocks Zone" with regard to the noise levels where that CNN performs best.
arXiv Detail & Related papers (2020-10-02T15:11:25Z)
- Decoding CNN based Object Classifier Using Visualization [6.666597301197889]
We visualize what types of features are extracted in different convolution layers of a CNN.
Visualizing heat maps of activations helps us to understand how a CNN classifies and localizes different objects in an image.
arXiv Detail & Related papers (2020-07-15T05:01:27Z)