CBIR using Pre-Trained Neural Networks
- URL: http://arxiv.org/abs/2110.14455v1
- Date: Wed, 27 Oct 2021 14:19:48 GMT
- Title: CBIR using Pre-Trained Neural Networks
- Authors: Agnel Lazar Alappat, Prajwal Nakhate, Sagar Suman, Ambarish
Chandurkar, Varad Pimpalkhute, Tapan Jain
- Abstract summary: We use a pretrained Inception V3 model and extract the activation of its last fully connected layer, which forms a low-dimensional representation of the image.
This feature matrix is then divided into branches, and separate feature extraction is performed on each branch to obtain multiple features that are flattened into vectors.
We achieved a training accuracy of 99.46% and a validation accuracy of 84.56%.
- Score: 1.2130044156459308
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Much of the recent research in image retrieval has focused on using
neural networks as the core component. Many papers in other domains have shown
that training multiple models and combining their outputs yields good results,
since a single neural network may not extract sufficient information from the
input. In this paper, we follow a different approach. Instead of using a single
model, we use a pretrained Inception V3 model and extract the activation of its
last fully connected layer, which forms a low-dimensional representation of the
image. This feature matrix is then divided into branches, and separate feature
extraction is performed on each branch to obtain multiple features, each
flattened into a vector. These individual vectors are then combined into a
single feature. We train the model on the CUB-200-2011 dataset, which comprises
200 bird classes, and achieve a training accuracy of 99.46% and a validation
accuracy of 84.56%. By further using 3 branched global descriptors, we improve
the validation accuracy to 88.89%. For this, we used the MS-RMAC feature
extraction method.
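A minimal, hypothetical sketch of the described pipeline is given below, assuming a torchvision Inception V3 pretrained on ImageNet. The activation of the last fully connected layer is taken as the image representation, split into branches, and the per-branch vectors are concatenated into one descriptor; the number of branches and the per-branch processing (reduced here to L2 normalisation, rather than the paper's branch-wise extraction or MS-RMAC) are assumptions for illustration, not the authors' released code.

```python
# Hypothetical sketch (assumed details, not the authors' released code):
# take the activation of Inception V3's last fully connected layer as the
# image representation, split it into branches, and concatenate per-branch
# vectors into a single retrieval descriptor.
import torch
import torch.nn.functional as F
from PIL import Image
from torchvision import models, transforms

# Inception V3 pretrained on ImageNet (torchvision weights assumed here).
model = models.inception_v3(weights=models.Inception_V3_Weights.DEFAULT)
model.eval()

preprocess = transforms.Compose([
    transforms.Resize((299, 299)),  # Inception V3 expects 299x299 inputs
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def branched_descriptor(image_path: str, num_branches: int = 3) -> torch.Tensor:
    """Split the last-FC activation into branches, L2-normalise each branch
    (a stand-in for the paper's per-branch extraction), and concatenate."""
    x = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        # In eval mode the forward pass returns the output of the final
        # fully connected layer, a 1000-d activation per image.
        feat = model(x).squeeze(0)
    parts = [F.normalize(b, dim=0) for b in feat.chunk(num_branches)]
    return torch.cat(parts)  # single combined feature vector

# Retrieval would then rank gallery images by cosine similarity between the
# query descriptor and each gallery descriptor.
```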
Related papers
- BEND: Bagging Deep Learning Training Based on Efficient Neural Network Diffusion [56.9358325168226]
We propose a Bagging deep learning training algorithm based on Efficient Neural network Diffusion (BEND).
Our approach is simple but effective: we first use multiple trained model weights and biases as inputs to train an autoencoder and a latent diffusion model.
Our proposed BEND algorithm can consistently outperform the mean and median accuracies of both the original trained model and the diffused model.
arXiv Detail & Related papers (2024-03-23T08:40:38Z)
- An Empirical Study of Multimodal Model Merging [148.48412442848795]
Model merging is a technique that fuses multiple models trained on different tasks to generate a multi-task solution.
We conduct our study for a novel goal where we can merge vision, language, and cross-modal transformers of a modality-specific architecture.
We propose two metrics that assess the distance between weights to be merged and can serve as an indicator of the merging outcomes.
arXiv Detail & Related papers (2023-04-28T15:43:21Z)
- CNN-transformer mixed model for object detection [3.5897534810405403]
In this paper, I propose a convolutional module with a transformer.
It aims to improve the recognition accuracy of the model by fusing the detailed features extracted by CNN with the global features extracted by a transformer.
After 100 rounds of training on the Pascal VOC dataset, the accuracy of the results reached 81%, which is 4.6 points higher than Faster R-CNN[4] using ResNet-101[5] as the backbone.
arXiv Detail & Related papers (2022-12-13T16:35:35Z)
- Co-training $2^L$ Submodels for Visual Recognition [67.02999567435626]
Submodel co-training is a regularization method related to co-training, self-distillation and stochastic depth.
We show that submodel co-training is effective to train backbones for recognition tasks such as image classification and semantic segmentation.
arXiv Detail & Related papers (2022-12-09T14:38:09Z)
- Part-Based Models Improve Adversarial Robustness [57.699029966800644]
We show that combining human prior knowledge with end-to-end learning can improve the robustness of deep neural networks.
Our model combines a part segmentation model with a tiny classifier and is trained end-to-end to simultaneously segment objects into parts and classify the segmented object.
Our experiments indicate that these models also reduce texture bias and yield better robustness against common corruptions and spurious correlations.
arXiv Detail & Related papers (2022-09-15T15:41:47Z)
- Focal Sparse Convolutional Networks for 3D Object Detection [121.45950754511021]
We introduce two new modules to enhance the capability of Sparse CNNs.
They are focal sparse convolution (Focals Conv) and its multi-modal variant of focal sparse convolution with fusion.
For the first time, we show that spatially learnable sparsity in sparse convolution is essential for sophisticated 3D object detection.
arXiv Detail & Related papers (2022-04-26T17:34:10Z)
- A contextual analysis of multi-layer perceptron models in classifying hand-written digits and letters: limited resources [0.0]
We extensively test an end-to-end vanilla neural network (MLP) approach in pure numpy without any pre-processing or feature extraction done beforehand.
We show that basic data mining operations can significantly improve the performance of the models in terms of computational time.
arXiv Detail & Related papers (2021-07-05T04:30:37Z)
- An Ensemble of Simple Convolutional Neural Network Models for MNIST Digit Recognition [0.0]
A very high accuracy on the MNIST test set can be achieved by using simple convolutional neural network (CNN) models.
A two-layer ensemble, a heterogeneous ensemble of three homogeneous ensemble networks, can achieve up to 99.91% test accuracy.
arXiv Detail & Related papers (2020-08-12T09:27:05Z)
- Machine learning for complete intersection Calabi-Yau manifolds: a methodological study [0.0]
We revisit the question of predicting Hodge numbers $h^{1,1}$ and $h^{2,1}$ of complete intersection Calabi-Yau manifolds using machine learning (ML).
We obtain 97% (resp. 99%) accuracy for $h^{1,1}$ using a neural network inspired by the Inception model for the old dataset, using only 30% (resp. 70%) of the data for training.
For the new one, a simple linear regression leads to almost 100% accuracy with 30% of the data for training.
arXiv Detail & Related papers (2020-07-30T19:43:49Z)
- OvA-INN: Continual Learning with Invertible Neural Networks [0.0]
OvA-INN is able to learn one class at a time, without storing any of the previous data.
We show that we can take advantage of pretrained models by stacking an Invertible Network on top of a feature extractor.
arXiv Detail & Related papers (2020-06-24T14:40:05Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.