Bag of Tricks for Retail Product Image Classification
- URL: http://arxiv.org/abs/2001.03992v1
- Date: Sun, 12 Jan 2020 20:20:07 GMT
- Title: Bag of Tricks for Retail Product Image Classification
- Authors: Muktabh Mayank Srivastava
- Abstract summary: We present various tricks to increase the accuracy of Deep Learning models on different types of retail product image classification datasets.
A new neural network layer called the Local-Concepts-Accumulation (LCA) layer gives consistent gains across multiple datasets.
Two other tricks we find to increase accuracy on retail product identification are using an Instagram-pretrained ConvNet and using Maximum Entropy as an auxiliary loss for classification.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Retail Product Image Classification is an important Computer Vision and
Machine Learning problem for building real world systems like self-checkout
stores and automated retail execution evaluation. In this work, we present
various tricks to increase the accuracy of Deep Learning models on different
types of retail product image classification datasets. These tricks enable us
to increase the accuracy of fine-tuned ConvNets for retail product image
classification by a large margin. As the most prominent trick, we introduce a
new neural network layer called the Local-Concepts-Accumulation (LCA) layer,
which gives consistent gains across multiple datasets. Two other tricks we find
to increase accuracy on retail product identification are using an
Instagram-pretrained ConvNet and using Maximum Entropy as an auxiliary loss for
classification.
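The Maximum Entropy auxiliary loss named above can be sketched in a few lines. This is an illustrative reconstruction, not the paper's implementation: the softmax and entropy computations are standard, and the weight `lam` is a hypothetical hyperparameter trading off the entropy bonus against cross-entropy.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of raw scores.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def max_entropy_loss(logits, target, lam=0.1):
    """Cross-entropy with a Maximum-Entropy auxiliary term.

    Subtracting lam * H(p) rewards higher-entropy (less confident)
    predictions, discouraging overconfidence on fine-grained classes.
    """
    p = softmax(logits)
    ce = -math.log(p[target])
    entropy = -sum(q * math.log(q) for q in p if q > 0.0)
    return ce - lam * entropy
```

With `lam=0.0` this reduces to plain cross-entropy; larger `lam` penalizes peaked output distributions more strongly.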
Related papers
- Additional Look into GAN-based Augmentation for Deep Learning COVID-19 Image Classification
We study the dependence of the GAN-based augmentation performance on dataset size with a focus on small samples.
We train StyleGAN2-ADA with both sets and then, after validating the quality of generated images, we use trained GANs as one of the augmentations approaches in multi-class classification problems.
The GAN-based augmentation approach is found to be comparable with classical augmentation in the case of medium and large datasets but underperforms in the case of smaller datasets.
arXiv Detail & Related papers (2024-01-26T08:28:13Z)
- RetailKLIP: Finetuning OpenCLIP backbone using metric learning on a single GPU for Zero-shot retail product image classification
We propose finetuning the vision encoder of a CLIP model in a way that its embeddings can be easily used for nearest neighbor based classification.
A nearest neighbor based classification needs no incremental training for new products, thus saving resources and wait time.
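The nearest-neighbor classification step described here can be sketched as below. It assumes image embeddings have already been produced by the finetuned CLIP vision encoder and are represented as plain float lists; the `GalleryClassifier` class and its method names are hypothetical helpers for illustration, not part of RetailKLIP.

```python
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

class GalleryClassifier:
    """Gallery-based 1-NN classifier over embedding vectors.

    New products are enrolled by adding one reference embedding;
    no incremental training is required.
    """
    def __init__(self):
        self.gallery = []  # list of (embedding, label) pairs

    def add(self, embedding, label):
        self.gallery.append((embedding, label))

    def predict(self, embedding):
        # Return the label of the most similar enrolled product.
        best = max(self.gallery, key=lambda e: cosine(embedding, e[0]))
        return best[1]
```

Enrolling a new product is a single `add` call, which is why this setup needs no retraining when the catalog grows.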
arXiv Detail & Related papers (2023-12-16T01:23:42Z)
- Scalable Federated Learning for Clients with Different Input Image Sizes and Numbers of Output Categories
Federated learning is a privacy-preserving training method in which a model is trained across a plurality of clients without sharing their confidential data.
We propose an effective federated learning method named ScalableFL, where the depths and widths of the local models for each client are adjusted according to the clients' input image size and the numbers of output categories.
arXiv Detail & Related papers (2023-11-15T05:43:14Z)
- Facilitated machine learning for image-based fruit quality assessment in developing countries
Automated image classification is a common task for supervised machine learning in food science.
We propose an alternative method based on pre-trained vision transformers (ViTs).
It can be easily implemented with limited resources on a standard device.
arXiv Detail & Related papers (2022-07-10T19:52:20Z)
- Self Supervised Learning for Few Shot Hyperspectral Image Classification
We propose to leverage Self Supervised Learning (SSL) for HSI classification.
We show that by pre-training an encoder on unlabeled pixels using Barlow-Twins, a state-of-the-art SSL algorithm, we can obtain accurate models with a handful of labels.
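The Barlow Twins objective referenced here pushes the cross-correlation matrix of two augmented views' embeddings toward the identity: diagonal entries toward 1 (invariance) and off-diagonal entries toward 0 (redundancy reduction). Below is a minimal pure-Python sketch of the loss; the batch layout and the trade-off weight `lam` are illustrative, not values taken from the paper.

```python
import math

def barlow_twins_loss(za, zb, lam=5e-3):
    """Barlow Twins loss on two batches of embeddings.

    za, zb: lists of equal-length embedding vectors (batch x dim),
    one per augmented view of the same inputs.
    """
    n, d = len(za), len(za[0])

    def normalize(z):
        # Standardize each embedding dimension over the batch.
        out = []
        for col in zip(*z):
            mu = sum(col) / n
            sd = math.sqrt(sum((x - mu) ** 2 for x in col) / n) or 1.0
            out.append([(x - mu) / sd for x in col])
        return out  # dim x batch layout

    a, b = normalize(za), normalize(zb)
    loss = 0.0
    for i in range(d):
        for j in range(d):
            c = sum(a[i][k] * b[j][k] for k in range(n)) / n
            # Diagonal: pull toward 1; off-diagonal: pull toward 0.
            loss += (1.0 - c) ** 2 if i == j else lam * c * c
    return loss
```

In practice the encoder is a deep network trained on large batches; this sketch only shows the objective being minimized.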
arXiv Detail & Related papers (2022-06-24T07:21:53Z)
- Memory Classifiers: Two-stage Classification for Robustness in Machine Learning
We propose a new method for classification which can improve robustness to distribution shifts.
We combine expert knowledge about the high-level structure of the data with standard classifiers.
We show improvements which push beyond standard data augmentation techniques.
arXiv Detail & Related papers (2022-06-10T18:44:45Z)
- Masked Unsupervised Self-training for Zero-shot Image Classification
Masked Unsupervised Self-Training (MUST) is a new approach which leverages two different and complementary sources of supervision: pseudo-labels and raw images.
MUST improves upon CLIP by a large margin and narrows the performance gap between unsupervised and supervised classification.
arXiv Detail & Related papers (2022-06-07T02:03:06Z)
- An Improved Deep Learning Approach For Product Recognition on Racks in Retail Stores
Automated product recognition in retail stores is an important real-world application in the domain of Computer Vision and Pattern Recognition.
We develop a two-stage object detection and recognition pipeline comprising a Faster-RCNN-based object localizer and a ResNet-18-based image encoder.
Each model is fine-tuned on an appropriate dataset for better prediction, and data augmentation is performed on each query image to prepare an extensive gallery set for fine-tuning the ResNet-18-based product recognition model.
arXiv Detail & Related papers (2022-02-26T06:51:36Z)
- Using Contrastive Learning and Pseudolabels to learn representations for Retail Product Image Classification
We use contrastive learning and pseudolabel-based noisy student training to learn representations whose accuracy is on the order of fine-tuning the entire ConvNet backbone for retail product image classification.
arXiv Detail & Related papers (2021-10-07T17:29:05Z)
- Calibrating Class Activation Maps for Long-Tailed Visual Recognition
We present two effective modifications of CNNs to improve network learning from long-tailed distributions.
First, we present a Class Activation Map Calibration (CAMC) module to improve the learning and prediction of network classifiers.
Second, we investigate the use of normalized classifiers for representation learning in long-tailed problems.
arXiv Detail & Related papers (2021-08-29T05:45:03Z)
- Attention-Aware Noisy Label Learning for Image Classification
Deep convolutional neural networks (CNNs) learned on large-scale labeled samples have achieved remarkable progress in computer vision.
The cheapest way to obtain a large body of labeled visual data is to crawl from websites with user-supplied labels, such as Flickr.
This paper proposes the attention-aware noisy label learning approach to improve the discriminative capability of the network trained on datasets with potential label noise.
arXiv Detail & Related papers (2020-09-30T15:45:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.