Deep Learning for Automatic Quality Grading of Mangoes: Methods and
Insights
- URL: http://arxiv.org/abs/2011.11378v1
- Date: Mon, 23 Nov 2020 13:09:47 GMT
- Title: Deep Learning for Automatic Quality Grading of Mangoes: Methods and
Insights
- Authors: Shih-Lun Wu, Hsiao-Yen Tung, Yu-Lun Hsu
- Abstract summary: The paper approaches the grading task with various convolutional neural networks (CNN), a tried-and-tested deep learning technology in computer vision.
The models involved include Mask R-CNN (for background removal) and several past winners of the ImageNet challenge, namely AlexNet, VGGs, and ResNets.
The paper provides explainable insights into the models' workings with the help of saliency maps and principal component analysis (PCA).
- Score: 1.0742675209112622
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The quality grading of mangoes is a crucial task for mango growers as it
vastly affects their profit. However, this process still relies on the
laborious efforts of humans, who are prone to fatigue and errors. To remedy
this, the paper approaches the grading task with various convolutional neural
networks (CNN), a tried-and-tested deep learning technology in computer vision.
The models involved include Mask R-CNN (for background removal), the numerous
past winners of the ImageNet challenge, namely AlexNet, VGGs, and ResNets; and,
a family of self-defined convolutional autoencoder-classifiers (ConvAE-Clfs)
inspired by the claimed benefit of multi-task learning in classification tasks.
Transfer learning is also adopted in this work via utilizing the ImageNet
pretrained weights. Besides elaborating on the preprocessing techniques,
training details, and the resulting performance, we go one step further to
provide explainable insights into the models' workings with the help of saliency
maps and principal component analysis (PCA). These insights provide a succinct,
meaningful glimpse into the intricate deep learning black box, fostering trust,
and can also be presented to humans in real-world use cases for reviewing the
grading results.
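As a rough illustration of the PCA step described above, the sketch below projects CNN feature vectors onto their top two principal components, the kind of 2-D view one could inspect when reviewing grading results. This is a minimal sketch: the feature matrix is random stand-in data, not the paper's embeddings, and the implementation is a generic SVD-based PCA rather than the authors' exact pipeline.

```python
import numpy as np

def pca_2d(features):
    """Project feature vectors to 2-D via PCA (SVD on centered data)."""
    X = features - features.mean(axis=0)           # center each dimension
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:2].T                            # coordinates along top-2 PCs

rng = np.random.default_rng(0)
feats = rng.normal(size=(50, 512))                 # stand-in for CNN embeddings
coords = pca_2d(feats)
print(coords.shape)                                # (50, 2)
```

Plotting `coords` colored by predicted grade would give the kind of succinct, human-reviewable summary of the learned feature space that the abstract alludes to.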
Related papers
- Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
- CogCoM: Train Large Vision-Language Models Diving into Details through Chain of Manipulations [61.21923643289266]
Chain of Manipulations is a mechanism that enables Vision-Language Models to solve problems step-by-step with evidence.
After training, models can solve various visual problems by eliciting intrinsic manipulations (e.g., grounding, zoom in) actively without involving external tools.
Our trained model, CogCoM, achieves state-of-the-art performance across 9 benchmarks from 4 categories.
arXiv Detail & Related papers (2024-02-06T18:43:48Z)
- CAManim: Animating end-to-end network activation maps [0.2509487459755192]
We propose a novel XAI visualization method denoted CAManim that seeks to broaden and focus end-user understanding of CNN predictions.
We additionally propose a novel quantitative assessment that expands upon the Remove and Debias (ROAD) metric.
This builds upon prior research to address the increasing demand for interpretable, robust, and transparent model assessment methodology.
arXiv Detail & Related papers (2023-12-19T01:07:36Z)
- Look-Ahead Selective Plasticity for Continual Learning of Visual Tasks [9.82510084910641]
We propose a new mechanism that takes place during task boundaries, i.e., when one task finishes and another starts.
We evaluate the proposed methods on benchmark computer vision datasets including CIFAR10 and TinyImageNet.
arXiv Detail & Related papers (2023-11-02T22:00:23Z)
- Differentiable Weight Masks for Domain Transfer [2.008400316189417]
One of the major drawbacks of deep learning models for computer vision has been their inability to retain multiple sources of information in a modular fashion.
We study three such weight masking methods to analyse their ability to mitigate "forgetting" on the source task.
We find that different masking techniques have trade-offs in retaining knowledge in the source task without adversely affecting target task performance.
arXiv Detail & Related papers (2023-08-26T20:45:52Z)
- MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments [72.6405488990753]
Self-supervised learning can be used to mitigate the heavy data requirements of Vision Transformer networks.
We propose a single-stage and standalone method, MOCA, which unifies both desired properties.
We achieve new state-of-the-art results on low-shot settings and strong experimental results in various evaluation protocols.
arXiv Detail & Related papers (2023-07-18T15:46:20Z)
- PRSNet: A Masked Self-Supervised Learning Pedestrian Re-Identification Method [2.0411082897313984]
This paper designs a pretext task of mask reconstruction to obtain a pre-training model with strong robustness.
Network training is optimized with an improved, centroid-based triplet loss.
This method achieves about 5% higher mAP on Market1501 and CUHK03 than existing self-supervised learning pedestrian re-identification methods.
arXiv Detail & Related papers (2023-03-11T07:20:32Z)
- Explainability-aided Domain Generalization for Image Classification [0.0]
We show that applying methods and architectures from the explainability literature can achieve state-of-the-art performance for the challenging task of domain generalization.
We develop a set of novel algorithms including DivCAM, an approach where the network receives guidance during training via gradient based class activation maps to focus on a diverse set of discriminative features.
Since these methods offer competitive performance on top of explainability, we argue that the proposed methods can be used as a tool to improve the robustness of deep neural network architectures.
arXiv Detail & Related papers (2021-04-05T02:27:01Z)
- Auto-Rectify Network for Unsupervised Indoor Depth Estimation [119.82412041164372]
We establish that the complex ego-motions exhibited in handheld settings are a critical obstacle for learning depth.
We propose a data pre-processing method that rectifies training images by removing their relative rotations for effective learning.
Our results outperform the previous unsupervised SOTA method by a large margin on the challenging NYUv2 dataset.
arXiv Detail & Related papers (2020-06-04T08:59:17Z)
- Multi-Task Learning for Dense Prediction Tasks: A Survey [87.66280582034838]
Multi-task learning (MTL) techniques have shown promising results w.r.t. performance, computations and/or memory footprint.
We provide a well-rounded view on state-of-the-art deep learning approaches for MTL in computer vision.
arXiv Detail & Related papers (2020-04-28T09:15:50Z)
- Curriculum By Smoothing [52.08553521577014]
Convolutional Neural Networks (CNNs) have shown impressive performance in computer vision tasks such as image classification, detection, and segmentation.
We propose an elegant curriculum-based scheme that smooths the feature embeddings of a CNN using anti-aliasing or low-pass filters.
As the amount of information in the feature maps increases during training, the network is able to progressively learn better representations of the data.
arXiv Detail & Related papers (2020-03-03T07:27:44Z)
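The curriculum-by-smoothing entry above can be illustrated with a toy sketch: blur a feature map with a Gaussian low-pass filter whose width is annealed toward zero as training progresses. This is a minimal 1-D NumPy sketch, not the paper's implementation; the feature values and the sigma schedule are invented for illustration.

```python
import numpy as np

def gaussian_kernel(sigma, radius=3):
    """Normalized 1-D Gaussian kernel of width sigma."""
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def smooth(signal, sigma):
    """Low-pass filter a 1-D feature map, preserving its length."""
    return np.convolve(signal, gaussian_kernel(sigma), mode="same")

feat = np.array([0., 0., 1., 0., 0., 1., 1., 0.])   # toy 1-D feature map
for epoch, sigma in enumerate([2.0, 1.0, 0.25]):    # anneal the blur over training
    print(epoch, smooth(feat, sigma).round(2))
```

Early "epochs" see a heavily smoothed map; as sigma shrinks the filter approaches an identity, letting progressively more high-frequency detail through, which is the intuition the summary describes.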
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.