Related papers: An Artificial Intelligence System for Combined Fruit Detection and Georeferencing, Using RTK-Based Perspective Projection in Drone Imagery

URL: http://arxiv.org/abs/2101.00339v1
Date: Fri, 1 Jan 2021 23:39:55 GMT
Title: An Artificial Intelligence System for Combined Fruit Detection and Georeferencing, Using RTK-Based Perspective Projection in Drone Imagery
Authors: Angus Baird and Stefano Giani
Abstract summary: This work presents an Artificial Intelligence (AI) system, which detects and counts apples from aerial drone imagery of commercial orchards. To reduce computational cost, a novel precursory stage to the network is designed to preprocess raw imagery into cropped images of individual trees. Unique geospatial identifiers are allocated to these using the perspective projection model. Experiments show that a k-means clustering approach, never before seen in the literature for Faster R-CNN, resulted in the most significant improvements to calibrated mAP.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This work presents an Artificial Intelligence (AI) system, based on the Faster Region-Based Convolutional Neural Network (Faster R-CNN) framework, which detects and counts apples from oblique, aerial drone imagery of giant commercial orchards. To reduce computational cost, a novel precursory stage to the network is designed to preprocess raw imagery into cropped images of individual trees. Unique geospatial identifiers are allocated to these using the perspective projection model. This employs Real-Time Kinematic (RTK) data, Digital Terrain and Surface Models (DTM and DSM), as well as internal and external camera parameters. The bulk of the experiments, however, focuses on tuning hyperparameters in the detection network itself. Apples on trees and apples on the ground are treated as separate classes. A mean Average Precision (mAP) metric, calibrated by the size of the two classes, is devised to mitigate spurious results. Anchor box design is of key interest due to the scale of the apples. As such, a k-means clustering approach, never before seen in the literature for Faster R-CNN, resulted in the most significant improvements to calibrated mAP. Other experiments showed that the maximum number of box proposals should be 225; that an initial learning rate of 0.001 is best applied to the adaptive RMSProp optimiser; and that ResNet-101 is the ideal base feature extractor when considering mAP and, to a lesser extent, inference time. The amalgamation of the optimal hyperparameters leads to a model with a calibrated mAP of 0.7627.
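
The abstract does not spell out how the k-means anchor box design works, so the following is only a minimal sketch of the general idea: cluster the (width, height) pairs of labelled boxes and use the cluster centres as anchor shapes. The function names, the Euclidean distance, the seeding strategy and the scale/aspect-ratio convention are all assumptions made for illustration, not the authors' implementation.

```python
import math
import random
from typing import List, Tuple

Box = Tuple[float, float]  # (width, height) of a labelled box, in pixels


def kmeans_anchors(boxes: List[Box], k: int, iters: int = 50, seed: int = 0) -> List[Box]:
    """Cluster box sizes with plain Euclidean k-means (an assumed distance;
    the paper may use a different one) and return k centroids sorted by area."""
    if not 0 < k <= len(boxes):
        raise ValueError("k must be between 1 and the number of boxes")
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)  # seed centroids with k distinct boxes
    for _ in range(iters):
        clusters: List[List[Box]] = [[] for _ in range(k)]
        for w, h in boxes:
            # Assign each box to its nearest centroid in (w, h) space.
            nearest = min(
                range(k),
                key=lambda i: (w - centroids[i][0]) ** 2 + (h - centroids[i][1]) ** 2,
            )
            clusters[nearest].append((w, h))
        updated = []
        for i, members in enumerate(clusters):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                updated.append((mean_w, mean_h))
            else:
                updated.append(centroids[i])  # keep the centroid of an empty cluster
        if updated == centroids:
            break  # assignments are stable
        centroids = updated
    return sorted(centroids, key=lambda c: c[0] * c[1])


def to_scale_and_ratio(anchor: Box) -> Tuple[float, float]:
    """Convert (w, h) to (scale, aspect ratio), taking scale = sqrt(w * h)
    and ratio = h / w. This convention is an assumption; frameworks differ."""
    w, h = anchor
    return math.sqrt(w * h), h / w


if __name__ == "__main__":
    # Toy box sizes: one group of small boxes and one group of larger ones.
    sizes = [(10, 10), (12, 12), (11, 11), (40, 40), (42, 42), (41, 41)]
    for anchor in kmeans_anchors(sizes, k=2):
        print(anchor, to_scale_and_ratio(anchor))
```

In a real pipeline the input would be the widths and heights of the annotated apple boxes, and the resulting scales and ratios would be passed to the detector's anchor generator in place of its defaults.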

Related papers

MPruner: Optimizing Neural Network Size with CKA-Based Mutual Information Pruning [7.262751938473306]
Pruning is a well-established technique that reduces the size of neural networks while mathematically guaranteeing accuracy preservation. We develop a new pruning algorithm, MPruner, that leverages mutual information through vector similarity. MPruner achieved up to a 50% reduction in parameters and memory usage for CNN and transformer-based models, with minimal to no loss in accuracy.
arXiv Detail & Related papers (2024-08-24T05:54:47Z)
Memory-efficient particle filter recurrent neural network for object localization [53.68402839500528]
This study proposes a novel memory-efficient recurrent neural network (RNN) architecture specified to solve the object localization problem. We take the idea of the classical particle filter and combine it with GRU RNN architecture. In our experiments, the mePFRNN model provides more precise localization than the considered competitors and requires fewer trained parameters.
arXiv Detail & Related papers (2023-10-02T19:41:19Z)
Heuristic Hyperparameter Choice for Image Anomaly Detection [0.3867363075280543]
Anomaly detection in images is a fundamental computer vision problem commonly tackled with deep neural networks. Models are usually pretrained on a large dataset for classification tasks such as ImageNet. We aim to reduce the dimensionality of these features using Negated Principal Component Analysis (NPCA).
arXiv Detail & Related papers (2023-07-20T19:20:35Z)
Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures. This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead. We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method by optimizing the sparse structure of a randomly initialized network at each iteration and tweaking unimportant weights with a small amount proportional to the magnitude scale on-the-fly.
arXiv Detail & Related papers (2023-03-16T21:06:13Z)
On the optimization and pruning for Bayesian deep learning [1.0152838128195467]
We propose a new adaptive variational Bayesian algorithm to train neural networks on weight space. The EM-MCMC algorithm allows us to perform optimization and model pruning in one shot. Our dense model can reach state-of-the-art performance and our sparse model performs very well compared to previously proposed pruning schemes.
arXiv Detail & Related papers (2022-10-24T05:18:08Z)
EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications [68.35683849098105]
We introduce a split depth-wise transpose attention (SDTA) encoder that splits input tensors into multiple channel groups. Our EdgeNeXt model with 1.3M parameters achieves 71.2% top-1 accuracy on ImageNet-1K. Our EdgeNeXt model with 5.6M parameters achieves 79.4% top-1 accuracy on ImageNet-1K.
arXiv Detail & Related papers (2022-06-21T17:59:56Z)
Core Risk Minimization using Salient ImageNet [53.616101711801484]
We introduce the Salient ImageNet dataset with more than 1 million soft masks localizing core and spurious features for all 1000 ImageNet classes. Using this dataset, we first evaluate the reliance of several ImageNet pretrained models (42 total) on spurious features. Next, we introduce a new learning paradigm called Core Risk Minimization (CoRM) whose objective ensures that the model predicts a class using its core features.
arXiv Detail & Related papers (2022-03-28T01:53:34Z)
New SAR target recognition based on YOLO and very deep multi-canonical correlation analysis [0.1503974529275767]
This paper proposes a robust feature extraction method for SAR image target classification by adaptively fusing effective features from different CNN layers. Experiments on the MSTAR dataset demonstrate that the proposed method outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2021-10-28T18:10:26Z)
ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware. The proposed methodology extracts a set of models from micro-kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation. We compare estimation accuracy and fidelity of the generated mixed models, statistical models with the roofline model, and a refined roofline model for evaluation.
arXiv Detail & Related papers (2021-05-07T11:39:05Z)
Classification of Polarimetric SAR Images Using Compact Convolutional Neural Networks [24.553598498985796]
A novel and systematic classification framework is proposed for the classification of PolSAR images. It is based on a compact and adaptive implementation of CNNs using a sliding-window classification approach. The proposed approach can perform classification using smaller window sizes than deep CNNs.
arXiv Detail & Related papers (2020-11-10T17:09:11Z)
Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture. We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions. Results show our method's effectiveness in detecting small-scaled pedestrians.
arXiv Detail & Related papers (2020-08-19T13:13:01Z)

This list is automatically generated from the titles and abstracts of the papers on this site.