Explicitly incorporating spatial information to recurrent networks for
agriculture
- URL: http://arxiv.org/abs/2206.13406v1
- Date: Mon, 27 Jun 2022 15:57:42 GMT
- Title: Explicitly incorporating spatial information to recurrent networks for
agriculture
- Authors: Claus Smitt, Michael Halstead, Alireza Ahmadi, and Chris McCool
- Abstract summary: We propose novel approaches to improve the classification of deep convolutional neural networks.
We leverage available RGB-D images and robot odometry to perform inter-frame feature map spatial registration.
This information is then fused within recurrent deep learnt models to improve their accuracy and robustness.
- Score: 4.583080280213959
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In agriculture, the majority of vision systems perform still image
classification. Yet, recent work has highlighted the potential of spatial and
temporal cues as a rich source of information to improve the classification
performance. In this paper, we propose novel approaches to explicitly capture
both spatial and temporal information to improve the classification of deep
convolutional neural networks. We leverage available RGB-D images and robot
odometry to perform inter-frame feature map spatial registration. This
information is then fused within recurrent deep learnt models to improve their
accuracy and robustness. We demonstrate that this can considerably improve the
classification performance with our best performing spatial-temporal model
(ST-Atte) achieving absolute performance improvements for
intersection-over-union (IoU[%]) of 4.7 for crop-weed segmentation and 2.6 for
fruit (sweet pepper) segmentation. Furthermore, we show that these approaches
are robust to variable framerates and odometry errors, which are frequently
observed in real-world applications.
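The inter-frame feature map spatial registration described in the abstract can be sketched roughly as follows: warp the previous frame's feature map into the current camera view using per-pixel depth and the relative pose supplied by odometry. This is a minimal, hypothetical numpy sketch (pinhole intrinsics `K`, nearest-neighbour forward warping), not the authors' implementation:

```python
import numpy as np

def register_feature_map(feat_prev, depth_prev, K, R, t):
    """Warp the previous frame's feature map into the current frame
    using per-pixel depth and the relative camera pose (R, t) from
    odometry. Nearest-neighbour scatter; out-of-view cells stay zero.
    """
    H, W, C = feat_prev.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    # Back-project previous-frame pixels to 3D camera coordinates.
    z = depth_prev
    x = (u - K[0, 2]) * z / K[0, 0]
    y = (v - K[1, 2]) * z / K[1, 1]
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    # Move the points into the current camera frame and re-project.
    pts_cur = pts @ R.T + t
    z_cur = pts_cur[:, 2].clip(min=1e-6)
    u_cur = (K[0, 0] * pts_cur[:, 0] / z_cur + K[0, 2]).round().astype(int)
    v_cur = (K[1, 1] * pts_cur[:, 1] / z_cur + K[1, 2]).round().astype(int)
    valid = (u_cur >= 0) & (u_cur < W) & (v_cur >= 0) & (v_cur < H)
    warped = np.zeros_like(feat_prev)
    warped[v_cur[valid], u_cur[valid]] = feat_prev.reshape(-1, C)[valid]
    return warped
```

The registered map can then be concatenated or attended over inside a recurrent cell, which is how the paper fuses the spatial cue with temporal context.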
Related papers
- How Important are Data Augmentations to Close the Domain Gap for Object Detection in Orbit? [15.550663626482903]
We investigate the efficacy of data augmentations to close the domain gap in spaceborne computer vision.
We propose two novel data augmentations specifically developed to emulate the visual effects observed in orbital imagery.
arXiv Detail & Related papers (2024-10-21T08:24:46Z)
- DCNN: Dual Cross-current Neural Networks Realized Using An Interactive Deep Learning Discriminator for Fine-grained Objects [48.65846477275723]
This study proposes novel dual-current neural networks (DCNN) to improve the accuracy of fine-grained image classification.
The main novel design features for constructing a weakly supervised learning backbone model DCNN include (a) extracting heterogeneous data, (b) keeping the feature map resolution unchanged, (c) expanding the receptive field, and (d) fusing global representations and local features.
arXiv Detail & Related papers (2024-05-07T07:51:28Z)
- Investigating Temporal Convolutional Neural Networks for Satellite Image Time Series Classification: A survey [0.0]
Temporal CNNs have been employed for SITS classification tasks with encouraging results.
This paper seeks to survey this method against a plethora of other contemporary methods for SITS classification to validate the existing findings in recent literature.
Experiments are carried out on two benchmark SITS datasets with the results demonstrating that Temporal CNNs display a superior performance to the comparative benchmark algorithms.
arXiv Detail & Related papers (2022-04-13T14:08:14Z)
- Weakly Supervised Change Detection Using Guided Anisotropic Diffusion [97.43170678509478]
We propose original ideas that help us to leverage such datasets in the context of change detection.
First, we propose the guided anisotropic diffusion (GAD) algorithm, which improves semantic segmentation results.
We then show its potential in two weakly-supervised learning strategies tailored for change detection.
arXiv Detail & Related papers (2021-12-31T10:03:47Z)
- Calibrating Class Activation Maps for Long-Tailed Visual Recognition [60.77124328049557]
We present two effective modifications of CNNs to improve network learning from long-tailed distribution.
First, we present a Class Activation Map (CAMC) module to improve the learning and prediction of network classifiers.
Second, we investigate the use of normalized classifiers for representation learning in long-tailed problems.
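A normalized classifier of the kind mentioned above is commonly realized as a cosine-similarity head, in which both features and class weights are L2-normalized so that logit magnitudes no longer track class frequency. The following is a generic sketch under that assumption (the `scale` temperature and shapes are illustrative, not the paper's configuration):

```python
import numpy as np

def normalized_logits(features, weights, scale=16.0):
    """Cosine-similarity classifier head for long-tailed recognition:
    L2-normalize both features (N x D) and class weights (K x D), so
    each logit is a bounded cosine score times a temperature `scale`.
    """
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    return scale * f @ w.T
```

Because every logit lies in [-scale, scale], head classes cannot dominate purely through large weight norms.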
arXiv Detail & Related papers (2021-08-29T05:45:03Z)
- Spatial-Temporal Correlation and Topology Learning for Person Re-Identification in Videos [78.45050529204701]
We propose a novel framework to pursue discriminative and robust representation by modeling cross-scale spatial-temporal correlation.
CTL utilizes a CNN backbone and a key-points estimator to extract semantic local features from the human body.
It explores a context-reinforced topology to construct multi-scale graphs by considering both global contextual information and the physical connections of the human body.
arXiv Detail & Related papers (2021-04-15T14:32:12Z)
- Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z)
- On estimating gaze by self-attention augmented convolutions [6.015556590955813]
We propose a novel network architecture grounded on self-attention augmented convolutions to improve the quality of the learned features.
We dubbed our framework ARes-gaze, which explores our Attention-augmented ResNet (ARes-14) as twin convolutional backbones.
Results showed a decrease of the average angular error by 2.38% compared to state-of-the-art methods on the MPIIFaceGaze data set, and a second-place result on the EyeDiap data set.
arXiv Detail & Related papers (2020-08-25T14:29:05Z)
- Data Augmentation via Mixed Class Interpolation using Cycle-Consistent Generative Adversarial Networks Applied to Cross-Domain Imagery [16.870604081967866]
Machine learning driven object detection and classification within non-visible imagery has an important role in many fields.
However, such applications often suffer due to the limited quantity and variety of non-visible spectral domain imagery.
This paper proposes and evaluates a novel data augmentation approach that leverages the more readily available visible-band imagery.
arXiv Detail & Related papers (2020-05-05T18:53:38Z)
- Improvement in Land Cover and Crop Classification based on Temporal Features Learning from Sentinel-2 Data Using Recurrent-Convolutional Neural Network (R-CNN) [1.0312968200748118]
This paper develops a novel deep learning model for pixel-based land cover and crop classification (LC&CC) based on Recurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs).
Fifteen classes, including major agricultural crops, were considered in this study.
The overall accuracy achieved by our proposed Pixel R-CNN was 96.5%, which showed considerable improvements in comparison with existing mainstream methods.
arXiv Detail & Related papers (2020-04-27T15:39:50Z)
- Spatial-Spectral Residual Network for Hyperspectral Image Super-Resolution [82.1739023587565]
We propose a novel spectral-spatial residual network for hyperspectral image super-resolution (SSRNet).
Our method can effectively explore spatial-spectral information by using 3D convolution instead of 2D convolution, which enables the network to better extract potential information.
In each unit, we employ spatial and spectral separable 3D convolution to extract spatial and spectral information, which not only avoids otherwise prohibitive memory usage and computational cost, but also makes the network easier to train.
arXiv Detail & Related papers (2020-01-14T03:34:55Z)
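The separable 3D convolution used by SSRNet factorizes a full k x k x k kernel into a spatial (1 x k x k) pass followed by a spectral (k x 1 x 1) pass. A minimal single-channel numpy sketch (illustrative only, not the paper's implementation) shows that the two passes compose to the full separable kernel while using 9 + 3 = 12 weights instead of 27:

```python
import numpy as np

def corr3d(x, k):
    """Valid-mode 3D cross-correlation (a conv layer without kernel flip)."""
    D, H, W = x.shape
    d, h, w = k.shape
    out = np.zeros((D - d + 1, H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            for l in range(out.shape[2]):
                out[i, j, l] = np.sum(x[i:i + d, j:j + h, l:l + w] * k)
    return out

rng = np.random.default_rng(0)
k_spatial = rng.standard_normal((1, 3, 3))   # spatial-only kernel
k_spectral = rng.standard_normal((3, 1, 1))  # spectral-only kernel
x = rng.standard_normal((8, 8, 8))           # (bands, height, width)

# Sequential spatial-then-spectral passes...
y_sep = corr3d(corr3d(x, k_spatial), k_spectral)
# ...match a single pass with the full outer-product 3x3x3 kernel.
k_full = k_spectral * k_spatial  # broadcasting builds the rank-1 kernel
y_full = corr3d(x, k_full)
```

The equivalence holds exactly because the factorized kernel is rank-1 along the spectral axis; real hyperspectral kernels are only approximated this way, which is the trade-off the paper exploits.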
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.