Plant Species Recognition with Optimized 3D Polynomial Neural Networks
and Variably Overlapping Time-Coherent Sliding Window
- URL: http://arxiv.org/abs/2203.02611v1
- Date: Fri, 4 Mar 2022 23:37:12 GMT
- Title: Plant Species Recognition with Optimized 3D Polynomial Neural Networks
and Variably Overlapping Time-Coherent Sliding Window
- Authors: Habib Ben Abdallah, Christopher J. Henry, Sheela Ramanna
- Abstract summary: This paper proposes a novel method, called Variably Overlapping Time-Coherent Sliding Window (VOTCSW), that transforms a dataset of variable-size images into a fixed-size 3D representation.
By combining the VOTCSW method with the 3D extension of a recently proposed machine learning model called 1-Dimensional Polynomial Neural Networks, we were able to create a model that achieved a state-of-the-art accuracy of 99.9% on the dataset created by the EAGL-I system.
- Score: 3.867363075280544
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recently, the EAGL-I system was developed to rapidly create massive
labeled datasets of plants, intended to be commonly used by farmers and
researchers to create AI-driven solutions in agriculture. As a result, a
publicly available plant species recognition dataset comprising 40,000
variable-size images of 8 plant species was created with the system in order
to demonstrate its capabilities. This paper proposes a novel method, called
Variably Overlapping Time-Coherent Sliding Window (VOTCSW), that transforms a
dataset of variable-size images into a fixed-size 3D representation suitable
for convolutional neural networks, and demonstrates that this representation
is more informative than resizing the images of the dataset to a given size.
We theoretically formalized the use cases of the method as well as its
inherent properties, and we proved that it has an oversampling and a
regularization effect on the data. By combining the VOTCSW method with the 3D
extension of a recently proposed machine learning model called 1-Dimensional
Polynomial Neural Networks, we created a model that achieved a
state-of-the-art accuracy of 99.9% on the dataset created by the EAGL-I
system, surpassing well-known architectures such as ResNet and Inception. In
addition, we created a heuristic algorithm that reduces the degree of any
pre-trained N-Dimensional Polynomial Neural Network, compressing it without
altering its performance and thus making the model faster and lighter.
Furthermore, we established that the currently available dataset could not be
used for machine learning in its present form, due to a substantial class
imbalance between the training set and the test set. Hence, we created a
specific preprocessing pipeline and model development framework that enabled
us to improve the accuracy from 49.23% to 99.9%.
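The abstract does not spell out the exact windowing scheme, but the core idea of VOTCSW, covering a variable-size image with a fixed number of fixed-size windows whose overlap adapts to the image size, can be sketched as follows. This is a minimal NumPy illustration only: the function name, the square-grid traversal order, and the default window/frame counts are assumptions, not the paper's algorithm.

```python
import numpy as np

def votcsw_sketch(image, window=(64, 64), n_windows=16):
    """Illustrative sketch of a variably overlapping sliding window.

    A fixed number of fixed-size windows is laid over a 2D image of
    arbitrary size; the stride (and hence the overlap) between windows
    adapts to the image dimensions so the output shape is constant.
    The resulting stack of windows forms a fixed-size 3D array.
    """
    H, W = image.shape[:2]
    wh, ww = window
    if H < wh or W < ww:
        raise ValueError("image must be at least as large as the window")
    # Lay the windows on a near-square grid (an assumption for this sketch).
    n_rows = int(np.floor(np.sqrt(n_windows)))
    n_cols = int(np.ceil(n_windows / n_rows))
    # Variably overlapping strides: for smaller images the spacing shrinks
    # (overlap grows) so the window count, and output size, stay fixed.
    ys = np.linspace(0, H - wh, n_rows).astype(int)
    xs = np.linspace(0, W - ww, n_cols).astype(int)
    frames = [image[y:y + wh, x:x + ww] for y in ys for x in xs]
    return np.stack(frames[:n_windows])  # shape: (n_windows, wh, ww)
```

Any two images, regardless of their original sizes, thus map to tensors of identical shape, which is what makes the representation directly consumable by a 3D convolutional model.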
Related papers
- Masked Generative Extractor for Synergistic Representation and 3D Generation of Point Clouds [6.69660410213287]
We propose an innovative framework called Point-MGE to explore the benefits of deeply integrating 3D representation learning and generative learning.
In shape classification, Point-MGE achieved an accuracy of 94.2% (+1.0%) on the ModelNet40 dataset and 92.9% (+5.5%) on the ScanObjectNN dataset.
Experimental results also confirmed that Point-MGE can generate high-quality 3D shapes in both unconditional and conditional settings.
arXiv Detail & Related papers (2024-06-25T07:57:03Z)
- MeshXL: Neural Coordinate Field for Generative 3D Foundation Models [51.1972329762843]
We present a family of generative pre-trained auto-regressive models, which addresses the process of 3D mesh generation with modern large language model approaches.
MeshXL is able to generate high-quality 3D meshes, and can also serve as foundation models for various down-stream applications.
arXiv Detail & Related papers (2024-05-31T14:35:35Z)
- Learning-Based Biharmonic Augmentation for Point Cloud Classification [79.13962913099378]
Biharmonic Augmentation (BA) is a novel and efficient data augmentation technique.
BA diversifies point cloud data by imposing smooth non-rigid deformations on existing 3D structures.
We present AdvTune, an advanced online augmentation system that integrates adversarial training.
arXiv Detail & Related papers (2023-11-10T14:04:49Z)
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- NeRF-GAN Distillation for Efficient 3D-Aware Generation with Convolutions [97.27105725738016]
The integration of Neural Radiance Fields (NeRFs) and generative models, such as Generative Adversarial Networks (GANs), has transformed 3D-aware generation from single-view images.
We propose a simple and effective method, based on re-using the well-disentangled latent space of a pre-trained NeRF-GAN in a pose-conditioned convolutional network to directly generate 3D-consistent images corresponding to the underlying 3D representations.
arXiv Detail & Related papers (2023-03-22T18:59:48Z)
- 3D Generative Model Latent Disentanglement via Local Eigenprojection [13.713373496487012]
We introduce a novel loss function grounded in spectral geometry for different neural-network-based generative models of 3D head and body meshes.
Experimental results show that our local eigenprojection disentangled (LED) models offer improved disentanglement with respect to the state-of-the-art.
arXiv Detail & Related papers (2023-02-24T18:19:49Z)
- NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction [37.357949900603295]
We propose a neural architecture representation model that can be used to estimate attributes holistically.
Experiment results show that our proposed framework can be used to predict the latency and accuracy attributes of both cell architectures and whole deep neural networks.
arXiv Detail & Related papers (2022-11-15T10:15:21Z)
- Scene Synthesis via Uncertainty-Driven Attribute Synchronization [52.31834816911887]
This paper introduces a novel neural scene synthesis approach that can capture diverse feature patterns of 3D scenes.
Our method combines the strength of both neural network-based and conventional scene synthesis approaches.
arXiv Detail & Related papers (2021-08-30T19:45:07Z)
- Model-inspired Deep Learning for Light-Field Microscopy with Application to Neuron Localization [27.247818386065894]
We propose a model-inspired deep learning approach to perform fast and robust 3D localization of sources using light-field microscopy images.
This is achieved by developing a deep network that efficiently solves a convolutional sparse coding problem.
Experiments on localization of mammalian neurons from light-fields show that the proposed approach simultaneously provides enhanced performance, interpretability and efficiency.
arXiv Detail & Related papers (2021-03-10T16:24:47Z)
- Learning Local Neighboring Structure for Robust 3D Shape Representation [143.15904669246697]
Representation learning for 3D meshes is important in many computer vision and graphics applications.
We propose a local structure-aware anisotropic convolutional operation (LSA-Conv)
Our model produces significant improvement in 3D shape reconstruction compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-04-21T13:40:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.