Plant Species Recognition with Optimized 3D Polynomial Neural Networks
and Variably Overlapping Time-Coherent Sliding Window
- URL: http://arxiv.org/abs/2203.02611v1
- Date: Fri, 4 Mar 2022 23:37:12 GMT
- Title: Plant Species Recognition with Optimized 3D Polynomial Neural Networks
and Variably Overlapping Time-Coherent Sliding Window
- Authors: Habib Ben Abdallah, Christopher J. Henry, Sheela Ramanna
- Abstract summary: This paper proposes a novel method, called Variably Overlapping Time-Coherent Sliding Window (VOTCSW), that transforms a dataset of variable-size images into a fixed-size 3D representation.
By combining the VOTCSW method with the 3D extension of a recently proposed machine learning model called 1-Dimensional Polynomial Neural Networks, we were able to create a model that achieved a state-of-the-art accuracy of 99.9% on the dataset created by the EAGL-I system.
- Score: 3.867363075280544
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recently, the EAGL-I system was developed to rapidly create massive
labeled datasets of plants, intended to be commonly used by farmers and
researchers to create AI-driven solutions in agriculture. As a result, a
publicly available plant species recognition dataset comprising 40,000
variable-size images of 8 plant species was created with the system in order
to demonstrate its capabilities. This paper proposes a novel method, called
Variably Overlapping Time-Coherent Sliding Window (VOTCSW), that transforms a
dataset of variable-size images into a fixed-size 3D representation suitable
for convolutional neural networks, and demonstrates that this representation
is more informative than resizing the images of the dataset to a given size.
We theoretically formalized the use cases of the method as well as its
inherent properties, and we proved that it has an oversampling and a
regularization effect on the data. By combining the VOTCSW method with the 3D
extension of a recently proposed machine learning model called 1-Dimensional
Polynomial Neural Networks, we created a model that achieved a
state-of-the-art accuracy of 99.9% on the dataset created by the EAGL-I
system, surpassing well-known architectures such as ResNet and Inception. In
addition, we created a heuristic algorithm that reduces the degree of any
pre-trained N-Dimensional Polynomial Neural Network, compressing it without
altering its performance and thus making the model faster and lighter.
Furthermore, we established that the currently available dataset could not be
used for machine learning in its present form, due to a substantial class
imbalance between the training set and the test set. Hence, we created a
specific preprocessing pipeline and model development framework that enabled
us to improve the accuracy from 49.23% to 99.9%.
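The abstract does not spell out the exact windowing scheme, but the core idea of VOTCSW, covering a variable-size image with a fixed number of fixed-size windows whose overlap adapts to the image size, can be sketched as follows. This is a minimal NumPy illustration only: the function name, the square-grid traversal order, and the default window/frame counts are assumptions, not the paper's algorithm.

```python
import numpy as np

def votcsw_sketch(image, window=(64, 64), n_windows=16):
    """Illustrative sketch of a variably overlapping sliding window.

    A fixed number of fixed-size windows is laid over a 2D image of
    arbitrary size; the stride (and hence the overlap) between windows
    adapts to the image dimensions so the output shape is constant.
    The resulting stack of windows forms a fixed-size 3D array.
    """
    H, W = image.shape[:2]
    wh, ww = window
    if H < wh or W < ww:
        raise ValueError("image must be at least as large as the window")
    # Lay the windows on a near-square grid (an assumption for this sketch).
    n_rows = int(np.floor(np.sqrt(n_windows)))
    n_cols = int(np.ceil(n_windows / n_rows))
    # Variably overlapping strides: for smaller images the spacing shrinks
    # (overlap grows) so the window count, and output size, stay fixed.
    ys = np.linspace(0, H - wh, n_rows).astype(int)
    xs = np.linspace(0, W - ww, n_cols).astype(int)
    frames = [image[y:y + wh, x:x + ww] for y in ys for x in xs]
    return np.stack(frames[:n_windows])  # shape: (n_windows, wh, ww)
```

Any two images, regardless of their original sizes, thus map to tensors of identical shape, which is what makes the representation directly consumable by a 3D convolutional model.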
Related papers
- Masked Generative Extractor for Synergistic Representation and 3D Generation of Point Clouds [6.69660410213287]
We propose an innovative framework called Point-MGE to explore the benefits of deeply integrating 3D representation learning and generative learning.
In shape classification, Point-MGE achieved an accuracy of 94.2% (+1.0%) on the ModelNet40 dataset and 92.9% (+5.5%) on the ScanObjectNN dataset.
Experimental results also confirmed that Point-MGE can generate high-quality 3D shapes in both unconditional and conditional settings.
arXiv Detail & Related papers (2024-06-25T07:57:03Z)
- MeshXL: Neural Coordinate Field for Generative 3D Foundation Models [51.1972329762843]
We present a family of generative pre-trained auto-regressive models, which addresses the process of 3D mesh generation with modern large language model approaches.
MeshXL is able to generate high-quality 3D meshes, and can also serve as foundation models for various down-stream applications.
arXiv Detail & Related papers (2024-05-31T14:35:35Z)
- Learning-Based Biharmonic Augmentation for Point Cloud Classification [79.13962913099378]
Biharmonic Augmentation (BA) is a novel and efficient data augmentation technique.
BA diversifies point cloud data by imposing smooth non-rigid deformations on existing 3D structures.
We present AdvTune, an advanced online augmentation system that integrates adversarial training.
arXiv Detail & Related papers (2023-11-10T14:04:49Z)
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- NeRF-GAN Distillation for Efficient 3D-Aware Generation with Convolutions [97.27105725738016]
The integration of Neural Radiance Fields (NeRFs) and generative models, such as Generative Adversarial Networks (GANs), has transformed 3D-aware generation from single-view images.
We propose a simple and effective method, based on re-using the well-disentangled latent space of a pre-trained NeRF-GAN in a pose-conditioned convolutional network to directly generate 3D-consistent images corresponding to the underlying 3D representations.
arXiv Detail & Related papers (2023-03-22T18:59:48Z)
- 3D Generative Model Latent Disentanglement via Local Eigenprojection [13.713373496487012]
We introduce a novel loss function grounded in spectral geometry for different neural-network-based generative models of 3D head and body meshes.
Experimental results show that our local eigenprojection disentangled (LED) models offer improved disentanglement with respect to the state-of-the-art.
arXiv Detail & Related papers (2023-02-24T18:19:49Z)
- NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction [37.357949900603295]
We propose a neural architecture representation model that can be used to estimate attributes holistically.
Experiment results show that our proposed framework can be used to predict the latency and accuracy attributes of both cell architectures and whole deep neural networks.
arXiv Detail & Related papers (2022-11-15T10:15:21Z)
- Scene Synthesis via Uncertainty-Driven Attribute Synchronization [52.31834816911887]
This paper introduces a novel neural scene synthesis approach that can capture diverse feature patterns of 3D scenes.
Our method combines the strength of both neural network-based and conventional scene synthesis approaches.
arXiv Detail & Related papers (2021-08-30T19:45:07Z)
- Model-inspired Deep Learning for Light-Field Microscopy with Application to Neuron Localization [27.247818386065894]
We propose a model-inspired deep learning approach to perform fast and robust 3D localization of sources using light-field microscopy images.
This is achieved by developing a deep network that efficiently solves a convolutional sparse coding problem.
Experiments on localization of mammalian neurons from light-fields show that the proposed approach simultaneously provides enhanced performance, interpretability and efficiency.
arXiv Detail & Related papers (2021-03-10T16:24:47Z)
- Learning Local Neighboring Structure for Robust 3D Shape Representation [143.15904669246697]
Representation learning for 3D meshes is important in many computer vision and graphics applications.
We propose a local structure-aware anisotropic convolutional operation (LSA-Conv)
Our model produces significant improvement in 3D shape reconstruction compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-04-21T13:40:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.