LSTM Autoencoder-based Deep Neural Networks for Barley Genotype-to-Phenotype Prediction
- URL: http://arxiv.org/abs/2407.16709v1
- Date: Sun, 21 Jul 2024 16:07:43 GMT
- Title: LSTM Autoencoder-based Deep Neural Networks for Barley Genotype-to-Phenotype Prediction
- Authors: Guanjin Wang, Junyu Xuan, Penghao Wang, Chengdao Li, Jie Lu
- Abstract summary: We propose a new LSTM autoencoder-based model for barley genotype-to-phenotype prediction, specifically for flowering time and grain yield estimation.
Our model outperformed the other baseline methods, demonstrating its potential in handling complex high-dimensional agricultural datasets.
- Score: 16.99449054451577
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Artificial Intelligence (AI) has emerged as a key driver of precision agriculture, facilitating enhanced crop productivity, optimized resource use, farm sustainability, and informed decision-making. Also, the expansion of genome sequencing technology has greatly increased crop genomic resources, deepening our understanding of genetic variation and enhancing desirable crop traits to optimize performance in various environments. There is increasing interest in using machine learning (ML) and deep learning (DL) algorithms for genotype-to-phenotype prediction due to their excellence in capturing complex interactions within large, high-dimensional datasets. In this work, we propose a new LSTM autoencoder-based model for barley genotype-to-phenotype prediction, specifically for flowering time and grain yield estimation, which could potentially help optimize yields and management practices. Our model outperformed the other baseline methods, demonstrating its potential in handling complex high-dimensional agricultural datasets and enhancing crop phenotype prediction performance.
Related papers
- Automatically Learning Hybrid Digital Twins of Dynamical Systems [56.69628749813084]
Digital Twins (DTs) simulate the states and temporal dynamics of real-world systems.
DTs often struggle to generalize to unseen conditions in data-scarce settings.
In this paper, we propose an evolutionary algorithm (HDTwinGen) to autonomously propose, evaluate, and optimize HDTwins.
arXiv Detail & Related papers (2024-10-31T07:28:22Z)
- Disentangling Genotype and Environment Specific Latent Features for Improved Trait Prediction using a Compositional Autoencoder [1.137896937254823]
This study introduces a compositional autoencoder framework to improve trait prediction in plant breeding and genetics programs.
By disentangling latent features, the CAE provides a powerful tool for precision breeding and genetic research.
arXiv Detail & Related papers (2024-10-25T18:30:27Z)
- Knowledge-Driven Feature Selection and Engineering for Genotype Data with Large Language Models [35.084222907099644]
We develop FREEFORM, Free-flow Reasoning and Ensembling for Enhanced Feature Output and Robust Modeling.
FREEFORM is available as an open-source framework on GitHub: https://github.com/PennShenLab/FREEFORM.
arXiv Detail & Related papers (2024-10-02T17:53:08Z)
- BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments [112.25067497985447]
We introduce BioDiscoveryAgent, an agent that designs new experiments, reasons about their outcomes, and efficiently navigates the hypothesis space to reach desired solutions.
BioDiscoveryAgent can uniquely design new experiments without the need to train a machine learning model.
It achieves an average of 21% improvement in predicting relevant genetic perturbations across six datasets.
arXiv Detail & Related papers (2024-05-27T19:57:17Z)
- Improving Biomedical Entity Linking with Retrieval-enhanced Learning [53.24726622142558]
$k$NN-BioEL provides a BioEL model with the ability to reference similar instances from the entire training corpus as clues for prediction.
We show that $k$NN-BioEL outperforms state-of-the-art baselines on several datasets.
arXiv Detail & Related papers (2023-12-15T14:04:23Z)
- Machine Learning Small Molecule Properties in Drug Discovery [44.62264781248437]
We review a wide range of properties, including binding affinities, solubility, and ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity).
We discuss existing popular descriptors and embeddings, such as chemical fingerprints and graph-based neural networks.
Finally, techniques to provide an understanding of model predictions, especially for critical decision-making in drug discovery, are assessed.
arXiv Detail & Related papers (2023-08-02T22:18:41Z)
- Integrating processed-based models and machine learning for crop yield prediction [1.3107669223114085]
In this work we investigate potato yield prediction using a hybrid meta-modeling approach.
A crop growth model is employed to generate synthetic data for (pre)training a convolutional neural net.
When applied in silico, our meta-modeling approach yields better predictions than a baseline comprising a purely data-driven approach.
arXiv Detail & Related papers (2023-07-25T12:51:25Z)
- Guiding Generative Language Models for Data Augmentation in Few-Shot Text Classification [59.698811329287174]
We leverage GPT-2 for generating artificial training instances in order to improve classification performance.
Our results show that fine-tuning GPT-2 on a handful of labeled instances leads to consistent classification improvements.
arXiv Detail & Related papers (2021-11-17T12:10:03Z)
- Genetically Optimized Prediction of Remaining Useful Life [4.115847582689283]
We implement LSTM and GRU models and compare the obtained results with a proposed genetically trained neural network.
We hope to improve the consistency of the predictions by adding another layer of optimization using Genetic Algorithms.
These models and the proposed architecture are tested on the NASA Turbofan Jet Engine dataset.
arXiv Detail & Related papers (2021-02-17T16:09:23Z)
- Generative Data Augmentation for Commonsense Reasoning [75.26876609249197]
G-DAUG^C is a novel generative data augmentation method that aims to achieve more accurate and robust learning in the low-resource setting.
G-DAUG^C consistently outperforms existing data augmentation methods based on back-translation.
Our analysis demonstrates that G-DAUG^C produces a diverse set of fluent training examples, and that its selection and training approaches are important for performance.
arXiv Detail & Related papers (2020-04-24T06:12:10Z)
- GeneCAI: Genetic Evolution for Acquiring Compact AI [36.04715576228068]
Deep Neural Networks (DNNs) are evolving towards more complex architectures to achieve higher inference accuracy.
Model compression techniques can be leveraged to efficiently deploy such compute-intensive architectures on resource-limited mobile devices.
This paper introduces GeneCAI, a novel optimization method that automatically learns how to tune per-layer compression hyper-parameters.
arXiv Detail & Related papers (2020-04-08T20:56:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.