LSTM Autoencoder-based Deep Neural Networks for Barley Genotype-to-Phenotype Prediction
- URL: http://arxiv.org/abs/2407.16709v1
- Date: Sun, 21 Jul 2024 16:07:43 GMT
- Title: LSTM Autoencoder-based Deep Neural Networks for Barley Genotype-to-Phenotype Prediction
- Authors: Guanjin Wang, Junyu Xuan, Penghao Wang, Chengdao Li, Jie Lu
- Abstract summary: We propose a new LSTM autoencoder-based model for barley genotype-to-phenotype prediction, specifically for flowering time and grain yield estimation.
Our model outperformed the other baseline methods, demonstrating its potential in handling complex high-dimensional agricultural datasets.
- Score: 16.99449054451577
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Artificial Intelligence (AI) has emerged as a key driver of precision agriculture, facilitating enhanced crop productivity, optimized resource use, farm sustainability, and informed decision-making. Also, the expansion of genome sequencing technology has greatly increased crop genomic resources, deepening our understanding of genetic variation and enhancing desirable crop traits to optimize performance in various environments. There is increasing interest in using machine learning (ML) and deep learning (DL) algorithms for genotype-to-phenotype prediction due to their excellence in capturing complex interactions within large, high-dimensional datasets. In this work, we propose a new LSTM autoencoder-based model for barley genotype-to-phenotype prediction, specifically for flowering time and grain yield estimation, which could potentially help optimize yields and management practices. Our model outperformed the other baseline methods, demonstrating its potential in handling complex high-dimensional agricultural datasets and enhancing crop phenotype prediction performance.
Related papers
- GENERator: A Long-Context Generative Genomic Foundation Model [66.46537421135996]
We present a generative genomic foundation model featuring a context length of 98k base pairs (bp) and 1.2B parameters.
The model adheres to the central dogma of molecular biology, accurately generating protein-coding sequences.
It also shows significant promise in sequence optimization, particularly through the prompt-responsive generation of promoter sequences.
arXiv Detail & Related papers (2025-02-11T05:39:49Z)
- Integrating remote sensing data assimilation, deep learning and large language model for interactive wheat breeding yield prediction [6.955215132571773]
This study introduces a hybrid method and tool for crop yield prediction, designed to allow breeders to interactively and accurately predict wheat yield by chatting with a large language model (LLM).
The newly designed data assimilation algorithm is used to assimilate the leaf area index into the WOFOST model. Then, selected outputs from the assimilation process, along with remote sensing results, are used to drive the time-series temporal fusion transformer model for wheat yield prediction.
arXiv Detail & Related papers (2025-01-08T13:14:05Z)
- Enhancing weed detection performance by means of GenAI-based image augmentation [0.0]
This paper investigates a generative AI-based augmentation technique that uses the Stable Diffusion model to produce diverse synthetic images for weed detection models.
Results show substantial improvements in mean Average Precision for YOLO models trained with generative AI-augmented datasets.
arXiv Detail & Related papers (2024-11-27T17:00:34Z)
- Automatically Learning Hybrid Digital Twins of Dynamical Systems [56.69628749813084]
Digital Twins (DTs) simulate the states and temporal dynamics of real-world systems.
DTs often struggle to generalize to unseen conditions in data-scarce settings.
In this paper, we propose an evolutionary algorithm (HDTwinGen) to autonomously propose, evaluate, and optimize HDTwins.
arXiv Detail & Related papers (2024-10-31T07:28:22Z)
- Disentangling Genotype and Environment Specific Latent Features for Improved Trait Prediction using a Compositional Autoencoder [1.137896937254823]
This study introduces a compositional autoencoder framework to improve trait prediction in plant breeding and genetics programs.
By disentangling latent features, the compositional autoencoder (CAE) provides a powerful tool for precision breeding and genetic research.
arXiv Detail & Related papers (2024-10-25T18:30:27Z)
- Knowledge-Driven Feature Selection and Engineering for Genotype Data with Large Language Models [35.084222907099644]
We develop FREEFORM, Free-flow Reasoning and Ensembling for Enhanced Feature Output and Robust Modeling.
FREEFORM is available as an open-source framework on GitHub: https://github.com/PennShenLab/FREEFORM.
arXiv Detail & Related papers (2024-10-02T17:53:08Z)
- Dataset Distillation for Histopathology Image Classification [46.04496989951066]
We introduce a novel dataset distillation algorithm tailored for histopathology image datasets (Histo-DD).
We conduct a comprehensive evaluation of the effectiveness of the proposed algorithm and the generated histopathology samples in both patch-level and slide-level classification tasks.
arXiv Detail & Related papers (2024-08-19T05:53:38Z)
- BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments [112.25067497985447]
We introduce BioDiscoveryAgent, an agent that designs new experiments, reasons about their outcomes, and efficiently navigates the hypothesis space to reach desired solutions.
BioDiscoveryAgent can uniquely design new experiments without the need to train a machine learning model.
It achieves an average of 21% improvement in predicting relevant genetic perturbations across six datasets.
arXiv Detail & Related papers (2024-05-27T19:57:17Z)
- Machine Learning Small Molecule Properties in Drug Discovery [44.62264781248437]
We review a wide range of properties, including binding affinities, solubility, and ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity).
We discuss existing popular descriptors and embeddings, such as chemical fingerprints and graph-based neural networks.
Finally, we assess techniques for understanding model predictions, especially for critical decision-making in drug discovery.
arXiv Detail & Related papers (2023-08-02T22:18:41Z)
- Integrating processed-based models and machine learning for crop yield prediction [1.3107669223114085]
In this work we investigate potato yield prediction using a hybrid meta-modeling approach.
A crop growth model is employed to generate synthetic data for (pre)training a convolutional neural net.
When applied in silico, our meta-modeling approach yields better predictions than a baseline comprising a purely data-driven approach.
arXiv Detail & Related papers (2023-07-25T12:51:25Z)
- Generative Data Augmentation for Commonsense Reasoning [75.26876609249197]
G-DAUG^C is a novel generative data augmentation method that aims to achieve more accurate and robust learning in the low-resource setting.
G-DAUG^C consistently outperforms existing data augmentation methods based on back-translation.
Our analysis demonstrates that G-DAUG^C produces a diverse set of fluent training examples, and that its selection and training approaches are important for performance.
arXiv Detail & Related papers (2020-04-24T06:12:10Z)