Related papers: LSTM Autoencoder-based Deep Neural Networks for Barley Genotype-to-Phenotype Prediction

URL: http://arxiv.org/abs/2407.16709v1
Date: Sun, 21 Jul 2024 16:07:43 GMT
Title: LSTM Autoencoder-based Deep Neural Networks for Barley Genotype-to-Phenotype Prediction
Authors: Guanjin Wang, Junyu Xuan, Penghao Wang, Chengdao Li, Jie Lu
Abstract summary: We propose a new LSTM autoencoder-based model for barley genotype-to-phenotype prediction, specifically for flowering time and grain yield estimation. Our model outperformed the other baseline methods, demonstrating its potential in handling complex high-dimensional agricultural datasets.
Score: 16.99449054451577
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Artificial Intelligence (AI) has emerged as a key driver of precision agriculture, facilitating enhanced crop productivity, optimized resource use, farm sustainability, and informed decision-making. Also, the expansion of genome sequencing technology has greatly increased crop genomic resources, deepening our understanding of genetic variation and enhancing desirable crop traits to optimize performance in various environments. There is increasing interest in using machine learning (ML) and deep learning (DL) algorithms for genotype-to-phenotype prediction due to their excellence in capturing complex interactions within large, high-dimensional datasets. In this work, we propose a new LSTM autoencoder-based model for barley genotype-to-phenotype prediction, specifically for flowering time and grain yield estimation, which could potentially help optimize yields and management practices. Our model outperformed the other baseline methods, demonstrating its potential in handling complex high-dimensional agricultural datasets and enhancing crop phenotype prediction performance.

Related papers

Detecting and Pruning Prominent but Detrimental Neurons in Large Language Models [68.57424628540907]
Large language models (LLMs) often develop learned mechanisms specialized to specific datasets. We introduce a fine-tuning approach designed to enhance generalization by identifying and pruning neurons associated with dataset-specific mechanisms. Our method employs Integrated Gradients to quantify each neuron's influence on high-confidence predictions, pinpointing those that disproportionately contribute to dataset-specific performance.
arXiv Detail & Related papers (2025-07-12T08:10:10Z)
Teaching pathology foundation models to accurately predict gene expression with parameter efficient knowledge transfer [1.5416321520529301]
Efficient Knowledge Adaptation (PEKA) is a novel framework that integrates knowledge distillation and structure alignment losses for cross-modal knowledge transfer. We evaluated PEKA for gene expression prediction using multiple spatial transcriptomics datasets.
arXiv Detail & Related papers (2025-04-09T17:24:41Z)
Exploring Training and Inference Scaling Laws in Generative Retrieval [50.82554729023865]
We investigate how model size, training data scale, and inference-time compute jointly influence generative retrieval performance. Our experiments show that n-gram-based methods demonstrate strong alignment with both training and inference scaling laws. We find that LLaMA models consistently outperform T5 models, suggesting a particular advantage for larger decoder-only models in generative retrieval.
arXiv Detail & Related papers (2025-03-24T17:59:03Z)
GENERator: A Long-Context Generative Genomic Foundation Model [66.46537421135996]
We present GENERator, a generative genomic foundation model featuring a context length of 98k base pairs (bp) and 1.2B parameters. Trained on an expansive dataset comprising 386B bp of DNA, the GENERator demonstrates state-of-the-art performance across both established and newly proposed benchmarks. It also shows significant promise in sequence optimization, particularly through the prompt-responsive generation of enhancer sequences with specific activity profiles.
arXiv Detail & Related papers (2025-02-11T05:39:49Z)
Integrating remote sensing data assimilation, deep learning and large language model for interactive wheat breeding yield prediction [6.955215132571773]
This study introduces a hybrid method and tool for crop yield prediction, designed to allow breeders to interactively and accurately predict wheat yield by chatting with a large language model (LLM). The newly designed data assimilation algorithm is used to assimilate the leaf area index into the WOFOST model. Then, selected outputs from the assimilation process, along with remote sensing results, are used to drive the time-series temporal fusion transformer model for wheat yield prediction.
arXiv Detail & Related papers (2025-01-08T13:14:05Z)
Enhancing weed detection performance by means of GenAI-based image augmentation [0.0]
This paper investigates a generative AI-based augmentation technique that uses the Stable Diffusion model to produce diverse synthetic images for weed detection models. Results show substantial improvements in mean Average Precision for YOLO models trained with generative AI-augmented datasets.
arXiv Detail & Related papers (2024-11-27T17:00:34Z)
Automatically Learning Hybrid Digital Twins of Dynamical Systems [56.69628749813084]
Digital Twins (DTs) simulate the states and temporal dynamics of real-world systems. DTs often struggle to generalize to unseen conditions in data-scarce settings. In this paper, we propose an evolutionary algorithm (HDTwinGen) to autonomously propose, evaluate, and optimize HDTwins.
arXiv Detail & Related papers (2024-10-31T07:28:22Z)
Disentangling Genotype and Environment Specific Latent Features for Improved Trait Prediction using a Compositional Autoencoder [1.137896937254823]
This study introduces a compositional autoencoder (CAE) framework to improve trait prediction in plant breeding and genetics programs. By disentangling latent features, the CAE provides a powerful tool for precision breeding and genetic research.
arXiv Detail & Related papers (2024-10-25T18:30:27Z)
Knowledge-Driven Feature Selection and Engineering for Genotype Data with Large Language Models [35.084222907099644]
We develop FREEFORM, Free-flow Reasoning and Ensembling for Enhanced Feature Output and Robust Modeling. FREEFORM is available as an open-source framework on GitHub: https://github.com/PennShenLab/FREEFORM.
arXiv Detail & Related papers (2024-10-02T17:53:08Z)
BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments [112.25067497985447]
We introduce BioDiscoveryAgent, an agent that designs new experiments, reasons about their outcomes, and efficiently navigates the hypothesis space to reach desired solutions. BioDiscoveryAgent can uniquely design new experiments without the need to train a machine learning model. It achieves an average of 21% improvement in predicting relevant genetic perturbations across six datasets.
arXiv Detail & Related papers (2024-05-27T19:57:17Z)
Improving Biomedical Entity Linking with Retrieval-enhanced Learning [53.24726622142558]
$k$NN-BioEL provides a BioEL model with the ability to reference similar instances from the entire training corpus as clues for prediction. We show that $k$NN-BioEL outperforms state-of-the-art baselines on several datasets.
arXiv Detail & Related papers (2023-12-15T14:04:23Z)
Machine Learning Small Molecule Properties in Drug Discovery [44.62264781248437]
We review a wide range of properties, including binding affinities, solubility, and ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity). We discuss existing popular descriptors and embeddings, such as chemical fingerprints and graph-based neural networks. Finally, techniques to provide an understanding of model predictions, especially for critical decision-making in drug discovery, are assessed.
arXiv Detail & Related papers (2023-08-02T22:18:41Z)
Integrating processed-based models and machine learning for crop yield prediction [1.3107669223114085]
In this work we investigate potato yield prediction using a hybrid meta-modeling approach. A crop growth model is employed to generate synthetic data for (pre)training a convolutional neural net. When applied in silico, our meta-modeling approach yields better predictions than a baseline comprising a purely data-driven approach.
arXiv Detail & Related papers (2023-07-25T12:51:25Z)
Guiding Generative Language Models for Data Augmentation in Few-Shot Text Classification [59.698811329287174]
We leverage GPT-2 for generating artificial training instances in order to improve classification performance. Our results show that fine-tuning GPT-2 in a handful of label instances leads to consistent classification improvements.
arXiv Detail & Related papers (2021-11-17T12:10:03Z)
Genetically Optimized Prediction of Remaining Useful Life [4.115847582689283]
We implement LSTM and GRU models and compare the obtained results with a proposed genetically trained neural network. We hope to improve the consistency of the predictions by adding another layer of optimization using Genetic Algorithms. These models and the proposed architecture are tested on the NASA Turbofan Jet Engine dataset.
arXiv Detail & Related papers (2021-02-17T16:09:23Z)
Generative Data Augmentation for Commonsense Reasoning [75.26876609249197]
G-DAUG^C is a novel generative data augmentation method that aims to achieve more accurate and robust learning in the low-resource setting. G-DAUG^C consistently outperforms existing data augmentation methods based on back-translation. Our analysis demonstrates that G-DAUG^C produces a diverse set of fluent training examples, and that its selection and training approaches are important for performance.
arXiv Detail & Related papers (2020-04-24T06:12:10Z)
GeneCAI: Genetic Evolution for Acquiring Compact AI [36.04715576228068]
Deep Neural Networks (DNNs) are evolving towards more complex architectures to achieve higher inference accuracy. Model compression techniques can be leveraged to efficiently deploy such compute-intensive architectures on resource-limited mobile devices. This paper introduces GeneCAI, a novel optimization method that automatically learns how to tune per-layer compression hyperparameters.
arXiv Detail & Related papers (2020-04-08T20:56:37Z)

This list is automatically generated from the titles and abstracts of the papers on this site.