Optimizing Agricultural Research: A RAG-Based Approach to Mycorrhizal Fungi Information
- URL: http://arxiv.org/abs/2511.14765v1
- Date: Tue, 16 Sep 2025 20:21:55 GMT
- Title: Optimizing Agricultural Research: A RAG-Based Approach to Mycorrhizal Fungi Information
- Authors: Mohammad Usman Altam, Md Imtiaz Habib, Tuan Hoang
- Abstract summary: Retrieval-Augmented Generation (RAG) represents a transformative approach within natural language processing (NLP). We present the design and evaluation of a RAG-enabled system tailored for Mycophyto. The framework underscores the potential of AI-driven knowledge discovery to accelerate agroecological innovation and enhance decision-making in sustainable farming systems.
- Score: 1.2349443032034277
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Retrieval-Augmented Generation (RAG) represents a transformative approach within natural language processing (NLP), combining neural information retrieval with generative language modeling to enhance both contextual accuracy and factual reliability of responses. Unlike conventional Large Language Models (LLMs), which are constrained by static training corpora, RAG-powered systems dynamically integrate domain-specific external knowledge sources, thereby overcoming temporal and disciplinary limitations. In this study, we present the design and evaluation of a RAG-enabled system tailored for Mycophyto, with a focus on advancing agricultural applications related to arbuscular mycorrhizal fungi (AMF). These fungi play a critical role in sustainable agriculture by enhancing nutrient acquisition, improving plant resilience under abiotic and biotic stresses, and contributing to soil health. Our system operationalizes a dual-layered strategy: (i) semantic retrieval and augmentation of domain-specific content from agronomy and biotechnology corpora using vector embeddings, and (ii) structured data extraction to capture predefined experimental metadata such as inoculation methods, spore densities, soil parameters, and yield outcomes. This hybrid approach ensures that generated responses are not only semantically aligned but also supported by structured experimental evidence. To support scalability, embeddings are stored in a high-performance vector database, allowing near real-time retrieval from an evolving literature base. Empirical evaluation demonstrates that the proposed pipeline retrieves and synthesizes highly relevant information regarding AMF interactions with crop systems, such as tomato (Solanum lycopersicum). The framework underscores the potential of AI-driven knowledge discovery to accelerate agroecological innovation and enhance decision-making in sustainable farming systems.
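The dual-layered strategy described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words `embed` function stands in for a learned vector embedding, the in-memory list stands in for a vector database, and the names `retrieve`, `extract_metadata`, `DOCS`, and the spore-density pattern are all hypothetical. The passages in `DOCS` are invented placeholder text, not data from the paper.

```python
import math
import re
from collections import Counter

# Invented placeholder passages (NOT taken from the paper or its corpus).
DOCS = [
    "Tomato plants were inoculated with AMF at 50 spores per g of soil.",
    "Sentinel-2 vegetation indices describe agricultural plots.",
    "Bayesian learning supports digital twins in biomanufacturing.",
]


def embed(text):
    """Toy bag-of-words vector; an assumption standing in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, docs, k=2):
    """Layer (i): rank passages by similarity to the query and keep the top k."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


# Hypothetical pattern for one metadata field; a real system would cover
# inoculation methods, soil parameters, and yield outcomes as well.
SPORE_DENSITY = re.compile(r"(\d+(?:\.\d+)?)\s*spores\s*(?:per|/)\s*g", re.IGNORECASE)


def extract_metadata(text):
    """Layer (ii): pull a predefined structured field out of free text."""
    match = SPORE_DENSITY.search(text)
    return {"spore_density_per_g": float(match.group(1)) if match else None}


if __name__ == "__main__":
    for passage in retrieve("AMF inoculation of tomato", DOCS, k=1):
        # The retrieved passage and its extracted fields would both be
        # passed to the generator as context.
        print(passage, extract_metadata(passage))
```

In a fuller pipeline, the retrieved passages and the extracted fields would be combined into the prompt for the generative model, so that an answer can cite both the text and the structured experimental values.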
Related papers
- AI Co-Scientist for Knowledge Synthesis in Medical Contexts: A Proof of Concept [0.0]
We present an AI for scalable and transparent knowledge synthesis based on explicit formalization of Population, Intervention, Comparator, Outcome, and Study design (PICOS).
The platform integrates relational storage, vector-based semantic retrieval, and a Neo4j knowledge graph.
Results show that PICOS-aware and explainable natural language processing can improve the scalability, transparency, and efficiency of evidence synthesis.
arXiv Detail & Related papers (2026-01-16T23:07:58Z)
- Explainable AI for Diabetic Retinopathy Detection Using Deep Learning with Attention Mechanisms and Fuzzy Logic-Based Interpretability [0.0]
This paper proposes a hybrid deep learning framework recipe for weed detection.
A Generative Adversarial Network (GAN)-based augmentation method was imposed to balance class robustness and better generalize the model.
Experimental results yield superior results with 99.33% accuracy, precision, recall, and F1-score on multi-benchmark datasets.
arXiv Detail & Related papers (2025-11-20T12:17:00Z)
- Hypothesis Hunting with Evolving Networks of Autonomous Scientific Agents [52.50038914857797]
We term this process hypothesis hunting: the cumulative search for insight through sustained exploration across vast and complex hypothesis spaces.
We introduce AScience, a framework modeling discovery as the interaction of agents, networks, and evaluation norms, and implement it as ASCollab.
Experiments show that such social dynamics enable the accumulation of expert-rated results along the diversity-quality-novelty frontier.
arXiv Detail & Related papers (2025-10-08T08:47:07Z)
- RAPTOR-GEN: RApid PosTeriOR GENerator for Bayesian Learning in Biomanufacturing [2.918639959397167]
We introduce RApid PosTeriOR GENerator (RAPTOR-GEN), a mechanism-informed Bayesian learning framework.
RAPTOR-GEN is designed to accelerate intelligent digital twin development from sparse and heterogeneous experimental data.
We develop a fast and robust RAPTOR-GEN algorithm with controllable error.
arXiv Detail & Related papers (2025-09-25T05:20:49Z)
- A Multimodal Conversational Assistant for the Characterization of Agricultural Plots from Geospatial Open Data [0.0]
This study presents an open-source conversational assistant that integrates multimodal retrieval and large language models (LLMs).
The proposed architecture combines orthophotos, Sentinel-2 vegetation indices, and user-provided documents through retrieval-augmented generation (RAG).
Preliminary results show that the system is capable of generating clear, relevant, and context-aware responses to agricultural queries.
arXiv Detail & Related papers (2025-09-22T09:02:53Z)
- CellPainTR: Generalizable Representation Learning for Cross-Dataset Cell Painting Analysis [51.56484100374058]
We introduce CellPainTR, a Transformer-based architecture designed to learn foundational representations of cellular morphology.
Our work represents a significant step towards creating truly foundational models for image-based profiling, enabling more reliable and scalable cross-study biological analysis.
arXiv Detail & Related papers (2025-09-02T03:30:07Z)
- HySemRAG: A Hybrid Semantic Retrieval-Augmented Generation Framework for Automated Literature Synthesis and Methodological Gap Analysis [55.2480439325792]
HySemRAG is a framework that combines Extract, Transform, Load (ETL) pipelines with Retrieval-Augmented Generation (RAG).
The system addresses limitations in existing RAG architectures through a multi-layered approach.
arXiv Detail & Related papers (2025-08-01T20:30:42Z)
- PlantDeBERTa: An Open Source Language Model for Plant Science [0.0]
We present PlantDeBERTa, a high-performance, open-source language model for extracting structured knowledge from plant stress-response literature.
Our methodology combines transformer-based modeling with rule-enhanced linguistic post-processing and ontology-grounded entity normalization.
Our model is publicly released to promote transparency and accelerate cross-disciplinary innovation in computational plant science.
arXiv Detail & Related papers (2025-06-10T15:24:03Z)
- GENERator: A Long-Context Generative Genomic Foundation Model [66.46537421135996]
We present GENERator, a generative genomic foundation model featuring a context length of 98k base pairs (bp) and 1.2B parameters.
Trained on an expansive dataset comprising 386B bp of DNA, the GENERator demonstrates state-of-the-art performance across both established and newly proposed benchmarks.
It also shows significant promise in sequence optimization, particularly through the prompt-responsive generation of enhancer sequences with specific activity profiles.
arXiv Detail & Related papers (2025-02-11T05:39:49Z)
- SeRTS: Self-Rewarding Tree Search for Biomedical Retrieval-Augmented Generation [50.26966969163348]
Large Language Models (LLMs) have shown great potential in the biomedical domain with the advancement of retrieval-augmented generation (RAG).
Existing retrieval-augmented approaches face challenges in addressing diverse queries and documents, particularly for medical knowledge queries.
We propose Self-Rewarding Tree Search (SeRTS) based on Monte Carlo Tree Search (MCTS) and a self-rewarding paradigm.
arXiv Detail & Related papers (2024-06-17T06:48:31Z)
- Semantic Image Segmentation with Deep Learning for Vine Leaf Phenotyping [59.0626764544669]
In this study, we use Deep Learning methods to semantically segment images of grapevine leaves in order to develop an automated object detection system for leaf phenotyping.
Our work contributes to plant lifecycle monitoring through which dynamic traits such as growth and development can be captured and quantified.
arXiv Detail & Related papers (2022-10-24T14:37:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.