Predicting Mycotoxin Contamination in Irish Oats Using Deep and Transfer Learning
- URL: http://arxiv.org/abs/2512.22243v1
- Date: Tue, 23 Dec 2025 20:08:50 GMT
- Title: Predicting Mycotoxin Contamination in Irish Oats Using Deep and Transfer Learning
- Authors: Alan Inglis, Fiona Doohan, Subramani Natarajan, Breige McNulty, Chris Elliott, Anne Nugent, Julie Meneely, Brett Greer, Stephen Kildea, Diana Bucur, Martin Danaher, Melissa Di Rocco, Lisa Black, Adam Gauley, Naoise McKenna, Andrew Parnell
- Abstract summary: Mycotoxin contamination poses a significant risk to cereal crop quality, food safety, and agricultural productivity. This study investigates the use of neural networks and transfer learning models to predict mycotoxin contamination in Irish oat crops as a multi-response prediction task.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Mycotoxin contamination poses a significant risk to cereal crop quality, food safety, and agricultural productivity. Accurate prediction of mycotoxin levels can support early intervention strategies and reduce economic losses. This study investigates the use of neural networks and transfer learning models to predict mycotoxin contamination in Irish oat crops as a multi-response prediction task. Our dataset comprises oat samples collected in Ireland, containing a mix of environmental, agronomic, and geographical predictors. Five modelling approaches were evaluated: a baseline multilayer perceptron (MLP), an MLP with pre-training, and three transfer learning models: TabPFN, TabNet, and FT-Transformer. Model performance was evaluated using regression (RMSE, $R^2$) and classification (AUC, F1) metrics, with results reported per toxin and on average. Additionally, permutation-based variable importance analysis was conducted to identify the most influential predictors across both prediction tasks. The transfer learning approach TabPFN achieved the best overall performance, followed by the baseline MLP. Our variable importance analysis revealed that weather history patterns in the 90-day pre-harvest period were the most important predictors, alongside seed moisture content.
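The abstract mentions permutation-based variable importance scored against per-toxin and averaged metrics. The sketch below illustrates that general idea only; the abstract does not specify the authors' exact procedure, so the function names, the choice of mean RMSE as the score, the toy model, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def rmse_per_response(Y_true, Y_pred):
    """Return one RMSE per response column (e.g. one per toxin)."""
    n_rows, n_resp = len(Y_true), len(Y_true[0])
    return [
        math.sqrt(
            sum((Y_true[i][k] - Y_pred[i][k]) ** 2 for i in range(n_rows)) / n_rows
        )
        for k in range(n_resp)
    ]


def mean_rmse(predict, X, Y):
    """Average the per-response RMSE values into a single score."""
    scores = rmse_per_response(Y, predict(X))
    return sum(scores) / len(scores)


def permutation_importance(predict, X, Y, n_repeats=20, seed=0):
    """Mean increase in average RMSE when each predictor column is shuffled."""
    rng = random.Random(seed)
    baseline = mean_rmse(predict, X, Y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving every other predictor untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mean_rmse(predict, X_perm, Y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_predict(X):
    """Hypothetical stand-in model: both responses depend on predictor 0 only."""
    return [[2.0 * row[0], -1.0 * row[0]] for row in X]


# Synthetic data (not the paper's dataset): 200 rows, 3 predictors, 2 responses.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
Y = toy_predict(X)

importances = permutation_importance(toy_predict, X, Y)
```

In this toy setup only predictor 0 should receive a positive importance, because shuffling the other two columns leaves the predictions unchanged. Any fitted model with a batch `predict` function could be substituted for `toy_predict`.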
Related papers
- Investigating the Impact of Histopathological Foundation Models on Regressive Prediction of Homologous Recombination Deficiency [52.50039435394964]
We systematically evaluate foundation models for regression-based tasks. We extract patch-level features from whole slide images (WSI) using five state-of-the-art foundation models. Models are trained to predict continuous HRD scores based on these extracted features across breast, endometrial, and lung cancer cohorts.
arXiv Detail & Related papers (2026-01-29T14:06:50Z)
- FrogDeepSDM: Improving Frog Counting and Occurrence Prediction Using Multimodal Data and Pseudo-Absence Imputation [0.9537146822132906]
Species Distribution Modelling (SDM) helps predict species presence across large regions. In this study, we enhance SDM accuracy for frogs (Anura) by applying deep learning and data imputation techniques. Experiments show that data balancing significantly improved model performance, reducing the Mean Absolute Error (MAE) from 189 to 29 in frog counting tasks.
arXiv Detail & Related papers (2025-10-22T07:09:36Z)
- A Generative Framework for Causal Estimation via Importance-Weighted Diffusion Distillation [55.53426007439564]
Estimating individualized treatment effects from observational data is a central challenge in causal inference. Inverse probability weighting (IPW) is a well-established solution to this problem, but its integration into modern deep learning frameworks remains limited. We propose Importance-Weighted Diffusion Distillation (IWDD), a novel generative framework that combines the pretraining of diffusion models with importance-weighted score distillation.
arXiv Detail & Related papers (2025-05-16T17:00:52Z)
- A multi-locus predictiveness curve and its summary assessment for genetic risk prediction [5.050463389414008]
We propose a multi-marker predictiveness curve and a non-parametric method to construct the curve for case-control studies. We also demonstrate the connections of the predictiveness curve with the ROC curve and the Lorenz curve. We conducted a real data analysis, using the predictiveness curve and predictiveness U to evaluate a risk prediction model for Nicotine Dependence.
arXiv Detail & Related papers (2025-03-28T15:49:39Z)
- Crop Yield Time-Series Data Prediction Based on Multiple Hybrid Machine Learning Models [6.10631040784366]
This study focuses on crop yield Time-Series Data prediction. Considering the crucial significance of agriculture in the global economy and social stability, this research uses a dataset containing multiple crops, multiple regions, and data over many years. Multiple hybrid machine learning models such as Linear Regression, Random Forest, Gradient Boost, XGBoost, KNN, Decision Tree, and Bagging Regressor are adopted for yield prediction.
arXiv Detail & Related papers (2025-01-21T23:41:33Z)
- Scaling Laws for Predicting Downstream Performance in LLMs [75.28559015477137]
This work focuses on the pre-training loss as a more computation-efficient metric for performance estimation. We present FLP-M, a fundamental approach for performance prediction that addresses the practical need to integrate datasets from multiple sources during pre-training.
arXiv Detail & Related papers (2024-10-11T04:57:48Z)
- Using Pre-training and Interaction Modeling for ancestry-specific disease prediction in UK Biobank [69.90493129893112]
Recent genome-wide association studies (GWAS) have uncovered the genetic basis of complex traits, but show an under-representation of non-European descent individuals.
Here, we assess whether we can improve disease prediction across diverse ancestries using multiomic data.
arXiv Detail & Related papers (2024-04-26T16:39:50Z)
- Long-term drought prediction using deep neural networks based on geospatial weather data [75.38539438000072]
High-quality drought forecasting up to a year in advance is critical for agriculture planning and insurance.
We tackle drought data by introducing a systematic end-to-end approach.
Key findings are the exceptional performance of a Transformer model, EarthFormer, in making accurate short-term (up to six months) forecasts.
arXiv Detail & Related papers (2023-09-12T13:28:06Z)
- Comparative Analysis of Machine Learning Approaches to Analyze and Predict the Covid-19 Outbreak [10.307715136465056]
We present a comparative analysis of various machine learning (ML) approaches in predicting the COVID-19 outbreak in the epidemiological domain.
The results reveal the advantages of ML algorithms for supporting decision making of evolving short term policies.
arXiv Detail & Related papers (2021-02-11T11:57:33Z)
- STELAR: Spatio-temporal Tensor Factorization with Latent Epidemiological Regularization [76.57716281104938]
We develop a tensor method to predict the evolution of epidemic trends for many regions simultaneously.
STELAR enables long-term prediction by incorporating latent temporal regularization through a system of discrete-time difference equations.
We conduct experiments using both county- and state-level COVID-19 data and show that our model can identify interesting latent patterns of the epidemic.
arXiv Detail & Related papers (2020-12-08T21:21:47Z)
- Coupling Machine Learning and Crop Modeling Improves Crop Yield Prediction in the US Corn Belt [2.580765958706854]
This study investigates whether coupling crop modeling and machine learning (ML) improves corn yield predictions in the US Corn Belt.
The main objectives are to explore whether a hybrid approach (crop modeling + ML) would result in better predictions, and determine the features from the crop modeling that are most effective to be integrated with ML for corn yield prediction.
arXiv Detail & Related papers (2020-07-28T16:22:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.