Identifying Trustworthiness Challenges in Deep Learning Models for Continental-Scale Water Quality Prediction
- URL: http://arxiv.org/abs/2503.09947v3
- Date: Sat, 25 Oct 2025 01:57:51 GMT
- Title: Identifying Trustworthiness Challenges in Deep Learning Models for Continental-Scale Water Quality Prediction
- Authors: Xiaobo Xia, Xiaofeng Liu, Jiale Liu, Kuai Fang, Lu Lu, Samet Oymak, William S. Currie, Tongliang Liu
- Abstract summary: Water quality is foundational to environmental sustainability, ecosystem resilience, and public health. Deep learning offers transformative potential for large-scale water quality prediction and scientific insights generation. Their widespread adoption in high-stakes operational decision-making, such as pollution mitigation and equitable resource allocation, is prevented by unresolved trustworthiness challenges.
- Score: 69.38041171537573
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Water quality is foundational to environmental sustainability, ecosystem resilience, and public health. Deep learning offers transformative potential for large-scale water quality prediction and scientific insights generation. However, their widespread adoption in high-stakes operational decision-making, such as pollution mitigation and equitable resource allocation, is prevented by unresolved trustworthiness challenges, including performance disparity, robustness, uncertainty, interpretability, generalizability, and reproducibility. In this work, we present a multi-dimensional, quantitative evaluation of trustworthiness benchmarking three state-of-the-art deep learning architectures: recurrent (LSTM), operator-learning (DeepONet), and transformer-based (Informer), trained on 37 years of data from 482 U.S. basins to predict 20 water quality variables. Our investigation reveals systematic performance disparities tied to process complexity, data availability, and basin heterogeneity. Management-critical variables remain the least predictable and most uncertain. Robustness tests reveal pronounced sensitivity to outliers and corrupted targets; notably, the architecture with the strongest baseline performance (LSTM) proves most vulnerable under data corruption. Attribution analyses align for simple variables but diverge for nutrients, underscoring the need for multi-method interpretability. Spatial generalization to ungauged basins remains poor across all models. This work serves as a timely call to action for advancing trustworthy data-driven methods for water resources management and provides a pathway to offering critical insights for researchers, decision-makers, and practitioners seeking to leverage artificial intelligence (AI) responsibly in environmental management.
Related papers
- Interpretable Hybrid Deep Q-Learning Framework for IoT-Based Food Spoilage Prediction with Synthetic Data Generation and Hardware Validation [0.5417521241272645]
The need for an intelligent, real-time spoilage prediction system has become critical in modern IoT-driven food supply chains. We propose a hybrid reinforcement learning framework integrating Long Short-Term Memory (LSTM) and Recurrent Neural Networks (RNN) for enhanced spoilage prediction.
arXiv Detail & Related papers (2025-12-22T12:59:48Z) - Self-organizing maps for water quality assessment in reservoirs and lakes: A systematic literature review [0.0]
Self-Organizing Map (SOM), an unsupervised AI technique, is applied to water quality assessment. SOM handles multidimensional data and uncovers hidden patterns to support effective water management. This review highlights SOM's versatility in ecological assessments, trophic state classification, algal bloom monitoring, and catchment area impact evaluations.
arXiv Detail & Related papers (2025-12-20T18:48:33Z) - From Physics to Machine Learning and Back: Part II - Learning and Observational Bias in PHM [52.64097278841485]
Review examines how incorporating learning and observational biases through physics-informed modeling and data strategies can guide models toward physically consistent and reliable predictions. Fast adaptation methods including meta-learning and few-shot learning are reviewed alongside domain generalization techniques.
arXiv Detail & Related papers (2025-09-25T14:15:43Z) - AI in Agriculture: A Survey of Deep Learning Techniques for Crops, Fisheries and Livestock [77.95897723270453]
Crops, fisheries and livestock form the backbone of global food production, essential to feed the ever-growing global population. Addressing these issues requires efficient, accurate, and scalable technological solutions, highlighting the importance of artificial intelligence (AI). This survey presents a systematic and thorough review of more than 200 research works covering conventional machine learning approaches, advanced deep learning techniques, and recent vision-language foundation models.
arXiv Detail & Related papers (2025-07-29T17:59:48Z) - Geospatial Foundation Models to Enable Progress on Sustainable Development Goals [18.086843224361644]
Foundation Models (FMs) are large-scale, pre-trained AI systems that have revolutionized natural language processing and computer vision. This study provides a rigorous, interdisciplinary assessment of geospatial FMs and offers critical insights into their role in attaining sustainability goals.
arXiv Detail & Related papers (2025-05-30T12:36:38Z) - Divide-Then-Align: Honest Alignment based on the Knowledge Boundary of RAG [51.120170062795566]
We propose Divide-Then-Align (DTA) to endow RAG systems with the ability to respond with "I don't know" when the query is out of the knowledge boundary. DTA balances accuracy with appropriate abstention, enhancing the reliability and trustworthiness of retrieval-augmented systems.
arXiv Detail & Related papers (2025-05-27T08:21:21Z) - REVAL: A Comprehension Evaluation on Reliability and Values of Large Vision-Language Models [59.445672459851274]
REVAL is a comprehensive benchmark designed to evaluate the REliability and VALue of Large Vision-Language Models.
REVAL encompasses over 144K image-text Visual Question Answering (VQA) samples, structured into two primary sections: Reliability and Values.
We evaluate 26 models, including mainstream open-source LVLMs and prominent closed-source models like GPT-4o and Gemini-1.5-Pro.
arXiv Detail & Related papers (2025-03-20T07:54:35Z) - Integrating Boosted learning with Differential Evolution (DE) Optimizer: A Prediction of Groundwater Quality Risk Assessment in Odisha [0.0]
This study developed a machine learning-based predictive model to evaluate the Groundwater Quality Index (GWQI).
This was achieved with a hybrid machine learning model, LCBoost Fusion.
arXiv Detail & Related papers (2025-02-25T07:47:41Z) - Towards Robust Stability Prediction in Smart Grids: GAN-based Approach under Data Constraints and Adversarial Challenges [53.2306792009435]
This paper introduces a novel framework for detecting instability in smart grids using only stable data. It achieves up to 98.1% accuracy in predicting grid stability and 98.9% in detecting adversarial attacks. Implemented on a single-board computer, it enables real-time decision-making with an average response time of under 7ms.
arXiv Detail & Related papers (2025-01-27T20:48:25Z) - Know Where You're Uncertain When Planning with Multimodal Foundation Models: A Formal Framework [54.40508478482667]
We present a comprehensive framework to disentangle, quantify, and mitigate uncertainty in perception and plan generation.
We propose methods tailored to the unique properties of perception and decision-making.
We show that our uncertainty disentanglement framework reduces variability by up to 40% and enhances task success rates by 5% compared to baselines.
arXiv Detail & Related papers (2024-11-03T17:32:00Z) - Water quality polluted by total suspended solids classified within an Artificial Neural Network approach [0.0]
Water pollution by suspended solids poses significant environmental and health risks.
To address these challenges, we developed a model that leverages a comprehensive dataset of water quality from total suspended solids.
A convolutional neural network was trained under a transfer learning approach using data corresponding to different total suspended solids concentrations.
arXiv Detail & Related papers (2024-10-19T01:33:08Z) - Cooperative Resilience in Artificial Intelligence Multiagent Systems [2.0608564715600273]
This paper proposes a clear definition of 'cooperative resilience' and a methodology for its quantitative measurement.
The results highlight the crucial role of resilience metrics in analyzing how the collective system prepares for, resists, recovers from, sustains well-being, and transforms in the face of disruptions.
arXiv Detail & Related papers (2024-09-20T03:28:48Z) - EARBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents [53.717918131568936]
Embodied artificial intelligence (EAI) integrates advanced AI models into physical entities for real-world interaction. Foundation models as the "brain" of EAI agents for high-level task planning have shown promising results. However, the deployment of these agents in physical environments presents significant safety challenges. This study introduces EARBench, a novel framework for automated physical risk assessment in EAI scenarios.
arXiv Detail & Related papers (2024-08-08T13:19:37Z) - Machine Learning for Urban Air Quality Analytics: A Survey [27.96085346957208]
Air pollution poses an urgent global concern with far-reaching consequences.
In this article, we present a comprehensive survey of Machine Learning-based air quality analytics.
arXiv Detail & Related papers (2023-10-14T17:03:29Z) - Beyond Tides and Time: Machine Learning Triumph in Water Quality [0.0]
This study aims to establish a robust predictive pipeline to both data science experts and those without domain specific knowledge.
arXiv Detail & Related papers (2023-09-29T03:33:53Z) - The RoboDepth Challenge: Methods and Advancements Towards Robust Depth Estimation [97.63185634482552]
We summarize the winning solutions from the RoboDepth Challenge.
The challenge was designed to facilitate and advance robust OoD depth estimation.
We hope this challenge could lay a solid foundation for future research on robust and reliable depth estimation.
arXiv Detail & Related papers (2023-07-27T17:59:56Z) - On the Opportunities and Risks of Foundation Models [256.61956234436553]
We call these models foundation models to underscore their critically central yet incomplete character.
This report provides a thorough account of the opportunities and risks of foundation models.
To tackle these questions, we believe much of the critical research on foundation models will require deep interdisciplinary collaboration.
arXiv Detail & Related papers (2021-08-16T17:50:08Z) - Trustworthy AI [75.99046162669997]
Brittleness to minor adversarial changes in the input data, limited ability to explain decisions, and bias in training data are some of the most prominent limitations.
We propose the tutorial on Trustworthy AI to address six critical issues in enhancing user and public trust in AI systems.
arXiv Detail & Related papers (2020-11-02T20:04:18Z) - Predictive Analytics for Water Asset Management: Machine Learning and Survival Analysis [55.41644538483948]
We study a statistical and machine learning framework for the prediction of water pipe failures.
We use a dataset containing the failure records of all pipes within the water distribution network in Barcelona, Spain.
The results shed light on the effect of important risk factors, such as pipe geometry, age, material, and soil cover, among others.
arXiv Detail & Related papers (2020-07-02T19:08:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.