FOCUS on Contamination: A Geospatial Deep Learning Framework with a Noise-Aware Loss for Surface Water PFAS Prediction
- URL: http://arxiv.org/abs/2502.14894v1
- Date: Mon, 17 Feb 2025 16:57:10 GMT
- Title: FOCUS on Contamination: A Geospatial Deep Learning Framework with a Noise-Aware Loss for Surface Water PFAS Prediction
- Authors: Jowaria Khan, Alexa Friedman, Sydney Evans, Runzi Wang, Kaley Beins, David Andrews, Elizabeth Bondi-Kelly
- Abstract summary: FOCUS is a deep learning framework with a label noise-aware loss function to predict PFAS contamination in surface water over large regions. We integrate hydrological flow data, land cover information, and proximity to known PFAS sources to improve prediction accuracy. Results highlight our framework's potential for scalable PFAS monitoring.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Per- and polyfluoroalkyl substances (PFAS), chemicals found in products like non-stick cookware, are unfortunately persistent environmental pollutants with severe health risks. Accurately mapping PFAS contamination is crucial for guiding targeted remediation efforts and protecting public and environmental health, yet detection across large regions remains challenging due to the cost of testing and the difficulty of simulating their spread. In this work, we introduce FOCUS, a geospatial deep learning framework with a label noise-aware loss function, to predict PFAS contamination in surface water over large regions. By integrating hydrological flow data, land cover information, and proximity to known PFAS sources, our approach leverages both spatial and environmental context to improve prediction accuracy. We evaluate the performance of our approach through extensive ablation studies and comparative analyses against baselines like sparse segmentation, as well as existing scientific methods, including Kriging and pollutant transport simulations. Results highlight our framework's potential for scalable PFAS monitoring.
Related papers
- Dense Air Pollution Estimation from Sparse in-situ Measurements and Satellite Data [6.206127662604578]
This paper addresses the challenge of estimating ambient Nitrogen Dioxide (NO$_2$) concentrations, a key issue in public health and environmental policy.
Existing methods for satellite-based air pollution estimation model the relationship between satellite and in-situ measurements at select point locations.
Motivated by these limitations, this study introduces a novel dense estimation technique.
By utilizing a uniformly random offset sampling strategy, our method disperses the ground truth data pixel location evenly across a larger patch.
At inference, the dense estimation method can then generate a grid of estimates in a single step, significantly reducing the computational resources required to provide estimates for larger areas.
arXiv Detail & Related papers (2025-04-23T18:38:16Z)
- Identifying Trustworthiness Challenges in Deep Learning Models for Continental-Scale Water Quality Prediction [64.4881275941927]
We present the first comprehensive evaluation of trustworthiness in a continental-scale multi-task LSTM model.
Our investigation uncovers systematic patterns of model performance disparities linked to basin characteristics.
This work serves as a timely call to action for advancing trustworthy data-driven methods for water resources management.
arXiv Detail & Related papers (2025-03-13T01:50:50Z)
- Application of Analytical Hierarchical Process and its Variants on Remote Sensing Datasets [0.16532031170453743]
The river Ganga is one of the Earth's most critically important river basins. It faces significant pollution challenges, making it crucial to evaluate its vulnerability for effective and targeted remediation efforts.
arXiv Detail & Related papers (2024-12-01T11:24:03Z)
- Machine Learning Algorithms to Assess Site Closure Time Frames for Soil and Groundwater Contamination [0.0]
This study expands the capabilities of PyLEnM, a Python package designed for long-term environmental monitoring.
We introduce methods to estimate the timeframe required for contaminants like Sr-90 and I-129 to reach regulatory safety standards.
Our methods are illustrated using data from the Savannah River Site (SRS) F-Area, where preliminary findings reveal a notable downward trend in contaminant levels.
arXiv Detail & Related papers (2024-11-15T14:21:32Z)
- Combining Observational Data and Language for Species Range Estimation [63.65684199946094]
We propose a novel approach combining millions of citizen science species observations with textual descriptions from Wikipedia. Our framework maps locations, species, and text descriptions into a common space, enabling zero-shot range estimation from textual descriptions. Our approach also acts as a strong prior when combined with observational data, resulting in more accurate range estimation with less data.
arXiv Detail & Related papers (2024-10-14T17:22:55Z)
- Investigating Data Contamination for Pre-training Language Models [46.335755305642564]
We explore the impact of data contamination at the pre-training stage by pre-training a series of GPT-2 models.
We highlight the effect of both text contamination (i.e., input text of the evaluation samples) and ground-truth contamination (i.e., the prompts asked on the input and the desired outputs) from evaluation data.
arXiv Detail & Related papers (2024-01-11T17:24:49Z)
- Monitoring water contaminants in coastal areas through ML algorithms leveraging atmospherically corrected Sentinel-2 data [3.155658695525581]
This study pioneers a novel approach to monitor the Turbidity contaminant, integrating CatBoost Machine Learning (ML) with high-resolution data from Sentinel-2 Level-2A.
Traditional methods are labor-intensive while CatBoost offers an efficient solution, excelling in predictive accuracy.
Leveraging atmospherically corrected Sentinel-2 data through the Google Earth Engine (GEE), our study contributes to scalable and precise Turbidity monitoring.
arXiv Detail & Related papers (2024-01-08T10:20:34Z)
- FLOGA: A machine learning ready dataset, a benchmark and a novel deep learning model for burnt area mapping with Sentinel-2 [41.28284355136163]
Wildfires pose significant threats to human and animal lives, ecosystems, and socio-economic stability.
In this work, we create and introduce a machine-learning ready dataset we name FLOGA (Forest wiLdfire Observations for the Greek Area).
This dataset is unique as it comprises satellite imagery acquired before and after a wildfire event.
We use FLOGA to provide a thorough comparison of multiple Machine Learning and Deep Learning algorithms for the automatic extraction of burnt areas.
arXiv Detail & Related papers (2023-11-06T18:42:05Z)
- Efficient Real-time Smoke Filtration with 3D LiDAR for Search and Rescue with Autonomous Heterogeneous Robotic Systems [56.838297900091426]
Smoke and dust affect the performance of any mobile robotic platform due to their reliance on onboard perception systems.
This paper proposes a novel modular computation filtration pipeline based on intensity and spatial information.
arXiv Detail & Related papers (2023-08-14T16:48:57Z)
- Spectral Analysis of Marine Debris in Simulated and Observed Sentinel-2/MSI Images using Unsupervised Classification [0.0]
This study uses Radiative Transfer Model (RTM) simulated data and data from the Multispectral Instrument (MSI) of the Sentinel-2 mission in combination with machine learning algorithms.
The results indicate that the spectral behavior of pollutants is influenced by factors such as the type of polymer and pixel coverage percentage.
These insights can guide future research in remote sensing applications for detecting marine plastic pollution.
arXiv Detail & Related papers (2023-06-26T18:46:47Z)
- Multimodal Dataset from Harsh Sub-Terranean Environment with Aerosol Particles for Frontier Exploration [55.41644538483948]
This paper introduces a multimodal dataset from the harsh and unstructured underground environment with aerosol particles.
It contains synchronized raw data measurements from all onboard sensors in Robot Operating System (ROS) format.
The focus of this paper is not only to capture both temporal and spatial data diversities but also to present the impact of harsh conditions on captured data.
arXiv Detail & Related papers (2023-04-27T20:21:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.