DiTEC-WDN: A Large-Scale Dataset of Hydraulic Scenarios across Multiple Water Distribution Networks
- URL: http://arxiv.org/abs/2503.17167v2
- Date: Mon, 24 Mar 2025 14:40:40 GMT
- Title: DiTEC-WDN: A Large-Scale Dataset of Hydraulic Scenarios across Multiple Water Distribution Networks
- Authors: Huy Truong, Andrés Tello, Alexander Lazovik, Victoria Degeler
- Abstract summary: This dataset comprises 36,000 unique scenarios simulated over either short-term (24 hours) or long-term (1 year) periods. DiTEC-WDN can support a variety of machine-learning tasks, including graph-level, node-level, and link-level regression, as well as time-series forecasting. This contribution, released under a public license, encourages open scientific research in the critical water sector, eliminates the risk of exposing sensitive data, and fulfills the need for a large-scale water distribution network benchmark for study comparisons and scenario analysis.
- Score: 41.94295877935867
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Privacy restrictions hinder the sharing of real-world Water Distribution Network (WDN) models, limiting the application of emerging data-driven machine learning, which typically requires extensive observations. To address this challenge, we propose the dataset DiTEC-WDN that comprises 36,000 unique scenarios simulated over either short-term (24 hours) or long-term (1 year) periods. We constructed this dataset using an automated pipeline that optimizes crucial parameters (e.g., pressure, flow rate, and demand patterns), facilitates large-scale simulations, and records discrete, synthetic but hydraulically realistic states under standard conditions via rule validation and post-hoc analysis. With a total of 228 million generated graph-based states, DiTEC-WDN can support a variety of machine-learning tasks, including graph-level, node-level, and link-level regression, as well as time-series forecasting. This contribution, released under a public license, encourages open scientific research in the critical water sector, eliminates the risk of exposing sensitive data, and fulfills the need for a large-scale water distribution network benchmark for study comparisons and scenario analysis.
Related papers
- GeoFUSE: A High-Efficiency Surrogate Model for Seawater Intrusion Prediction and Uncertainty Reduction [0.10923877073891446]
Seawater intrusion into coastal aquifers poses a significant threat to groundwater resources.
We develop GeoFUSE, a novel deep-learning-based surrogate framework.
We apply GeoFUSE to a 2D cross-section of the Beaver Creek tidal stream-floodplain system in Washington State.
arXiv Detail & Related papers (2024-10-26T08:10:32Z) - SEN12-WATER: A New Dataset for Hydrological Applications and its Benchmarking [40.996860106131244]
Climate change and increasing droughts pose significant challenges to water resource management around the world.
We present a new dataset, SEN12-WATER, along with a benchmark using an end-to-end Deep Learning framework for proactive drought-related analysis.
arXiv Detail & Related papers (2024-09-25T16:50:59Z) - Physics-Informed Graph Neural Networks for Water Distribution Systems [3.9675504428227457]
Water distribution systems (WDS) are an integral part of critical infrastructure which is pivotal to urban development.
We propose a physics-informed deep learning (DL) model for hydraulic state estimation in WDS.
Our model uses hydraulic principles to infer two additional hydraulic state features in the process of reconstructing the available ground truth feature.
arXiv Detail & Related papers (2024-03-27T13:51:26Z) - TransGlow: Attention-augmented Transduction model based on Graph Neural Networks for Water Flow Forecasting [4.915744683251151]
Hydrometric prediction of water quantity is useful for a variety of applications, including water management, flood forecasting, and flood control.
We propose a temporal forecasting model that augments the hidden state in a Graph Convolution Recurrent Neural Network (GCRN) encoder-decoder.
We present a new benchmark dataset of water flow from a network of Canadian stations on rivers, streams, and lakes.
arXiv Detail & Related papers (2023-12-10T18:23:40Z) - Graph Neural Networks for Pressure Estimation in Water Distribution Systems [44.99833362998488]
Pressure and flow estimation in Water Distribution Networks (WDN) allows water management companies to optimize their control operations.
We combine physics-based modeling and Graph Neural Networks (GNN), a data-driven approach, to address the pressure estimation problem.
Our GNN-based model estimates the pressure of a large-scale WDN in The Netherlands with a MAE of 1.94 mH$_2$O and a MAPE of 7%.
arXiv Detail & Related papers (2023-11-17T15:30:12Z) - Long-term drought prediction using deep neural networks based on geospatial weather data [75.38539438000072]
High-quality drought forecasting up to a year in advance is critical for agriculture planning and insurance.
We tackle drought data with a systematic end-to-end approach.
Key findings are the exceptional performance of a Transformer model, EarthFormer, in making accurate short-term (up to six months) forecasts.
arXiv Detail & Related papers (2023-09-12T13:28:06Z) - LargeST: A Benchmark Dataset for Large-Scale Traffic Forecasting [65.71129509623587]
Road traffic forecasting plays a critical role in smart city initiatives and has experienced significant advancements thanks to the power of deep learning.
However, the promising results achieved on current public datasets may not be applicable to practical scenarios.
We introduce the LargeST benchmark dataset, which includes a total of 8,600 sensors in California with a 5-year time coverage.
arXiv Detail & Related papers (2023-06-14T05:48:36Z) - Wild Face Anti-Spoofing Challenge 2023: Benchmark and Results [73.98594459933008]
Face anti-spoofing (FAS) is an essential mechanism for safeguarding the integrity of automated face recognition systems.
This limitation can be attributed to the scarcity and lack of diversity in publicly available FAS datasets.
We introduce the Wild Face Anti-Spoofing dataset, a large-scale, diverse FAS dataset collected in unconstrained settings.
arXiv Detail & Related papers (2023-04-12T10:29:42Z) - An evaluation of deep learning models for predicting water depth evolution in urban floods [59.31940764426359]
We compare different deep learning models for prediction of water depth at high spatial resolution.
Deep learning models are trained to reproduce the data simulated by the CADDIES cellular-automata flood model.
Our results show that the deep learning models generally yield lower errors than the other methods.
arXiv Detail & Related papers (2023-02-20T16:08:54Z) - A Bayesian Generative Adversarial Network (GAN) to Generate Synthetic Time-Series Data, Application in Combined Sewer Flow Prediction [3.3139597764446607]
In machine learning, generative models are a class of methods capable of learning data distribution to generate artificial data.
In this study, we developed a GAN model to generate synthetic time series to balance our limited recorded time series data.
The aim is to predict the flow using precipitation data and examine the impact of data augmentation using synthetic data in model performance.
arXiv Detail & Related papers (2023-01-31T16:12:26Z) - Learning Large-scale Subsurface Simulations with a Hybrid Graph Network Simulator [57.57321628587564]
We introduce Hybrid Graph Network Simulator (HGNS) for learning reservoir simulations of 3D subsurface fluid flows.
HGNS consists of a subsurface graph neural network (SGNN) to model the evolution of fluid flows, and a 3D-U-Net to model the evolution of pressure.
Using an industry-standard subsurface flow dataset (SPE-10) with 1.1 million cells, we demonstrate that HGNS can reduce inference time by up to 18 times compared to standard subsurface simulators.
arXiv Detail & Related papers (2022-06-15T17:29:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.