A Water Efficiency Dataset for African Data Centers
- URL: http://arxiv.org/abs/2412.03716v2
- Date: Fri, 06 Dec 2024 04:40:40 GMT
- Title: A Water Efficiency Dataset for African Data Centers
- Authors: Noah Shumba, Opelo Tshekiso, Pengfei Li, Giulia Fanti, Shaolei Ren
- Abstract summary: This paper presents the first-of-its-kind dataset to estimate water usage efficiency for data centers in 41 African countries across five different climate regions. We also use our dataset to evaluate and estimate the water consumption of inference on two large language models.
- Score: 26.283078610945356
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: AI computing and data centers consume a large amount of freshwater, both directly for cooling and indirectly for electricity generation. While most attention has been paid to developed countries such as the U.S., this paper presents the first-of-its-kind dataset that combines nation-level weather and electricity generation data to estimate water usage efficiency for data centers in 41 African countries across five different climate regions. We also use our dataset to evaluate and estimate the water consumption of inference on two large language models (i.e., Llama-3-70B and GPT-4) in 11 selected African countries. Our findings show that writing a 10-page report using Llama-3-70B could consume about 0.7 liters of water, while the water consumption by GPT-4 for the same task may go up to about 60 liters. For writing a medium-length email of 120-200 words, Llama-3-70B and GPT-4 could consume about 0.13 liters and 3 liters of water, respectively. Interestingly, given the same AI model, 8 out of the 11 selected African countries consume less water than the global average, mainly because of lower water intensities for electricity generation. However, water consumption can be substantially higher in some African countries with a steppe climate than the U.S. and global averages, prompting more attention when deploying AI computing in these countries. Our dataset is publicly available on Hugging Face (https://huggingface.co/datasets/masterlion/WaterEfficientDatasetForAfricanCountries/tree/main).
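The abstract separates water used directly for cooling from water used indirectly for electricity generation. The sketch below shows one common way to combine the two into a single estimate. The formula (server energy times on-site water efficiency, plus facility energy times grid water intensity) and the use of a PUE factor are assumptions made for illustration, not details stated in this abstract. The function name, parameter names, and input values are hypothetical placeholders and are not taken from the dataset.

```python
def water_footprint_liters(energy_kwh, onsite_wue, pue, offsite_ewif):
    """Estimate total water consumption in liters (illustrative sketch).

    energy_kwh   -- server energy for the task, in kWh (hypothetical input)
    onsite_wue   -- direct cooling water per kWh of server energy, in L/kWh
    pue          -- power usage effectiveness (facility energy / server energy)
    offsite_ewif -- water consumed per kWh of electricity generated, in L/kWh
    """
    direct = energy_kwh * onsite_wue            # water evaporated for cooling
    indirect = energy_kwh * pue * offsite_ewif  # water embedded in electricity
    return direct + indirect


# Placeholder numbers chosen only to exercise the function.
total = water_footprint_liters(energy_kwh=0.01, onsite_wue=0.5, pue=1.2, offsite_ewif=3.0)
print(f"Estimated water use: {total:.3f} L")
```

Under this assumed model, a country with a low water intensity for electricity generation can end up with a lower total even when its cooling water use is comparable, which is in line with the abstract's explanation for the 8 countries below the global average.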
Related papers
- The Environmental Impact of AI Servers and Sustainable Solutions [0.0]
This study evaluates the environmental footprint of AI server operations. Projections indicate that global data center electricity demand may increase from approximately 415 TWh in 2024 to nearly 945 TWh by 2030. In the United States alone, AI servers are expected to drive annual increases in water consumption of 200-300 billion gallons and add 24-44 million metric tons of CO2 equivalent emissions by 2030.
arXiv Detail & Related papers (2025-12-24T01:09:06Z) - How Hungry is AI? Benchmarking Energy, Water, and Carbon Footprint of LLM Inference [0.0]
This paper introduces a novel infrastructure-aware benchmarking framework for quantifying the environmental footprint of AI inference across 30 state-of-the-art models as deployed in commercial data centers. Our results show that o3 and DeepSeek-R1 emerge as the most energy-intensive models, consuming over 33 Wh per long prompt, more than 70 times the consumption of GPT-4.1 nano, and that Claude-3.7 Sonnet ranks highest in eco-efficiency. These findings illustrate a growing paradox: Although AI is becoming cheaper and faster, its global adoption drives disproportionate resource consumption.
arXiv Detail & Related papers (2025-05-14T17:47:00Z) - BRIGHT: A globally distributed multimodal building damage assessment dataset with very-high-resolution for all-weather disaster response [50.76124284445902]
Building damage assessment (BDA) is an essential capability in the aftermath of a disaster to reduce human casualties. Recent research focuses on the development of AI models to achieve accurate mapping of unseen disaster events. We present a BDA dataset using veRy-hIGH-resoluTion optical and SAR imagery (BRIGHT) to support AI-based all-weather disaster response.
arXiv Detail & Related papers (2025-01-10T14:57:18Z) - Mapping waterways worldwide with deep learning [0.0]
We present a computer vision model that can draw waterways based on 10m Sentinel-2 satellite imagery and the 30m GLO-30 Copernicus digital elevation model. In total, we add 124 million kilometers of waterways to the 54 million kilometers already in the TDX-Hydro dataset.
arXiv Detail & Related papers (2024-11-24T04:59:07Z) - Environmental Burden of United States Data Centers in the Artificial Intelligence Era [0.5025737475817937]
Data centers generated more than 105 million tons of CO$_2$e (2.18% of US emissions in 2023).
Data centers' carbon intensity - the amount of CO$_2$e emitted per unit of electricity consumed - exceeded the US average by 48%.
Our data pipeline and visualization tools can be used to assess current and future environmental impacts of data centers.
arXiv Detail & Related papers (2024-11-14T19:55:49Z) - A Dataset for Research on Water Sustainability [18.979261592551676]
We build a dataset for operational direct water usage in the cooling systems and indirect water embedded in electricity generation.
Our dataset consists of the hourly water efficiency of major U.S. cities and states from 2019 to 2023.
We present a preliminary analysis of our dataset and discuss three potential applications that can benefit from it.
arXiv Detail & Related papers (2024-05-24T02:59:52Z) - Making AI Less "Thirsty": Uncovering and Addressing the Secret Water Footprint of AI Models [34.93600962447119]
Training GPT-3 in Microsoft's state-of-the-art U.S. data centers can directly evaporate 700,000 liters of clean freshwater.
The global AI demand may be accountable for 4.2-6.6 billion cubic meters of water withdrawal in 2027.
To respond to the global water challenges, AI models can, and also must, take social responsibility and lead by example.
arXiv Detail & Related papers (2023-04-06T17:55:27Z) - An evaluation of deep learning models for predicting water depth evolution in urban floods [59.31940764426359]
We compare different deep learning models for prediction of water depth at high spatial resolution.
Deep learning models are trained to reproduce the data simulated by the CADDIES cellular-automata flood model.
Our results show that the deep learning models generally yield lower errors than the other methods.
arXiv Detail & Related papers (2023-02-20T16:08:54Z) - AfriSenti: A Twitter Sentiment Analysis Benchmark for African Languages [45.88640066767242]
Africa is home to over 2,000 languages from more than six language families and has the highest linguistic diversity among all continents.
Yet, there is little NLP research conducted on African languages. Crucial to enabling such research is the availability of high-quality annotated datasets.
In this paper, we introduce AfriSenti, a sentiment analysis benchmark that contains a total of >110,000 tweets in 14 African languages.
arXiv Detail & Related papers (2023-02-17T15:40:12Z) - MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition [55.95128479289923]
African languages are spoken by over a billion people, but are underrepresented in NLP research and development.
We create the largest human-annotated NER dataset for 20 African languages.
We show that choosing the best transfer language improves zero-shot F1 scores by an average of 14 points.
arXiv Detail & Related papers (2022-10-22T08:53:14Z) - Comprehensive Benchmark Datasets for Amharic Scene Text Detection and Recognition [56.048783994698425]
Ethiopic/Amharic script is one of the oldest African writing systems, which serves at least 23 languages in East Africa.
The Amharic writing system, Abugida, has 282 syllables, 15 punctuation marks, and 20 numerals.
We presented the first comprehensive public datasets named HUST-ART, HUST-AST, ABE, and Tana for Amharic script detection and recognition in the natural scene.
arXiv Detail & Related papers (2022-03-23T03:19:35Z) - Climate Change & Computer Audition: A Call to Action and Overview on Audio Intelligence to Help Save the Planet [98.97255654573662]
This work provides an overview of areas in which audio intelligence can contribute to overcome climate-related challenges.
We categorise potential computer audition applications according to the five elements of earth, water, air, fire, and aether.
arXiv Detail & Related papers (2022-03-10T13:32:31Z) - Jalisco's multiclass land cover analysis and classification using a novel lightweight convnet with real-world multispectral and relief data [51.715517570634994]
We present our novel lightweight (only 89k parameters) Convolutional Neural Network (ConvNet) for land cover (LC) classification and analysis.
In this work, we combine three real-world open data sources to obtain 13 channels.
Our embedded analysis anticipates the limited performance in some classes and gives us the opportunity to group the most similar.
arXiv Detail & Related papers (2022-01-26T14:58:51Z) - Towards Sustainable Energy-Efficient Data Centers in Africa [0.0]
By 2040, 14 percent of global emissions will come from data centers.
This paper presents early findings in the use of AI and digital twins to model and optimize data center operations.
arXiv Detail & Related papers (2021-09-09T07:18:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.