ExioML: Eco-economic dataset for Machine Learning in Global Sectoral Sustainability
- URL: http://arxiv.org/abs/2406.09046v2
- Date: Sat, 6 Jul 2024 09:25:10 GMT
- Title: ExioML: Eco-economic dataset for Machine Learning in Global Sectoral Sustainability
- Authors: Yanming Guo, Charles Guan, Jin Ma
- Abstract summary: This paper introduces ExioML, the first Machine Learning benchmark dataset designed for sustainability analysis.
A crucial greenhouse gas emission regression task was conducted to evaluate sectoral sustainability and demonstrate the usability of the dataset.
- Score: 11.925553950065895
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Environmental Extended Multi-Regional Input-Output analysis is the predominant framework in Ecological Economics for assessing the environmental impact of economic activities. This paper introduces ExioML, the first Machine Learning benchmark dataset designed for sustainability analysis, aimed at lowering barriers and fostering collaboration between Machine Learning and Ecological Economics research. A crucial greenhouse gas emission regression task was conducted to evaluate sectoral sustainability and demonstrate the usability of the dataset. We compared the performance of traditional shallow models with deep learning models, utilizing a diverse Factor Accounting table and incorporating various categorical and numerical features. Our findings reveal that ExioML, with its high usability, enables deep and ensemble models to achieve low mean square errors, establishing a baseline for future Machine Learning research. Through ExioML, we aim to build a foundational dataset supporting various Machine Learning applications and promote climate actions and sustainable investment decisions.
Related papers
- Meta-Statistical Learning: Supervised Learning of Statistical Inference [59.463430294611626]
This work demonstrates that the tools and principles driving the success of large language models (LLMs) can be repurposed to tackle distribution-level tasks.
We propose meta-statistical learning, a framework inspired by multi-instance learning that reformulates statistical inference tasks as supervised learning problems.
arXiv Detail & Related papers (2025-02-17T18:04:39Z)
- Unveiling Environmental Impacts of Large Language Model Serving: A Functional Unit View [2.5832043241251337]
Large language models (LLMs) offer powerful capabilities but come with significant environmental costs, particularly in carbon emissions.
We introduce the concept of a functional unit (FU) and develop FUEL, the first FU-based framework for evaluating LLM's environmental impact.
Our findings highlight the potential for reducing carbon emissions by optimizing model selection, deployment strategies, and hardware choices.
arXiv Detail & Related papers (2025-02-16T20:20:18Z)
- REO-VLM: Transforming VLM to Meet Regression Challenges in Earth Observation [58.91579272882073]
This paper introduces a novel benchmark dataset, called REO-Instruct, to unify regression and generation tasks specifically for the Earth Observation domain.
We develop REO-VLM, a groundbreaking model that seamlessly integrates regression capabilities with traditional generative functions.
arXiv Detail & Related papers (2024-12-21T11:17:15Z)
- Enhancing Ecological Monitoring with Multi-Objective Optimization: A Novel Dataset and Methodology for Segmentation Algorithms [17.802456388479616]
We introduce a unique semantic segmentation dataset of 6,096 high-resolution aerial images capturing indigenous and invasive grass species in Bega Valley, New South Wales, Australia.
This dataset presents a challenging task due to the overlap and distribution of grass species.
The dataset and code will be made publicly available, aiming to drive research in computer vision, machine learning, and ecological studies.
arXiv Detail & Related papers (2024-07-25T18:27:27Z)
- Tree-based variational inference for Poisson log-normal models [47.82745603191512]
Hierarchical trees are often used to organize entities based on proximity criteria.
Current count-data models do not leverage this structured information.
We introduce the PLN-Tree model as an extension of the PLN model for modeling hierarchical count data.
arXiv Detail & Related papers (2024-06-25T08:24:35Z)
- Computing Within Limits: An Empirical Study of Energy Consumption in ML Training and Inference [2.553456266022126]
Machine learning (ML) has seen tremendous advancements, but its environmental footprint remains a concern.
Acknowledging the growing environmental impact of ML, this paper investigates Green ML.
arXiv Detail & Related papers (2024-06-20T13:59:34Z)
- GenBench: A Benchmarking Suite for Systematic Evaluation of Genomic Foundation Models [56.63218531256961]
We introduce GenBench, a benchmarking suite specifically tailored for evaluating the efficacy of Genomic Foundation Models.
GenBench offers a modular and expandable framework that encapsulates a variety of state-of-the-art methodologies.
We provide a nuanced analysis of the interplay between model architecture and dataset characteristics on task-specific performance.
arXiv Detail & Related papers (2024-06-01T08:01:05Z)
- A Comparative Study of Machine Learning Algorithms for Anomaly Detection in Industrial Environments: Performance and Environmental Impact [62.997667081978825]
This study seeks to reconcile the demands of high-performance machine learning models with environmental sustainability.
Traditional machine learning algorithms, such as Decision Trees and Random Forests, demonstrate robust efficiency and performance.
However, superior outcomes were obtained with optimised configurations, albeit with a commensurate increase in resource consumption.
arXiv Detail & Related papers (2023-07-01T15:18:00Z)
- Analysis of Biomass Sustainability Indicators from a Machine Learning Perspective [4.129067364486898]
This study proposes a robust model for biomass sustainability prediction by analyzing sustainability indicators using machine learning models.
Ten machine learning models were analyzed to estimate three biomass sustainability indicators, namely soil erosion factor, soil conditioning index, and organic matter factor.
The results showed that Random Forest was the best performing model to assess sustainability indicators.
arXiv Detail & Related papers (2023-02-02T02:31:42Z)
- Distributed intelligence on the Edge-to-Cloud Continuum: A systematic literature review [62.997667081978825]
This review aims at providing a comprehensive vision of the main state-of-the-art libraries and frameworks for machine learning and data analytics available today.
The main simulation, emulation, deployment systems, and testbeds for experimental research on the Edge-to-Cloud Continuum available today are also surveyed.
arXiv Detail & Related papers (2022-04-29T08:06:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.