Random Forest Regression Feature Importance for Climate Impact Pathway Detection
- URL: http://arxiv.org/abs/2409.16609v1
- Date: Wed, 25 Sep 2024 04:18:53 GMT
- Title: Random Forest Regression Feature Importance for Climate Impact Pathway Detection
- Authors: Meredith G. L. Brown, Matt Peterson, Irina Tezaur, Kara Peterson, Diana Bull,
- Abstract summary: We develop a novel technique for discovering and ranking the chain of spatio-temporal downstream impacts of a climate source.
We apply our method to ensembles of data generated by running two increasingly complex benchmarks.
We find that our RFR feature importance-based approach can accurately detect known pathways of impact for both test cases.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Disturbances to the climate system, both natural and anthropogenic, have far reaching impacts that are not always easy to identify or quantify using traditional climate science analyses or causal modeling techniques. In this paper, we develop a novel technique for discovering and ranking the chain of spatio-temporal downstream impacts of a climate source, referred to herein as a source-impact pathway, using Random Forest Regression (RFR) and SHapley Additive exPlanation (SHAP) feature importances. Rather than utilizing RFR for classification or regression tasks (the most common use case for RFR), we propose a fundamentally new RFR-based workflow in which we: (i) train random forest (RF) regressors on a set of spatio-temporal features of interest, (ii) calculate their pair-wise feature importances using the SHAP weights associated with those features, and (iii) translate these feature importances into a weighted pathway network (i.e., a weighted directed graph), which can be used to trace out and rank interdependencies between climate features and/or modalities. We adopt a tiered verification approach to verify our new pathway identification methodology. In this approach, we apply our method to ensembles of data generated by running two increasingly complex benchmarks: (i) a set of synthetic coupled equations, and (ii) a fully coupled simulation of the 1991 eruption of Mount Pinatubo in the Philippines performed using a modified version 2 of the U.S. Department of Energy's Energy Exascale Earth System Model (E3SMv2). We find that our RFR feature importance-based approach can accurately detect known pathways of impact for both test cases.
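Step (iii) of the workflow described in the abstract, turning pair-wise feature importances into a weighted directed graph and ranking pathways through it, can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the feature names (`x0`, `x1`, `x2`), the importance values, the edge threshold, and the choice to score a pathway by the product of its edge weights are all assumptions made for the example. In practice the `importance` table would come from steps (i) and (ii), for instance mean absolute SHAP values from one random forest regressor per target feature.

```python
# Hypothetical pair-wise importances: importance[target][source] is the
# importance of `source` when a regressor is trained to predict `target`.
# These numbers are illustrative placeholders, not results from the paper.
importance = {
    "x1": {"x0": 0.8, "x2": 0.05},
    "x2": {"x1": 0.6, "x0": 0.2},
}


def build_pathway_graph(importance, threshold=0.1):
    """Build a weighted directed graph as {source: {target: weight}}.

    An edge source -> target is kept when the importance of `source`
    for predicting `target` is at least `threshold` (an assumed cutoff).
    """
    graph = {}
    for target, sources in importance.items():
        for source, weight in sources.items():
            if source != target and weight >= threshold:
                graph.setdefault(source, {})[target] = weight
    return graph


def rank_pathways(graph, start, max_nodes=4):
    """Enumerate simple paths from `start` and rank them by score.

    The score of a path is the product of its edge weights; this scoring
    rule is an assumption chosen for illustration.
    """
    results = []

    def walk(node, path, score):
        for nxt, weight in graph.get(node, {}).items():
            if nxt in path:  # skip cycles so every path is simple
                continue
            new_path = path + [nxt]
            new_score = score * weight
            results.append((new_path, new_score))
            if len(new_path) < max_nodes:
                walk(nxt, new_path, new_score)

    walk(start, [start], 1.0)
    return sorted(results, key=lambda item: item[1], reverse=True)


graph = build_pathway_graph(importance, threshold=0.1)
ranked = rank_pathways(graph, "x0")
for path, score in ranked:
    print(" -> ".join(path), round(score, 3))
```

With the placeholder values above, the weak `x2 -> x1` link falls below the threshold and is dropped, so the strongest pathway starting at `x0` is the direct edge to `x1`, followed by the two-step chain through `x1` to `x2`.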
Related papers
- Spatio-temporal Multivariate Cluster Evolution Analysis for Detecting and Tracking Climate Impacts [0.0]
This paper presents a novel and efficient unsupervised data-driven approach for detecting statistically-significant impacts.
We demonstrate that the proposed approach is capable of detecting known post-eruption impacts/events.
We additionally describe a methodology for extracting meaningful sequences of post-eruption impacts/events by using NLP.
arXiv Detail & Related papers (2024-10-21T22:13:09Z)
- Applications of machine learning to predict seasonal precipitation for East Africa [0.0]
Large-scale climate variability is linked to local or regional temperature or precipitation in a linear or non-linear fashion.
This paper investigates the use of interpretable ML methods to predict seasonal precipitation for East Africa in an operational setting.
arXiv Detail & Related papers (2024-09-10T06:16:03Z)
- SFANet: Spatial-Frequency Attention Network for Weather Forecasting [54.470205739015434]
Weather forecasting plays a critical role in various sectors, driving decision-making and risk management.
Traditional methods often struggle to capture the complex dynamics of meteorological systems.
We propose a novel framework designed to address these challenges and enhance the accuracy of weather prediction.
arXiv Detail & Related papers (2024-05-29T08:00:15Z)
- Time-series Generation by Contrastive Imitation [87.51882102248395]
We study a generative framework that seeks to combine the strengths of both: Motivated by a moment-matching objective to mitigate compounding error, we optimize a local (but forward-looking) transition policy.
At inference, the learned policy serves as the generator for iterative sampling, and the learned energy serves as a trajectory-level measure for evaluating sample quality.
arXiv Detail & Related papers (2023-11-02T16:45:25Z)
- Causal Feature Selection via Transfer Entropy [59.999594949050596]
Causal discovery aims to identify causal relationships between features with observational data.
We introduce a new causal feature selection approach that relies on the forward and backward feature selection procedures.
We provide theoretical guarantees on the regression and classification errors for both the exact and the finite-sample cases.
arXiv Detail & Related papers (2023-10-17T08:04:45Z)
- Characterizing climate pathways using feature importance on echo state networks [0.0]
An echo state network (ESN) is a computationally efficient neural network variation designed for temporal data.
ESNs are non-interpretable black-box models, which poses a hurdle for understanding variable relationships.
We conduct a simulation study to assess and compare the feature importance techniques, and we demonstrate the approach on reanalysis climate data.
arXiv Detail & Related papers (2023-10-12T16:55:04Z)
- Physics Symbolic Learner for Discovering Ground-Motion Models Via NGA-West2 Database [4.059252581613122]
The ground-motion model (GMM) is the basis of many earthquake engineering studies.
In this study, a novel physics-informed symbolic learner (PISL) method is proposed to automatically discover mathematical equation operators as symbols.
arXiv Detail & Related papers (2023-03-23T04:14:05Z)
- A Notion of Feature Importance by Decorrelation and Detection of Trends by Random Forest Regression [1.675857332621569]
We introduce a novel notion of feature importance based on the well-studied Gram-Schmidt decorrelation method.
We propose two estimators for identifying trends in the data using random forest regression.
arXiv Detail & Related papers (2023-03-02T11:01:49Z)
- Spatiotemporal modeling of European paleoclimate using doubly sparse Gaussian processes [61.31361524229248]
We build on recent scalable spatiotemporal GPs to reduce the computational burden.
We successfully employ such a doubly sparse GP to construct a probabilistic model of paleoclimate.
arXiv Detail & Related papers (2022-11-15T14:15:04Z)
- Leveraging Global Parameters for Flow-based Neural Posterior Estimation [90.21090932619695]
Inferring the parameters of a model based on experimental observations is central to the scientific method.
A particularly challenging setting is when the model is strongly indeterminate, i.e., when distinct sets of parameters yield identical observations.
We present a method for cracking such indeterminacy by exploiting additional information conveyed by an auxiliary set of observations sharing global parameters.
arXiv Detail & Related papers (2021-02-12T12:23:13Z)
- Recent Developments Combining Ensemble Smoother and Deep Generative Networks for Facies History Matching [58.720142291102135]
This research project focuses on the use of autoencoder networks to construct a continuous parameterization for facies models.
We benchmark seven different formulations, including VAE, generative adversarial network (GAN), Wasserstein GAN, variational auto-encoding GAN, principal component analysis (PCA) with cycle GAN, PCA with transfer style network and VAE with style loss.
arXiv Detail & Related papers (2020-05-08T21:32:42Z)