A path in regression Random Forest looking for spatial dependence: a
taxonomy and a systematic review
- URL: http://arxiv.org/abs/2303.04693v2
- Date: Tue, 17 Oct 2023 13:12:03 GMT
- Title: A path in regression Random Forest looking for spatial dependence: a
taxonomy and a systematic review
- Authors: Luca Patelli, Michela Cameletti, Natalia Golini, Rosaria Ignaccolo
- Abstract summary: In environmental applications, the phenomenon of interest may present spatial and/or temporal dependence that is not taken explicitly into account by Random Forest (RF).
We propose a taxonomy to classify strategies according to when (Pre-, In- and/or Post-processing) they try to include the spatial information into regression RF.
We provide a systematic review and classify the most recent strategies adopted to "adjust" regression RF to spatially dependent data.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Random Forest (RF) is a well-known data-driven algorithm applied in several
fields thanks to its flexibility in modeling the relationship between the
response variable and the predictors, also in case of strong non-linearities.
In environmental applications, it often occurs that the phenomenon of interest
may present spatial and/or temporal dependence that is not taken explicitly
into account by RF in its standard version. In this work, we propose a taxonomy
to classify strategies according to when (Pre-, In- and/or Post-processing)
they try to include the spatial information into regression RF. Moreover, we
provide a systematic review and classify the most recent strategies adopted to
"adjust" regression RF to spatially dependent data, based on the criteria
provided by the Preferred Reporting Items for Systematic reviews and
Meta-Analysis (PRISMA). The latter consists of a reproducible methodology for
collecting and processing existing literature on a specified topic from
different sources. PRISMA starts with a query and ends with a set of scientific
documents to review: we performed an online query on the 25$^{th}$ October 2022
and, in the end, 32 documents were considered for review. The employed
methodological strategies and the application fields considered in the 32
scientific documents are described and discussed. This work falls inside the
Agriculture Impact On Italian Air (AgrImOnIA) project.
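The "when" axis of the taxonomy (Pre-, In- and/or Post-processing) can be pictured as a set of flags that may be combined. The snippet below is only a minimal sketch of that idea: the class name `Stage`, the helper `stages_of`, and the example strategy descriptions are hypothetical illustrations and are not taken from the 32 reviewed documents.

```python
from enum import Flag, auto


class Stage(Flag):
    """When a strategy injects spatial information into regression RF."""
    PRE = auto()   # before fitting, e.g. by engineering spatial predictors
    IN = auto()    # inside the RF algorithm itself
    POST = auto()  # after fitting, e.g. by correcting RF predictions


# Hypothetical example strategies (assumptions, for illustration only).
STRATEGIES = {
    "coordinates added as predictors": Stage.PRE,
    "spatially aware split criterion": Stage.IN,
    "interpolation of RF residuals": Stage.POST,
    "spatial features plus residual interpolation": Stage.PRE | Stage.POST,
}


def stages_of(name):
    """Return the names of the taxonomy stages a strategy belongs to."""
    flag = STRATEGIES[name]
    return [stage.name for stage in Stage if stage in flag]


if __name__ == "__main__":
    for name in STRATEGIES:
        print(f"{name}: {stages_of(name)}")
```

A flag type is used rather than a plain enumeration because the abstract's "and/or" wording allows a single strategy to act at more than one stage.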
Related papers
- A Survey on Ordinal Regression: Applications, Advances and Prospects [22.108785258216837]
Ordinal regression is crucial for applications in various areas like facial age estimation, image aesthetics assessment, and even cancer staging.
In this survey, we present a comprehensive examination of advances and applications of ordinal regression.
arXiv Detail & Related papers (2025-03-02T16:10:36Z)
- ECLIPSE: Contrastive Dimension Importance Estimation with Pseudo-Irrelevance Feedback for Dense Retrieval [14.72046677914345]
Recent advances in Information Retrieval have leveraged high-dimensional embedding spaces to improve the retrieval of relevant documents.
Despite these high-dimensional representations, documents relevant to a query reside on a lower-dimensional, query-dependent manifold.
We propose a novel methodology that addresses these limitations by leveraging information from both relevant and non-relevant documents.
arXiv Detail & Related papers (2024-12-19T15:45:06Z)
- Random Forest Regression Feature Importance for Climate Impact Pathway Detection [0.0]
We develop a novel technique for discovering and ranking the chain of spatio-temporal downstream impacts of a climate source.
We apply our method to ensembles of data generated by running two increasingly complex benchmarks.
We find that our RFR feature importance-based approach can accurately detect known pathways of impact for both test cases.
arXiv Detail & Related papers (2024-09-25T04:18:53Z)
- RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework [66.93260816493553]
This paper introduces RAGEval, a framework designed to assess RAG systems across diverse scenarios.
With a focus on factual accuracy, we propose three novel metrics: Completeness, Hallucination, and Irrelevance.
Experimental results show that RAGEval outperforms zero-shot and one-shot methods in terms of clarity, safety, conformity, and richness of generated samples.
arXiv Detail & Related papers (2024-08-02T13:35:11Z)
- SKADA-Bench: Benchmarking Unsupervised Domain Adaptation Methods with Realistic Validation [55.87169702896249]
Unsupervised Domain Adaptation (DA) consists of adapting a model trained on a labeled source domain to perform well on an unlabeled target domain with some data distribution shift.
We propose a framework to evaluate DA methods and present a fair evaluation of existing shallow algorithms, including reweighting, mapping, and subspace alignment.
Our benchmark highlights the importance of realistic validation and provides practical guidance for real-life applications.
arXiv Detail & Related papers (2024-07-16T12:52:29Z)
- Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint [56.74058752955209]
This paper studies the alignment process of generative models with Reinforcement Learning from Human Feedback (RLHF).
We first identify the primary challenges of existing popular methods like offline PPO and offline DPO as lacking in strategical exploration of the environment.
We propose efficient algorithms with finite-sample theoretical guarantees.
arXiv Detail & Related papers (2023-12-18T18:58:42Z)
- Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs).
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive taxonomies, two for bias evaluation and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z)
- Adaptive Principal Component Regression with Applications to Panel Data [29.295938927701396]
We provide the first time-uniform finite sample guarantees for (regularized) Principal component regression.
Our results rely on adapting tools from modern martingale concentration to the error-in-variables setting.
We show that our method empirically outperforms a baseline which does not leverage error-in-variables regression.
arXiv Detail & Related papers (2023-07-03T21:13:40Z)
- Requirement Formalisation using Natural Language Processing and Machine
Learning: A Systematic Review [11.292853646607888]
We conducted a systematic literature review to outline the current state-of-the-art of NLP and ML techniques in Requirement Engineering.
We found that NLP approaches are the most common NLP techniques used for automatic RF, primarily operating on structured and semi-structured data.
This study also revealed that Deep Learning (DL) techniques are not widely used; instead, classical ML techniques are predominant in the surveyed studies.
arXiv Detail & Related papers (2023-03-18T17:36:21Z)
- A Comprehensive Survey on Source-free Domain Adaptation [69.17622123344327]
The research of Source-Free Domain Adaptation (SFDA) has drawn growing attention in recent years.
We provide a comprehensive survey of recent advances in SFDA and organize them into a unified categorization scheme.
We compare the results of more than 30 representative SFDA methods on three popular classification benchmarks.
arXiv Detail & Related papers (2023-02-23T06:32:09Z)
- A Pipeline for Analysing Grant Applications [0.0]
This paper investigates whether grant schemes successfully identify innovative project proposals, as intended.
Grant applications are peer-reviewed research proposals that include specific "innovation and creativity" (IC) scores assigned by reviewers.
We propose a model with the best performance, a Random Forest (RF) classifier over documents encoded with features.
arXiv Detail & Related papers (2022-10-30T13:43:53Z)
- Deconstructing Self-Supervised Monocular Reconstruction: The Design
Decisions that Matter [63.5550818034739]
This paper presents a framework to evaluate state-of-the-art contributions to self-supervised monocular depth estimation.
It includes pretraining, backbone, architectural design choices and loss functions.
We re-implement, validate and re-evaluate 16 state-of-the-art contributions and introduce a new dataset.
arXiv Detail & Related papers (2022-08-02T14:38:53Z)
- HiPaR: Hierarchical Pattern-aided Regression [71.22664057305572]
HiPaR mines hybrid rules of the form $p \Rightarrow y = f(X)$ where $p$ is the characterization of a data region and $f(X)$ is a linear regression model on a variable of interest $y$.
HiPaR relies on pattern mining techniques to identify regions of the data where the target variable can be accurately explained via local linear models.
As our experiments show, HiPaR mines fewer rules than existing pattern-based regression methods while still attaining state-of-the-art prediction performance.
arXiv Detail & Related papers (2021-02-24T15:53:17Z)
- Prediction with Spatio-temporal Point Processes with Self Organizing
Decision Trees [0.0]
We introduce a novel approach to this problem.
Our approach is based on the Hawkes process, which is a non-stationary and self-exciting process.
We provide experimental results on real-life data.
arXiv Detail & Related papers (2020-03-07T20:39:31Z)