Related papers: coverforest: Conformal Predictions with Random Forest in Python

coverforest: Conformal Predictions with Random Forest in Python

URL: http://arxiv.org/abs/2501.14570v1
Date: Fri, 24 Jan 2025 15:24:37 GMT
Title: coverforest: Conformal Predictions with Random Forest in Python
Authors: Panisara Meehinkong, Donlapark Ponnoprat,
Abstract summary: coverforest is a Python package that implements efficient conformal prediction methods specifically optimized for random forests.<n>Our experiments demonstrate that coverforest's predictions achieve the desired level of coverage.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Conformal prediction provides a framework for uncertainty quantification, specifically in the forms of prediction intervals and sets with distribution-free guaranteed coverage. While recent cross-conformal techniques such as CV+ and Jackknife+-after-bootstrap achieve better data efficiency than traditional split conformal methods, they incur substantial computational costs due to required pairwise comparisons between training and test samples' out-of-bag scores. Observing that these methods naturally extend from ensemble models, particularly random forests, we leverage existing optimized random forest implementations to enable efficient cross-conformal predictions. We present coverforest, a Python package that implements efficient conformal prediction methods specifically optimized for random forests. coverforest supports both regression and classification tasks through various conformal prediction methods, including split conformal, CV+, Jackknife+-after-bootstrap, and adaptive prediction sets. Our package leverages parallel computing and Cython optimizations to speed up out-of-bag calculations. Our experiments demonstrate that coverforest's predictions achieve the desired level of coverage. In addition, its training and prediction times can be faster than an existing implementation by 2--9 times. The source code for the coverforest is hosted on GitHub at https://github.com/donlapark/coverforest.

Related papers

Clustered random forests with correlated data for optimal estimation and inference under potential covariate shift [4.13592995550836]
We develop Clustered Random Forests, a random forests algorithm for clustered data, arising from independent groups that exhibit within-cluster dependence. The leaf-wise predictions for each decision tree making up clustered random forests takes the form of a weighted least squares estimator. Clustered random forests are shown for certain tree splitting criteria to be minimax rate optimal for pointwise conditional mean estimation.
arXiv Detail & Related papers (2025-03-16T20:07:23Z)
Conformal Prediction Sets with Improved Conditional Coverage using Trust Scores [52.92618442300405]
It is impossible to achieve exact, distribution-free conditional coverage in finite samples.<n>We propose an alternative conformal prediction algorithm that targets coverage where it matters most.
arXiv Detail & Related papers (2025-01-17T12:01:56Z)
Multi-model Ensemble Conformal Prediction in Dynamic Environments [14.188004615463742]
We introduce a novel adaptive conformal prediction framework, where the model used for creating prediction sets is selected on the fly from multiple candidate models. The proposed algorithm is proven to achieve strongly adaptive regret over all intervals while maintaining valid coverage.
arXiv Detail & Related papers (2024-11-06T05:57:28Z)
Semiparametric conformal prediction [79.6147286161434]
Risk-sensitive applications require well-calibrated prediction sets over multiple, potentially correlated target variables. We treat the scores as random vectors and aim to construct the prediction set accounting for their joint correlation structure. We report desired coverage and competitive efficiency on a range of real-world regression problems.
arXiv Detail & Related papers (2024-11-04T14:29:02Z)
Regression Trees for Fast and Adaptive Prediction Intervals [2.6763498831034043]
We present a family of methods to calibrate prediction intervals for regression problems with local coverage guarantees. We create a partition by training regression trees and Random Forests on conformity scores. Our proposal is versatile, as it applies to various conformity scores and prediction settings.
arXiv Detail & Related papers (2024-02-12T01:17:09Z)
FABind: Fast and Accurate Protein-Ligand Binding [127.7790493202716]
$mathbfFABind$ is an end-to-end model that combines pocket prediction and docking to achieve accurate and fast protein-ligand binding. Our proposed model demonstrates strong advantages in terms of effectiveness and efficiency compared to existing methods.
arXiv Detail & Related papers (2023-10-10T16:39:47Z)
UTOPIA: Universally Trainable Optimal Prediction Intervals Aggregation [9.387706860375461]
We introduce a novel strategy called Universally Trainable Optimal Predictive Intervals Aggregation (UTOPIA) This technique excels in efficiently aggregating multiple prediction intervals while maintaining a small average width of the prediction band and ensuring coverage. It is validated through its application to synthetic data and two real-world datasets in finance and macroeconomics.
arXiv Detail & Related papers (2023-06-28T20:38:37Z)
Efficient and Differentiable Conformal Prediction with General Function Classes [96.74055810115456]
We propose a generalization of conformal prediction to multiple learnable parameters. We show that it achieves approximate valid population coverage and near-optimal efficiency within class. Experiments show that our algorithm is able to learn valid prediction sets and improve the efficiency significantly.
arXiv Detail & Related papers (2022-02-22T18:37:23Z)
RFpredInterval: An R Package for Prediction Intervals with Random Forests and Boosted Forests [0.0]
We have developed a comprehensive R package, RFpredInterval, that integrates 16 methods to build prediction intervals with random forests and boosted forests. The methods implemented in the package are a new method to build prediction intervals with boosted forests (PIBF) and 15 different variants to produce prediction intervals with random forests proposed by Roy and Larocque ( 2020) The results show that the proposed method is very competitive and, globally, it outperforms the competing methods.
arXiv Detail & Related papers (2021-06-15T15:27:50Z)
Predict then Interpolate: A Simple Algorithm to Learn Stable Classifiers [59.06169363181417]
Predict then Interpolate (PI) is an algorithm for learning correlations that are stable across environments. We prove that by interpolating the distributions of the correct predictions and the wrong predictions, we can uncover an oracle distribution where the unstable correlation vanishes.
arXiv Detail & Related papers (2021-05-26T15:37:48Z)
Stochastic Optimization Forests [60.523606291705214]
We show how to train forest decision policies by growing trees that choose splits to directly optimize the downstream decision quality, rather than splitting to improve prediction accuracy as in the standard random forest algorithm. We show that our approximate splitting criteria can reduce running time hundredfold, while achieving performance close to forest algorithms that exactly re-optimize for every candidate split.
arXiv Detail & Related papers (2020-08-17T16:56:06Z)

This list is automatically generated from the titles and abstracts of the papers in this site.