Interpreting Deep Forest through Feature Contribution and MDI Feature Importance
- URL: http://arxiv.org/abs/2305.00805v1
- Date: Mon, 1 May 2023 13:10:24 GMT
- Title: Interpreting Deep Forest through Feature Contribution and MDI Feature Importance
- Authors: Yi-Xiao He, Shen-Huan Lyu, Yuan Jiang
- Abstract summary: Deep forest is a non-differentiable deep model which has achieved impressive empirical success across a wide variety of applications.
Many of the application fields prefer explainable models, such as random forests with feature contributions that can provide local explanation for each prediction.
We propose our feature contribution and MDI feature importance calculation tools for deep forest.
- Score: 6.475147482292634
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep forest is a non-differentiable deep model which has achieved impressive
empirical success across a wide variety of applications, especially on
categorical/symbolic or mixed modeling tasks. Many of the application fields
prefer explainable models, such as random forests with feature contributions
that can provide local explanation for each prediction, and Mean Decrease
Impurity (MDI) that can provide global feature importance. However, deep
forest, as a cascade of random forests, possesses interpretability only at the
first layer. From the second layer on, many of the tree splits occur on the new
features generated by the previous layer, which makes existing explanatory
tools for random forests inapplicable. To disclose the impact of the original
features in the deep layers, we design a calculation method with an estimation
step followed by a calibration step for each layer, and propose our feature
contribution and MDI feature importance calculation tools for deep forest.
Experimental results on both simulated data and real world data verify the
effectiveness of our methods.
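The two random-forest tools the abstract refers to can be illustrated on a single tree. The following is a minimal sketch, not the paper's method: it covers only the first-layer building blocks (per-prediction feature contributions and MDI) and does not attempt the estimation and calibration steps proposed for deeper layers. The `Node` structure, the function names, and the toy tree are all hypothetical, chosen only for illustration, and binary classification with Gini impurity is assumed.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    """One node of a binary classification tree (hypothetical toy structure)."""
    n: int                          # training samples that reached this node
    p: float                        # fraction of those samples in the positive class
    feature: Optional[int] = None   # split feature index; None marks a leaf
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def gini(p: float) -> float:
    """Gini impurity of a binary node with positive-class fraction p."""
    return 2.0 * p * (1.0 - p)


def feature_contributions(root: Node, x: List[float],
                          n_features: int) -> Tuple[float, List[float]]:
    """Local explanation: prediction = bias + sum of per-feature contributions.

    Walking from root to leaf, each split credits the change in node value
    (child.p - node.p) to the feature that was split on.
    """
    bias = root.p
    contrib = [0.0] * n_features
    node = root
    while node.feature is not None:
        child = node.left if x[node.feature] <= node.threshold else node.right
        contrib[node.feature] += child.p - node.p
        node = child
    return bias, contrib


def mdi(root: Node, n_features: int) -> List[float]:
    """Global importance: sample-weighted impurity decrease summed per split feature."""
    importance = [0.0] * n_features
    stack = [root]
    while stack:
        node = stack.pop()
        if node.feature is None:
            continue
        l, r = node.left, node.right
        decrease = (node.n * gini(node.p)
                    - l.n * gini(l.p) - r.n * gini(r.p)) / root.n
        importance[node.feature] += decrease
        stack.extend([l, r])
    return importance


def build_toy_tree() -> Node:
    """Hand-built tree over 10 samples and 2 features (illustrative values only)."""
    inner = Node(n=6, p=5 / 6, feature=1, threshold=0.5,
                 left=Node(n=2, p=0.5), right=Node(n=4, p=1.0))
    return Node(n=10, p=0.5, feature=0, threshold=0.5,
                left=Node(n=4, p=0.0), right=inner)


tree = build_toy_tree()
bias, contrib = feature_contributions(tree, [1.0, 1.0], n_features=2)
importance = mdi(tree, n_features=2)
```

For a forest, both quantities would be averaged over its trees. The difficulty the abstract describes starts at the second layer of the cascade, where a split may fall on a feature generated by the previous layer, so the credit assigned above no longer lands on an original feature.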
Related papers
- Advancing the Understanding of Fine-Grained 3D Forest Structures using Digital Cousins and Simulation-to-Reality: Methods and Datasets [15.813305272984978]
Boreal3D is the world's largest forest point cloud dataset.
It includes 1000 highly realistic and structurally diverse forest plots.
Models pre-trained on synthetic data can significantly improve performance when applied to real forest datasets.
arXiv Detail & Related papers (2025-01-07T09:12:55Z)
- Towards general deep-learning-based tree instance segmentation models [0.0]
Deep-learning methods have been proposed which show the potential of learning to segment trees.
We use seven diverse datasets found in literature to gain insights into the generalization capabilities under domain-shift.
Our results suggest that a generalization from coniferous dominated sparse point clouds to deciduous dominated high-resolution point clouds is possible.
arXiv Detail & Related papers (2024-05-03T12:42:43Z)
- Improve Deep Forest with Learnable Layerwise Augmentation Policy Schedule [22.968268349995853]
This paper presents an optimized Deep Forest, featuring learnable, layerwise data augmentation policy schedules.
We introduce the Cut Mix for Tabular data (CMT) augmentation technique to mitigate overfitting and develop a population-based search algorithm to tailor augmentation intensity for each layer.
Experimental results show that our method sets new state-of-the-art (SOTA) benchmarks in various classification tasks, outperforming shallow tree ensembles, deep forests, deep neural networks, and AutoML competitors.
arXiv Detail & Related papers (2023-09-16T15:54:25Z)
- ForensicsForest Family: A Series of Multi-scale Hierarchical Cascade Forests for Detecting GAN-generated Faces [53.739014757621376]
We describe a simple and effective set of forest-based methods, called ForensicsForest Family, to detect GAN-generated faces.
ForensicsForest is a newly proposed Multi-scale Hierarchical Cascade Forest.
Hybrid ForensicsForest integrates the CNN layers into models.
Divide-and-Conquer ForensicsForest can construct a forest model using only a portion of the training samples.
arXiv Detail & Related papers (2023-08-02T06:41:19Z)
- Understanding Masked Autoencoders via Hierarchical Latent Variable Models [109.35382136147349]
Masked autoencoder (MAE) has recently achieved prominent success in a variety of vision tasks.
Despite the emergence of intriguing empirical observations on MAE, a theoretically principled understanding is still lacking.
arXiv Detail & Related papers (2023-06-08T03:00:10Z)
- Neuroevolution-based Classifiers for Deforestation Detection in Tropical Forests [62.997667081978825]
Millions of hectares of tropical forests are lost every year due to deforestation or degradation.
Monitoring and deforestation detection programs are in use, in addition to public policies for the prevention and punishment of criminals.
This paper proposes the use of pattern classifiers based on neuroevolution technique (NEAT) in tropical forest deforestation detection tasks.
arXiv Detail & Related papers (2022-08-23T16:04:12Z)
- Multi-Layer Modeling of Dense Vegetation from Aerial LiDAR Scans [4.129847064263057]
We release WildForest3D, which consists of 29 study plots and over 2000 individual trees across 47 000 m² with dense 3D annotation.
We propose a 3D deep network architecture predicting for the first time both 3D point-wise labels and high-resolution occupancy meshes simultaneously.
arXiv Detail & Related papers (2022-04-25T12:47:05Z)
- Manifold Topology Divergence: a Framework for Comparing Data Manifolds [109.0784952256104]
We develop a framework for comparing data manifolds, aimed at the evaluation of deep generative models.
Based on the Cross-Barcode, we introduce the Manifold Topology Divergence score (MTop-Divergence).
We demonstrate that the MTop-Divergence accurately detects various degrees of mode-dropping, intra-mode collapse, mode invention, and image disturbance.
arXiv Detail & Related papers (2021-06-08T00:30:43Z)
- Making CNNs Interpretable by Building Dynamic Sequential Decision Forests with Top-down Hierarchy Learning [62.82046926149371]
We propose a generic model transfer scheme to make Convolutional Neural Networks (CNNs) interpretable.
We achieve this by building a differentiable decision forest on top of CNNs.
We name the transferred model deep Dynamic Sequential Decision Forest (dDSDF).
arXiv Detail & Related papers (2021-06-05T07:41:18Z)
- Growing Deep Forests Efficiently with Soft Routing and Learned Connectivity [79.83903179393164]
This paper further extends the deep forest idea in several important aspects.
We employ a probabilistic tree whose nodes make probabilistic routing decisions (soft routing) rather than hard binary decisions.
Experiments on the MNIST dataset demonstrate that our empowered deep forests can achieve performance better than or comparable to [1],[3].
arXiv Detail & Related papers (2020-12-29T18:05:05Z)
- Deep tree-ensembles for multi-output prediction [0.0]
We propose a novel deep tree-ensemble (DTE) model, where every layer enriches the original feature set with a representation learning component based on tree-embeddings.
We specifically focus on two structured output prediction tasks, namely multi-label classification and multi-target regression.
arXiv Detail & Related papers (2020-11-03T16:25:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.