Monotone Tree-Based GAMI Models by Adapting XGBoost
- URL: http://arxiv.org/abs/2309.02426v1
- Date: Tue, 5 Sep 2023 17:54:37 GMT
- Title: Monotone Tree-Based GAMI Models by Adapting XGBoost
- Authors: Linwei Hu, Soroush Aramideh, Jie Chen, Vijayan N. Nair
- Abstract summary: This paper considers models of the form $f(x)=\sum_{j,k}f_{j,k}(x_j, x_k)$ and develops monotone tree-based GAMI models, called monotone GAMI-Tree.
It is straightforward to fit a monotone model to $f(x)$ using the options in XGBoost. However, the fitted model is still a black box.
We take a different approach: i) use a filtering technique to determine the important interactions, ii) fit a monotone XGBoost algorithm with the selected interactions, and finally iii) parse and purify the results to get a monotone GAMI model.
- Score: 4.566028525473582
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent papers have used machine learning architecture to fit low-order
functional ANOVA models with main effects and second-order interactions. These
GAMI (GAM + Interaction) models are directly interpretable as the functional
main effects and interactions can be easily plotted and visualized.
Unfortunately, it is not easy to incorporate the monotonicity requirement into
the existing GAMI models based on boosted trees, such as EBM (Lou et al. 2013)
and GAMI-Lin-T (Hu et al. 2022). This paper considers models of the form
$f(x)=\sum_{j,k}f_{j,k}(x_j, x_k)$ and develops monotone tree-based GAMI
models, called monotone GAMI-Tree, by adapting the XGBoost algorithm. It is
straightforward to fit a monotone model to $f(x)$ using the options in XGBoost.
However, the fitted model is still a black box. We take a different approach:
i) use a filtering technique to determine the important interactions, ii) fit a
monotone XGBoost algorithm with the selected interactions, and finally iii)
parse and purify the results to get a monotone GAMI model. Simulated datasets
are used to demonstrate the behaviors of mono-GAMI-Tree and EBM, both of which
use piecewise constant fits. Note that the monotonicity requirement is for the
full model. Under certain situations, the main effects will also be monotone.
But, as seen in the examples, the interactions will not be monotone.
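The "purify" operation in step iii) can be illustrated with a small sketch: given a fitted bivariate effect on a grid of bins, subtract its marginal means so the remaining interaction averages to zero along each axis, pushing the removed means into the main effects and intercept. The table, function names, and uniform bin weights below are illustrative assumptions, not the paper's actual implementation.

```python
# Hedged sketch of purifying a piecewise-constant bivariate effect table.
# f_jk is a hypothetical fitted interaction effect f_{j,k}(x_j, x_k) on a
# grid of bins; uniform bin weights are assumed for simplicity.

def purify(f_jk):
    """Center a bivariate effect table so the interaction has zero
    marginal means; the removed means become main-effect and intercept
    adjustments."""
    n_rows, n_cols = len(f_jk), len(f_jk[0])
    row_means = [sum(row) / n_cols for row in f_jk]
    col_means = [sum(f_jk[r][c] for r in range(n_rows)) / n_rows
                 for c in range(n_cols)]
    grand = sum(row_means) / n_rows
    # Pure interaction: subtract both marginals, add back the grand mean.
    interaction = [[f_jk[r][c] - row_means[r] - col_means[c] + grand
                    for c in range(n_cols)] for r in range(n_rows)]
    # Main-effect contributions absorbed out of the interaction table.
    main_j = [m - grand for m in row_means]
    main_k = [m - grand for m in col_means]
    return interaction, main_j, main_k, grand

table = [[1.0, 2.0], [3.0, 6.0]]
inter, mj, mk, g = purify(table)
```

After purification every row and column of `inter` sums to zero, and the original table is exactly recovered as `inter[r][c] + mj[r] + mk[c] + g`, which is what makes the decomposed main effects and interactions uniquely identifiable and plottable.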
Related papers
- PanGu-$\pi$: Enhancing Language Model Architectures via Nonlinearity Compensation [97.78045712375047]
We present a new efficient model architecture for large language models (LLMs).
We show that PanGu-$\pi$-7B can achieve performance comparable to benchmarks with about a 10% inference speed-up.
In addition, we have deployed PanGu-$\pi$-7B in the high-value domains of finance and law, developing an LLM named YunShan for practical application.
arXiv Detail & Related papers (2023-12-27T11:49:24Z)
- Single-Stage Visual Relationship Learning using Conditional Queries [60.90880759475021]
TraCQ is a new formulation for scene graph generation that avoids the multi-task learning problem and the entity pair distribution.
We employ a DETR-based encoder-decoder with conditional queries to significantly reduce the entity label space as well.
Experimental results show that TraCQ not only outperforms existing single-stage scene graph generation methods, it also beats many state-of-the-art two-stage methods on the Visual Genome dataset.
arXiv Detail & Related papers (2023-06-09T06:02:01Z)
- Mixtures of All Trees [28.972995038976745]
We propose a novel class of generative models called mixtures of all trees: that is, a mixture over all possible ($n^{n-2}$) tree-shaped graphical models over $n$ variables.
We show that it is possible to parameterize this Mixture of All Trees (MoAT) model compactly in a way that allows for tractable likelihood and optimization via gradient descent.
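The $n^{n-2}$ count above is Cayley's formula for labeled trees, which can be checked numerically via the matrix-tree theorem. This stdlib-only sketch illustrates that count and is unrelated to the MoAT parameterization itself.

```python
# Count spanning trees of the complete graph K_n via the matrix-tree
# theorem: delete one row/column of the graph Laplacian and take the
# determinant. For K_n this should equal Cayley's n^(n-2).
from fractions import Fraction

def count_spanning_trees_Kn(n):
    # Laplacian minor of K_n: (n-1) on the diagonal, -1 elsewhere,
    # of size (n-1) x (n-1).
    m = n - 1
    a = [[Fraction(n - 1) if i == j else Fraction(-1) for j in range(m)]
         for i in range(m)]
    # Exact determinant by Gaussian elimination over the rationals.
    det = Fraction(1)
    for col in range(m):
        pivot = next((r for r in range(col, m) if a[r][col] != 0), None)
        if pivot is None:
            return 0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        inv = 1 / a[col][col]
        for r in range(col + 1, m):
            factor = a[r][col] * inv
            for c in range(col, m):
                a[r][c] -= factor * a[col][c]
    return int(det)

print(count_spanning_trees_Kn(5))  # prints 125 == 5**3
```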
arXiv Detail & Related papers (2023-02-27T23:37:03Z)
- On marginal feature attributions of tree-based models [0.11184789007828977]
Local feature attributions based on marginal expectations, e.g. marginal Shapley, Owen or Banzhaf values, may be employed.
We present two (statistically similar) decision trees computing exactly the same function, for which the "path-dependent" TreeSHAP yields different rankings of features.
We exploit the symmetry to derive an explicit formula, with improved complexity and only in terms of the internal model parameters, for marginal Shapley (and Banzhaf and Owen) values of CatBoost models.
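As a concrete illustration of marginal (interventional) feature attributions, the brute-force sketch below computes exact Shapley values for a two-feature model by averaging outputs over a background sample. The toy model and background data are hypothetical, and this is the generic definition, not the paper's closed-form CatBoost result.

```python
# Brute-force marginal Shapley values for a model with two features.
# "Marginal" means absent features are replaced by draws from a background
# dataset (an interventional expectation). Toy model and data are illustrative.

def marginal_value(model, x, subset, background):
    """E[f] with features in `subset` fixed to x, the rest from background."""
    total = 0.0
    for b in background:
        z = [x[i] if i in subset else b[i] for i in range(len(x))]
        total += model(z)
    return total / len(background)

def shapley_two_features(model, x, background):
    v = {s: marginal_value(model, x, s, background)
         for s in [frozenset(), frozenset({0}), frozenset({1}),
                   frozenset({0, 1})]}
    # Exact Shapley formula for two players: average the marginal
    # contribution of each feature over both join orders.
    phi0 = 0.5 * (v[frozenset({0})] - v[frozenset()]) \
         + 0.5 * (v[frozenset({0, 1})] - v[frozenset({1})])
    phi1 = 0.5 * (v[frozenset({1})] - v[frozenset()]) \
         + 0.5 * (v[frozenset({0, 1})] - v[frozenset({0})])
    return phi0, phi1

# A tiny tree-like model: a single split on each feature.
model = lambda z: (2.0 if z[0] > 0 else 0.0) + (1.0 if z[1] > 0 else 0.0)
background = [[-1, -1], [1, 1], [-1, 1], [1, -1]]
phi0, phi1 = shapley_two_features(model, [1, 1], background)
```

By the efficiency property, `phi0 + phi1` equals the model output at `x` minus its average over the background, which is a quick sanity check for any Shapley implementation.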
arXiv Detail & Related papers (2023-02-16T17:18:03Z)
- Artificial Benchmark for Community Detection with Outliers (ABCD+o) [5.8010446129208155]
We extend the ABCD model to include potential outliers.
We perform some exploratory experiments on both the new ABCD+o model as well as a real-world network to show that outliers possess some desired, distinguishable properties.
arXiv Detail & Related papers (2023-01-13T20:14:44Z)
- Using Model-Based Trees with Boosting to Fit Low-Order Functional ANOVA Models [5.131758478675364]
Low-order functional ANOVA models have been rediscovered in the machine learning (ML) community under the guise of inherently interpretable machine learning.
We propose a new algorithm, called GAMI-Tree, that is similar to EBM, but has a number of features that lead to better performance.
We use simulated and real datasets to compare the performance and interpretability of GAMI-Tree with EBM and GAMI-Net.
arXiv Detail & Related papers (2022-07-14T14:23:14Z)
- On the Generative Utility of Cyclic Conditionals [103.1624347008042]
We study whether and how we can model a joint distribution $p(x,z)$ using two conditional models $p(x|z)$ and $q(z|x)$ that form a cycle.
We propose the CyGen framework for cyclic-conditional generative modeling, including methods to enforce compatibility and use the determined distribution to fit and generate data.
arXiv Detail & Related papers (2021-06-30T10:23:45Z)
- GraphFM: Graph Factorization Machines for Feature Interaction Modeling [27.307086868266012]
We propose a novel approach, Graph Factorization Machine (GraphFM), by naturally representing features in the graph structure.
In particular, we design a mechanism to select the beneficial feature interactions and formulate them as edges between features.
The proposed model integrates the interaction function of FM into the feature aggregation strategy of Graph Neural Network (GNN)
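The FM interaction function reused inside the GNN aggregation scores all pairwise feature interactions, and the standard algebraic identity lets it be computed in linear rather than quadratic time. The sketch below shows that identity with hypothetical embeddings; it is not GraphFM's actual aggregation code.

```python
# Factorization Machine second-order term: sum over pairs <v_i, v_j> x_i x_j,
# computed in O(n*k) via the identity
#   0.5 * sum_f [ (sum_i v_{i,f} x_i)^2 - sum_i (v_{i,f} x_i)^2 ].
# Embeddings and inputs below are illustrative only.

def fm_pairwise(x, v):
    """x: feature values; v: per-feature embedding vectors of length k."""
    k = len(v[0])
    out = 0.0
    for f in range(k):
        s = sum(v[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(len(x)))
        out += 0.5 * (s * s - s_sq)
    return out

def fm_pairwise_naive(x, v):
    """Direct O(n^2) double sum over feature pairs, for comparison."""
    n = len(x)
    return sum(
        sum(v[i][f] * v[j][f] for f in range(len(v[0]))) * x[i] * x[j]
        for i in range(n) for j in range(i + 1, n))

x = [1.0, 2.0, 0.5]
v = [[0.1, 0.3], [0.2, -0.1], [0.4, 0.2]]
```

Both functions return the same value on any input; the first form is the one that scales to many features.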
arXiv Detail & Related papers (2021-05-25T12:10:54Z)
- On the Discrepancy between Density Estimation and Sequence Generation [92.70116082182076]
Log-likelihood is highly correlated with BLEU when we consider models within the same family.
We observe no correlation between rankings of models across different families.
arXiv Detail & Related papers (2020-02-17T20:13:35Z)
- Revisiting Graph based Collaborative Filtering: A Linear Residual Graph Convolutional Network Approach [55.44107800525776]
Graph Convolutional Networks (GCNs) are state-of-the-art graph based representation learning models.
In this paper, we revisit GCN-based Collaborative Filtering (CF) for Recommender Systems (RS).
We show that removing non-linearities would enhance recommendation performance, consistent with the theories in simple graph convolutional networks.
We propose a residual network structure that is specifically designed for CF with user-item interaction modeling.
arXiv Detail & Related papers (2020-01-28T04:41:25Z)
- Particle-Gibbs Sampling For Bayesian Feature Allocation Models [77.57285768500225]
Most widely used MCMC strategies rely on an element-wise Gibbs update of the feature allocation matrix.
We have developed a Gibbs sampler that can update an entire row of the feature allocation matrix in a single move.
This sampler is impractical for models with a large number of features, as the computational complexity scales exponentially in the number of features.
We develop a Particle Gibbs sampler that targets the same distribution as the row-wise Gibbs updates, but has computational complexity that only grows linearly in the number of features.
arXiv Detail & Related papers (2020-01-25T22:11:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.