Related papers: Rethinking Log Odds: Linear Probability Modelling and Expert Advice in Interpretable Machine Learning

URL: http://arxiv.org/abs/2211.06360v1
Date: Fri, 11 Nov 2022 17:21:57 GMT
Title: Rethinking Log Odds: Linear Probability Modelling and Expert Advice in Interpretable Machine Learning
Authors: Danial Dervovic and Nicolas Marchesotti and Freddy Lecue and Daniele Magazzeni
Abstract summary: We introduce a family of interpretable machine learning models, with two broad additions: Linearised Additive Models (LAMs) and SubscaleHedge. LAMs replace the ubiquitous logistic link function in General Additive Models (GAMs); and SubscaleHedge is an expert advice algorithm for combining base models trained on subsets of features called subscales.
Score: 8.831954614241234
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We introduce a family of interpretable machine learning models, with two broad additions: Linearised Additive Models (LAMs) which replace the ubiquitous logistic link function in General Additive Models (GAMs); and SubscaleHedge, an expert advice algorithm for combining base models trained on subsets of features called subscales. LAMs can augment any additive binary classification model equipped with a sigmoid link function. Moreover, they afford direct global and local attributions of additive components to the model output in probability space. We argue that LAMs and SubscaleHedge improve the interpretability of their base algorithms. Using rigorous null-hypothesis significance testing on a broad suite of financial modelling data, we show that our algorithms do not suffer from large performance penalties in terms of ROC-AUC and calibration.
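The abstract describes SubscaleHedge only as an expert advice algorithm that combines base models trained on feature subsets. As a rough illustration of that family of methods, the sketch below implements the classic Hedge (multiplicative weights) update in plain Python. It is not the paper's algorithm: the function names, the learning rate `eta`, the absolute-error losses, and the toy numbers are all assumptions made for illustration.

```python
import math


def hedge_weights(expert_losses, eta=0.5):
    """Run the classic Hedge (multiplicative weights) update.

    expert_losses: list of rounds, each a list of per-expert losses in [0, 1].
    Returns the final normalised weight of each expert.
    """
    n_experts = len(expert_losses[0])
    weights = [1.0 / n_experts] * n_experts
    for losses in expert_losses:
        # Down-weight each expert exponentially in its loss for this round.
        weights = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return weights


def combine(weights, expert_probs):
    """Weighted average of the experts' predicted probabilities."""
    return sum(w * p for w, p in zip(weights, expert_probs))


# Hypothetical toy data: three base models (one per "subscale"),
# with made-up absolute-error losses over four rounds.
losses = [
    [0.1, 0.5, 0.9],
    [0.2, 0.4, 0.8],
    [0.1, 0.6, 0.7],
    [0.0, 0.5, 0.9],
]
weights = hedge_weights(losses, eta=1.0)
prediction = combine(weights, [0.8, 0.6, 0.3])
```

Because the combined prediction is a convex combination of the base models' probabilities, each base model's share of the output can be read directly from its weight, which is in the spirit of the probability-space attributions the abstract mentions.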

Related papers

Self-Boost via Optimal Retraining: An Analysis via Approximate Message Passing [58.52119063742121]
Retraining a model using its own predictions together with the original, potentially noisy labels is a well-known strategy for improving the model performance. This paper addresses the question of how to optimally combine the model's predictions and the provided labels. Our main contribution is the derivation of the Bayes optimal aggregator function to combine the current model's predictions and the given labels.
arXiv Detail & Related papers (2025-05-21T07:16:44Z)
Wasserstein proximal operators describe score-based generative models and resolve memorization [12.321631823103894]
We first formulate SGMs in terms of the Wasserstein proximal operator (WPO). We show that WPO describes the inductive bias of diffusion and score-based models. We present an interpretable kernel-based model for the score function which dramatically improves the performance of SGMs.
arXiv Detail & Related papers (2024-02-09T03:33:13Z)
The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute [66.84421705029624]
We introduce an experimental protocol that enables model comparisons based on equivalent compute, measured in accelerator hours. We pre-process an existing large, diverse, and high-quality dataset of books that surpasses existing academic benchmarks in quality, diversity, and document length. This work also provides two baseline models: a feed-forward model derived from the GPT-2 architecture and a recurrent model in the form of a novel LSTM with ten-fold throughput.
arXiv Detail & Related papers (2023-09-20T10:31:17Z)
A Hybrid of Generative and Discriminative Models Based on the Gaussian-coupled Softmax Layer [5.33024001730262]
We propose a method to train a hybrid of discriminative and generative models in a single neural network. We demonstrate that the proposed hybrid model can be applied to semi-supervised learning and confidence calibration.
arXiv Detail & Related papers (2023-05-10T05:48:22Z)
How robust are pre-trained models to distribution shift? [82.08946007821184]
We show how spurious correlations affect the performance of popular self-supervised learning (SSL) and auto-encoder based models (AE). We develop a novel evaluation scheme with the linear head trained on out-of-distribution (OOD) data, to isolate the performance of the pre-trained models from a potential bias of the linear head used for evaluation.
arXiv Detail & Related papers (2022-06-17T16:18:28Z)
GAM(e) changer or not? An evaluation of interpretable machine learning models based on additive model constraints [5.783415024516947]
This paper investigates a series of intrinsically interpretable machine learning models. We evaluate the prediction qualities of five GAMs as compared to six traditional ML models.
arXiv Detail & Related papers (2022-04-19T20:37:31Z)
Low-Rank Constraints for Fast Inference in Structured Models [110.38427965904266]
This work demonstrates a simple approach to reduce the computational and memory complexity of a large class of structured models. Experiments with neural parameterized structured models for language modeling, polyphonic music modeling, unsupervised grammar induction, and video modeling show that our approach matches the accuracy of standard models at large state spaces.
arXiv Detail & Related papers (2022-01-08T00:47:50Z)
Distilling Interpretable Models into Human-Readable Code [71.11328360614479]
Human-readability is an important and desirable standard for machine-learned model interpretability. We propose to train interpretable models using conventional methods, and then distill them into concise, human-readable code. We describe a piecewise-linear curve-fitting algorithm that produces high-quality results efficiently and reliably across a broad range of use cases.
arXiv Detail & Related papers (2021-01-21T01:46:36Z)
Surrogate Locally-Interpretable Models with Supervised Machine Learning Algorithms [8.949704905866888]
Supervised Machine Learning algorithms have become popular in recent years due to their superior predictive performance over traditional statistical methods. While the main focus is on interpretability, the resulting surrogate model also has reasonably good predictive performance.
arXiv Detail & Related papers (2020-07-28T23:46:16Z)
Model Fusion with Kullback-Leibler Divergence [58.20269014662046]
We propose a method to fuse posterior distributions learned from heterogeneous datasets. Our algorithm relies on a mean field assumption for both the fused model and the individual dataset posteriors.
arXiv Detail & Related papers (2020-07-13T03:27:45Z)
Interpretable Learning-to-Rank with Generalized Additive Models [78.42800966500374]
Interpretability of learning-to-rank models is a crucial yet relatively under-examined research area. Recent progress on interpretable ranking models largely focuses on generating post-hoc explanations for existing black-box ranking models. We lay the groundwork for intrinsically interpretable learning-to-rank by introducing generalized additive models (GAMs) into ranking tasks.
arXiv Detail & Related papers (2020-05-06T01:51:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.