Lifting Interpretability-Performance Trade-off via Automated Feature
Engineering
- URL: http://arxiv.org/abs/2002.04267v1
- Date: Tue, 11 Feb 2020 09:16:45 GMT
- Title: Lifting Interpretability-Performance Trade-off via Automated Feature
Engineering
- Authors: Alicja Gosiewska and Przemyslaw Biecek
- Abstract summary: Complex black-box predictive models may have high performance, but lack of interpretability causes problems.
We propose a method that uses elastic black-boxes as surrogate models to create simpler, less opaque, yet still accurate and interpretable glass-box models.
- Score: 5.802346990263708
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Complex black-box predictive models may have high performance, but their lack of
interpretability causes problems such as lack of trust, lack of stability, and
sensitivity to concept drift. On the other hand, achieving satisfactory
accuracy with interpretable models requires more time-consuming work related to
feature engineering. Can we train interpretable and accurate models without
time-consuming feature engineering? We propose a method that uses elastic
black-boxes as surrogate models to create simpler, less opaque, yet still
accurate and interpretable glass-box models. New models are created on newly
engineered features extracted with the help of a surrogate model. We support
the analysis with a large-scale benchmark on several tabular data sets from the
OpenML database. There are two results: 1) extracting information from complex
models may improve the performance of linear models, and 2) the benchmark
questions a common myth that complex machine learning models outperform linear
models.
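Read as a recipe, the abstract describes three steps: train a flexible black-box surrogate, engineer new features from the surrogate's response profiles, and fit a glass-box (here, linear) model on those features. The sketch below illustrates that pipeline with scikit-learn; the `changepoint_edges` helper and its binning heuristic are assumptions made for this example, not the authors' exact implementation.

```python
# Minimal sketch of surrogate-assisted feature engineering (illustrative only).
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

X, y = make_friedman1(n_samples=1000, noise=1.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 1) Train an elastic black-box surrogate.
surrogate = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)

def changepoint_edges(model, X, feature, n_grid=50, n_bins=4):
    """Place bin edges where the surrogate's averaged response jumps the most."""
    grid = np.linspace(X[:, feature].min(), X[:, feature].max(), n_grid)
    profile = []
    for value in grid:                       # crude partial-dependence profile
        X_tmp = X.copy()
        X_tmp[:, feature] = value
        profile.append(model.predict(X_tmp).mean())
    jumps = np.abs(np.diff(profile))
    return np.sort(grid[1:][np.argsort(jumps)[-(n_bins - 1):]])

edges = [changepoint_edges(surrogate, X_train, j) for j in range(X_train.shape[1])]

def engineer(X):
    """2) Discretise every feature according to the surrogate's profile."""
    return np.column_stack([np.digitize(X[:, j], edges[j]) for j in range(X.shape[1])])

encoder = OneHotEncoder(handle_unknown="ignore").fit(engineer(X_train))

# 3) Fit an interpretable glass-box model on the engineered features.
glass_box = LinearRegression().fit(encoder.transform(engineer(X_train)), y_train)
baseline = LinearRegression().fit(X_train, y_train)
print("plain linear R^2:    ", baseline.score(X_test, y_test))
print("glass-box linear R^2:", glass_box.score(encoder.transform(engineer(X_test)), y_test))
```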
Related papers
- Are Linear Regression Models White Box and Interpretable? [0.0]
Explainable artificial intelligence (XAI) is a set of tools and algorithms applied to, or embedded in, machine learning models in order to understand and interpret them.
Simple models, including linear regression, are easy to implement, have low computational complexity, and make it easy to visualize the output.
arXiv Detail & Related papers (2024-07-16T21:05:51Z)
- fairml: A Statistician's Take on Fair Machine Learning Modelling [0.0]
We describe the fairml package which implements our previous work (Scutari, Panero, and Proissl 2022) and related models in the literature.
fairml is designed around classical statistical models and penalised regression results.
The constraint used to enforce fairness is orthogonal to model estimation, making it possible to mix-and-match the desired model family and fairness definition for each application.
arXiv Detail & Related papers (2023-05-03T09:59:53Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
Often the fine-tuned models are readily available but their training data is not, which creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
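As a toy illustration of merging models in their parameter space, the sketch below averages the parameters of two models with identical architectures; the paper's actual merging scheme is more sophisticated than a plain average, so treat this only as the simplest instance of the idea.

```python
# Illustrative parameter-space merge: a (weighted) average of matching tensors.
import numpy as np

def merge_weights(state_dicts, coeffs=None):
    """Merge models that share an architecture by averaging each parameter."""
    if coeffs is None:
        coeffs = [1.0 / len(state_dicts)] * len(state_dicts)
    return {
        name: sum(c * sd[name] for c, sd in zip(coeffs, state_dicts))
        for name in state_dicts[0]
    }

# Two toy "fine-tuned" models with identical parameter names and shapes.
model_a = {"linear.weight": np.array([[1.0, 2.0]]), "linear.bias": np.array([0.5])}
model_b = {"linear.weight": np.array([[3.0, 0.0]]), "linear.bias": np.array([-0.5])}

fused = merge_weights([model_a, model_b])
print(fused["linear.weight"])  # [[2. 1.]]
print(fused["linear.bias"])    # [0.]
```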
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- Synthetic Model Combination: An Instance-wise Approach to Unsupervised Ensemble Learning [92.89846887298852]
Consider making a prediction over new test data without any opportunity to learn from a training set of labelled data.
Instead, we are given access to a set of expert models and their predictions, alongside some limited information about the dataset used to train them.
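One way to picture instance-wise, unsupervised ensembling: weight each expert per test point according to how close the point lies to the limited information available about that expert's training data. The distance-based softmax weighting below is an assumption for illustration, not the paper's exact Synthetic Model Combination procedure.

```python
# Instance-wise combination of expert models without labels (illustrative).
import numpy as np

# Limited information about each expert's training data: one centre per expert.
expert_centres = np.array([[-2.0], [0.0], [2.0]])
experts = [lambda x, s=s: np.sin(x + s) for s in (0.0, 0.5, 1.0)]  # toy experts

def instance_wise_predict(x, temperature=1.0):
    """Softmax-weight the experts by proximity of x to their training centres."""
    dists = np.linalg.norm(x - expert_centres, axis=1)
    weights = np.exp(-dists / temperature)
    weights /= weights.sum()
    preds = np.array([float(f(x).ravel()[0]) for f in experts])
    return float(weights @ preds)

print(instance_wise_predict(np.array([[1.5]])))
```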
arXiv Detail & Related papers (2022-10-11T10:20:31Z)
- LocalGLMnet: interpretable deep learning for tabular data [0.0]
We propose a new network architecture that shares similar features with generalized linear models.
Our approach provides an additive decomposition in the spirit of Shapley values and integrated gradients.
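The additive structure mentioned here can be written as beta_0 + sum_j beta_j(x) * x_j, with input-dependent coefficients beta_j(x). The sketch below uses a tiny untrained network as a placeholder for beta(x) purely to show how the per-feature contributions decompose; it is not the LocalGLMnet implementation.

```python
# LocalGLM-style additive decomposition with a placeholder (untrained) network.
import numpy as np

rng = np.random.default_rng(0)
d = 4                                           # number of features
W1, b1 = rng.normal(size=(8, d)), np.zeros(8)   # tiny MLP producing beta(x)
W2, b2 = rng.normal(size=(d, 8)), np.zeros(d)
beta0 = 0.1

def local_coefficients(x):
    """beta(x): one input-dependent coefficient per feature."""
    return W2 @ np.tanh(W1 @ x + b1) + b2

x = rng.normal(size=d)
contributions = local_coefficients(x) * x       # additive, per-feature attributions
prediction = beta0 + contributions.sum()

for j, c in enumerate(contributions):
    print(f"feature {j}: contribution {c:+.3f}")
print(f"prediction: {prediction:.3f}")
```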
arXiv Detail & Related papers (2021-07-23T07:38:33Z)
- Model-agnostic multi-objective approach for the evolutionary discovery of mathematical models [55.41644538483948]
In modern data science, it is often more interesting to understand the properties of a model, and which of its parts could be replaced to obtain better results, than to chase predictive accuracy alone.
We use multi-objective evolutionary optimization for composite data-driven model learning to obtain the algorithm's desired properties.
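A small sketch of the multi-objective selection step: among candidate composite models scored on, say, error and complexity, keep only the Pareto-optimal ones. A full evolutionary loop would alternate such selection with mutation and crossover of model structures; the two objectives here are assumed for illustration.

```python
# Pareto-front selection over (error, complexity) pairs; lower is better for both.
from typing import List, Tuple

def pareto_front(candidates: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Keep candidates that no other candidate dominates in every objective."""
    return [
        a for a in candidates
        if not any(b[0] <= a[0] and b[1] <= a[1] and b != a for b in candidates)
    ]

models = [(0.10, 12), (0.12, 5), (0.30, 2), (0.11, 30), (0.09, 40)]
print(pareto_front(models))   # (0.11, 30) is dominated by (0.10, 12) and drops out
```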
arXiv Detail & Related papers (2021-07-07T11:17:09Z)
- Sufficiently Accurate Model Learning for Planning [119.80502738709937]
This paper introduces the constrained Sufficiently Accurate model learning approach.
It provides examples of such constrained learning problems and presents a theorem on how close some approximate solutions can be.
The approximate solution quality will depend on the function parameterization, loss and constraint function smoothness, and the number of samples in model learning.
arXiv Detail & Related papers (2021-02-11T16:27:31Z)
- Distilling Interpretable Models into Human-Readable Code [71.11328360614479]
Human-readability is an important and desirable standard for machine-learned model interpretability.
We propose to train interpretable models using conventional methods, and then distill them into concise, human-readable code.
We describe a piecewise-linear curve-fitting algorithm that produces high-quality results efficiently and reliably across a broad range of use cases.
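To make the distillation idea concrete, the sketch below fits a piecewise-linear curve (a hinge basis with fixed knots, solved by least squares) to a model's one-dimensional response and emits it as a short, human-readable Python function. The hinge fit is a stand-in assumption, not the paper's curve-fitting algorithm.

```python
# Distil a 1-D response into readable code via a piecewise-linear fit (sketch).
import numpy as np

def black_box(x):                      # stand-in for a trained model's response
    return np.sin(x) + 0.1 * x

x = np.linspace(0.0, 6.0, 200)
y = black_box(x)

knots = np.quantile(x, [0.25, 0.5, 0.75])
design = np.column_stack([np.ones_like(x), x] +
                         [np.maximum(0.0, x - k) for k in knots])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)

# Emit the fitted piecewise-linear curve as a short Python function.
terms = [f"{coef[0]:.3f}", f"{coef[1]:.3f} * x"]
terms += [f"{c:.3f} * max(0.0, x - {k:.3f})" for c, k in zip(coef[2:], knots)]
print("def curve(x):\n    return " + " + ".join(terms))
```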
arXiv Detail & Related papers (2021-01-21T01:46:36Z)
- A Simple and Interpretable Predictive Model for Healthcare [0.0]
Deep learning models are currently dominating most state-of-the-art solutions for disease prediction.
These deep learning models, with trainable parameters running into millions, require huge amounts of compute and data to train and deploy.
We develop a simpler yet interpretable non-deep learning based model for application to EHR data.
arXiv Detail & Related papers (2020-07-27T08:13:37Z)
- What shapes feature representations? Exploring datasets, architectures, and training [14.794135558227682]
In naturalistic learning problems, a model's input contains a wide range of features, some useful for the task at hand, and others not.
These questions are important for understanding the basis of models' decisions.
We study these questions using synthetic datasets in which the task-relevance of input features can be controlled directly.
arXiv Detail & Related papers (2020-06-22T17:02:25Z)
- Hybrid modeling: Applications in real-time diagnosis [64.5040763067757]
We outline a novel hybrid modeling approach that combines machine-learning-inspired models and physics-based models.
We are using such models for real-time diagnosis applications.
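One common way to combine the two model types is to let the physics-based model produce the baseline prediction and train a machine-learned model on its residuals; the sketch below shows that pattern with assumed toy dynamics and is not the paper's specific architecture.

```python
# Hybrid model: physics-based baseline plus a learned residual correction.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

def physics_model(u):
    """Idealised first-principles response (toy example)."""
    return 2.0 * u

# Observations include an effect the physics model ignores (a nonlinearity).
u = rng.uniform(0.0, 5.0, size=500).reshape(-1, 1)
observed = physics_model(u.ravel()) + 0.8 * np.sin(3.0 * u.ravel()) \
           + rng.normal(0.0, 0.05, size=500)

residuals = observed - physics_model(u.ravel())
correction = RandomForestRegressor(random_state=0).fit(u, residuals)

def hybrid_predict(u_new):
    u_new = np.asarray(u_new, dtype=float).reshape(-1, 1)
    return physics_model(u_new.ravel()) + correction.predict(u_new)

print(hybrid_predict([1.0, 2.5]))
```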
arXiv Detail & Related papers (2020-03-04T00:44:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.