Preferential Mixture-of-Experts: Interpretable Models that Rely on Human
Expertise as much as Possible
- URL: http://arxiv.org/abs/2101.05360v1
- Date: Wed, 13 Jan 2021 21:57:00 GMT
- Title: Preferential Mixture-of-Experts: Interpretable Models that Rely on Human
Expertise as much as Possible
- Authors: Melanie F. Pradier, Javier Zazo, Sonali Parbhoo, Roy H. Perlis,
Maurizio Zazzi, Finale Doshi-Velez
- Abstract summary: We propose Preferential MoE, a novel human-ML mixture-of-experts model.
Our model exhibits an interpretable gating function that provides information on when human rules should be followed or avoided.
We demonstrate the utility of Preferential MoE on two clinical applications for the treatment of Human Immunodeficiency Virus (HIV) and management of Major Depressive Disorder (MDD).
- Score: 29.097034423502368
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose Preferential MoE, a novel human-ML mixture-of-experts model that
augments human expertise in decision making with a data-based classifier only
when necessary for predictive performance. Our model exhibits an interpretable
gating function that provides information on when human rules should be
followed or avoided. The gating function is maximized for using human-based
rules, and classification errors are minimized. We propose solving a coupled
multi-objective problem with convex subproblems. We develop approximate
algorithms and study their performance and convergence. Finally, we demonstrate
the utility of Preferential MoE on two clinical applications for the treatment
of Human Immunodeficiency Virus (HIV) and management of Major Depressive
Disorder (MDD).
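As a rough illustration of the idea in the abstract, the sketch below wires a rule-based human expert and a data-driven classifier together through an interpretable (linear) gate that is nudged toward following the human rule whenever that does not hurt accuracy. The synthetic data, the single-threshold rule, the logistic-regression experts, and the preference weight are all illustrative assumptions; this is not the authors' coupled multi-objective algorithm or their clinical datasets.

```python
# Minimal sketch of a "preferential" mixture-of-experts gate (illustrative only).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic binary-classification data (placeholder for a clinical cohort).
X = rng.normal(size=(1000, 4))
y = (X[:, 0] + 0.5 * X[:, 1] + 0.3 * rng.normal(size=1000) > 0).astype(int)

def human_rule(X):
    """Hypothetical human expert: a single-feature threshold rule."""
    return (X[:, 0] > 0).astype(int)

# Data-driven expert.
ml_expert = LogisticRegression().fit(X, y)

# Gate: an interpretable linear model trained to predict where the human rule
# is already correct; the preference weight biases it toward using the rule.
rule_correct = (human_rule(X) == y).astype(int)
preference_weight = 2.0  # assumption: upweight "follow the human rule" cases
gate = LogisticRegression().fit(
    X, rule_correct,
    sample_weight=np.where(rule_correct == 1, preference_weight, 1.0),
)

def preferential_moe_predict(X):
    """Follow the human rule where the gate trusts it; otherwise use the ML expert."""
    use_rule = gate.predict(X).astype(bool)
    preds = np.where(use_rule, human_rule(X), ml_expert.predict(X))
    return preds, use_rule

preds, use_rule = preferential_moe_predict(X)
print(f"accuracy: {(preds == y).mean():.3f}, "
      f"fraction routed to human rule: {use_rule.mean():.3f}")
```

Because the gate is itself a linear model over the input features, its coefficients indicate when the human rule is followed or overridden, which is the interpretability property the abstract emphasizes.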
Related papers
- A Gradient Analysis Framework for Rewarding Good and Penalizing Bad Examples in Language Models [63.949883238901414]
We present a unique angle of gradient analysis of loss functions that simultaneously reward good examples and penalize bad ones in LMs.
We find that ExMATE serves as a superior surrogate for MLE, and that combining DPO with ExMATE instead of MLE further enhances both the statistical (5-7%) and generative (+18% win rate) performance.
arXiv Detail & Related papers (2024-08-29T17:46:18Z)
- The Relevance Feature and Vector Machine for health applications [0.11538034264098687]
This paper presents a novel model that addresses the challenges of the fat-data problem when dealing with clinical prospective studies.
The model capabilities are tested against state-of-the-art models in several medical datasets with fat-data problems.
arXiv Detail & Related papers (2024-02-11T01:21:56Z)
- Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models [115.501751261878]
Fine-tuning language models (LMs) on human-generated data remains a prevalent practice.
We investigate whether we can go beyond human data on tasks where we have access to scalar feedback.
We find that ReST$^{EM}$ scales favorably with model size and significantly surpasses fine-tuning only on human data.
arXiv Detail & Related papers (2023-12-11T18:17:43Z)
- Modeling Boundedly Rational Agents with Latent Inference Budgets [56.24971011281947]
We introduce a latent inference budget model (L-IBM) that models agents' computational constraints explicitly.
L-IBMs make it possible to learn agent models using data from diverse populations of suboptimal actors.
We show that L-IBMs match or outperform Boltzmann models of decision-making under uncertainty.
arXiv Detail & Related papers (2023-12-07T03:55:51Z)
- Improving Normative Modeling for Multi-modal Neuroimaging Data using mixture-of-product-of-experts variational autoencoders [0.0]
Existing variational autoencoder (VAE)-based normative models aggregate information from multiple modalities by estimating the product or average of unimodal latent posteriors.
This can often lead to uninformative joint latent distributions, which affect the estimation of subject-level deviations.
We adopted the Mixture-of-Product-of-Experts technique which allows better modelling of the joint latent posterior.
arXiv Detail & Related papers (2023-12-02T01:17:01Z)
- Towards Better Modeling with Missing Data: A Contrastive Learning-based Visual Analytics Perspective [7.577040836988683]
Missing data can pose a challenge for machine learning (ML) modeling.
Current approaches are categorized into feature imputation and label prediction.
This study proposes a Contrastive Learning framework to model observed data with missing values.
arXiv Detail & Related papers (2023-09-18T13:16:24Z)
- Causal Inference via Nonlinear Variable Decorrelation for Healthcare Applications [60.26261850082012]
We introduce a novel method with a variable decorrelation regularizer to handle both linear and nonlinear confounding.
We employ association rules as new representations using association rule mining based on the original features to increase model interpretability.
arXiv Detail & Related papers (2022-09-29T17:44:14Z)
- A comparison of approaches to improve worst-case predictive model performance over patient subpopulations [14.175321968797252]
Predictive models for clinical outcomes that are accurate on average in a patient population may underperform drastically for some subpopulations.
We identify approaches for model development and selection that consistently improve disaggregated and worst-case performance over subpopulations.
We find that, with relatively few exceptions, no approach performs better than standard learning procedures for each patient subpopulation examined.
arXiv Detail & Related papers (2021-08-27T13:10:00Z)
- A Twin Neural Model for Uplift [59.38563723706796]
Uplift is a particular case of conditional treatment effect modeling.
We propose a new loss function defined by leveraging a connection with the Bayesian interpretation of the relative risk.
We show our proposed method is competitive with the state-of-the-art in a simulation setting and on real data from large-scale randomized experiments.
arXiv Detail & Related papers (2021-05-11T16:02:39Z)
- Adversarial Sample Enhanced Domain Adaptation: A Case Study on Predictive Modeling with Electronic Health Records [57.75125067744978]
We propose a data augmentation method to facilitate domain adaptation.
Adversarially generated samples are used during domain adaptation.
Results confirm the effectiveness of our method and its generality across different tasks.
arXiv Detail & Related papers (2021-01-13T03:20:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.