Toward Transparent Sequence Models with Model-Based Tree Markov Model
- URL: http://arxiv.org/abs/2307.15367v2
- Date: Mon, 30 Oct 2023 03:10:26 GMT
- Title: Toward Transparent Sequence Models with Model-Based Tree Markov Model
- Authors: Chan Hsu, Wei-Chun Huang, Jun-Ting Wu, Chih-Yuan Li, Yihuang Kang
- Abstract summary: We introduce the Model-Based tree Hidden Semi-Markov Model (MOB-HSMM), an inherently interpretable model aimed at detecting high-mortality-risk events.
The model leverages knowledge distilled from Deep Neural Networks (DNNs) to enhance predictive performance while offering clear explanations.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this study, we address the interpretability problem of complex,
black-box machine learning models applied to sequence data. We introduce the
Model-Based tree Hidden Semi-Markov Model (MOB-HSMM), an inherently
interpretable model aimed at detecting high-mortality-risk events and
discovering hidden patterns associated with mortality risk in Intensive Care
Units (ICUs). The model leverages knowledge distilled from Deep Neural
Networks (DNNs) to enhance predictive performance while offering clear
explanations. Our experimental results show that Model-Based trees (MOB
trees) perform better when an LSTM is first used to learn the sequential
patterns, which are then transferred to the MOB trees. Integrating the MOB
trees with the Hidden Semi-Markov Model (HSMM) in MOB-HSMM makes it possible
to uncover potential, explainable sequences from the available information.
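To make the distillation step concrete, here is a minimal, hypothetical sketch: an LSTM teacher learns from toy ICU-like sequences, and an interpretable tree is fit to its soft risk predictions. scikit-learn's `DecisionTreeRegressor` stands in for a true MOB tree (which fits a parametric model in each leaf); the data, names, and hyperparameters are illustrative assumptions, not the authors' code.

```python
# Hedged sketch of LSTM-to-tree knowledge distillation; not the paper's code.
import numpy as np
import torch
import torch.nn as nn
from sklearn.tree import DecisionTreeRegressor

# Toy ICU-like data: 500 stays, 24 time steps, 10 vitals/labs per step.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 24, 10)).astype(np.float32)
y = (X[:, -6:, 0].mean(axis=1) > 0).astype(np.float32)  # synthetic risk label

class RiskLSTM(nn.Module):
    def __init__(self, n_features, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):
        out, _ = self.lstm(x)
        return torch.sigmoid(self.head(out[:, -1]))  # risk at final step

teacher = RiskLSTM(n_features=10)
opt = torch.optim.Adam(teacher.parameters(), lr=1e-2)
xb, yb = torch.from_numpy(X), torch.from_numpy(y)
for _ in range(200):  # short full-batch training loop for the sketch
    opt.zero_grad()
    loss = nn.functional.binary_cross_entropy(teacher(xb).squeeze(1), yb)
    loss.backward()
    opt.step()

# Distillation: the tree regresses on the LSTM's soft predictions, using
# per-stay summary features so its decision rules stay human-readable.
soft = teacher(xb).squeeze(1).detach().numpy()
summary = X.mean(axis=1)                      # one feature row per stay
student = DecisionTreeRegressor(max_depth=3).fit(summary, soft)
```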
Related papers
- HMM-LSTM Fusion Model for Economic Forecasting (arXiv, 2025-01-01)
This paper explores the application of Hidden Markov Models (HMMs) and Long Short-Term Memory (LSTM) neural networks to economic forecasting.
It proposes an approach that integrates HMM-derived hidden states and state means as additional features for LSTM modeling.
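A hedged sketch of that fusion idea, assuming hmmlearn for the HMM and PyTorch for the LSTM; the series, feature choices, and hyperparameters are illustrative only.

```python
# Fit an HMM to a univariate economic series, then append each step's
# inferred hidden state and regime mean as extra LSTM input features.
import numpy as np
import torch
import torch.nn as nn
from hmmlearn.hmm import GaussianHMM  # assumes hmmlearn is installed

rng = np.random.default_rng(1)
series = np.cumsum(rng.normal(size=400)).reshape(-1, 1)  # toy GDP-like series

hmm = GaussianHMM(n_components=3, covariance_type="diag", n_iter=50)
hmm.fit(series)
states = hmm.predict(series)                 # (T,) hidden regime per step
state_means = hmm.means_[states]             # (T, 1) mean of active regime

# Augmented features: [observation, hidden state id, regime mean].
feats = np.hstack([series, states[:, None].astype(float), state_means])
feats = torch.from_numpy(feats).float().unsqueeze(0)  # (1, T, 3)

lstm = nn.LSTM(input_size=3, hidden_size=16, batch_first=True)
head = nn.Linear(16, 1)
out, _ = lstm(feats)
forecast = head(out[:, -1])                  # one-step-ahead forecast sketch
```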
- Malware Classification using a Hybrid Hidden Markov Model-Convolutional Neural Network (arXiv, 2024-12-25)
We present a novel approach based on a hybrid architecture that combines features extracted with a Hidden Markov Model (HMM) and a Convolutional Neural Network (CNN).
We demonstrate the effectiveness of our approach on the popular Malicia dataset, obtaining superior performance compared to other machine learning methods.
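The summary does not say which HMM features feed the CNN; one plausible reading, sketched under that assumption, is to pass per-step HMM state posteriors of an opcode sequence to a small 1-D CNN.

```python
# Hypothetical HMM-CNN hybrid: HMM state posteriors become CNN input channels.
import numpy as np
import torch
import torch.nn as nn
from hmmlearn.hmm import CategoricalHMM  # discrete-emission HMM, hmmlearn>=0.3

rng = np.random.default_rng(2)
opcodes = rng.integers(0, 30, size=(200, 1))  # one toy opcode sequence

hmm = CategoricalHMM(n_components=4, n_iter=20, random_state=0)
hmm.fit(opcodes)
posterior = hmm.predict_proba(opcodes)      # (200, 4) state posteriors

x = torch.from_numpy(posterior.T).float().unsqueeze(0)  # (1, 4 channels, 200)
cnn = nn.Sequential(
    nn.Conv1d(4, 8, kernel_size=5, padding=2), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(8, 2),                        # benign vs. malware logits
)
logits = cnn(x)
```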
- Supervised Score-Based Modeling by Gradient Boosting (arXiv, 2024-11-02)
We propose a Supervised Score-based Model (SSM), which can be viewed as a gradient boosting algorithm combined with score matching.
We provide a theoretical analysis of learning and sampling for SSM to balance inference time and prediction accuracy.
Our model outperforms existing models in both accuracy and inference time.
- SynthTree: Co-supervised Local Model Synthesis for Explainable Prediction (arXiv, 2024-06-16)
We propose a novel method to enhance explainability with minimal accuracy loss.
We have developed novel methods for estimating nodes by leveraging AI techniques.
Our findings highlight the critical role that statistical methodologies can play in advancing explainable AI.
- State Space Models as Foundation Models: A Control Theoretic Overview (arXiv, 2024-03-25)
In recent years, there has been a growing interest in integrating linear state-space models (SSMs) into deep neural network architectures.
This paper is intended as a gentle introduction to SSM-based architectures for control theorists.
It provides a systematic review of the most successful SSM proposals and highlights their main features from a control theoretic perspective.
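For readers new to the topic, the discrete-time linear SSM the survey builds on can be stated in a few lines of NumPy; the matrices below are arbitrary toy values.

```python
# A linear state-space model in its basic discrete-time form:
#   x[k+1] = A x[k] + B u[k],   y[k] = C x[k]
import numpy as np

A = np.array([[0.9, 0.1], [0.0, 0.8]])  # state transition
B = np.array([[0.0], [1.0]])            # input map
C = np.array([[1.0, 0.0]])              # readout

def ssm_scan(u, x0=None):
    """Run the SSM over an input sequence u of shape (T, 1)."""
    x = np.zeros((A.shape[0], 1)) if x0 is None else x0
    ys = []
    for u_k in u:
        ys.append((C @ x).item())
        x = A @ x + B @ u_k.reshape(-1, 1)
    return np.array(ys)

y = ssm_scan(np.ones((10, 1)))  # step response of the toy system
```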
- Leveraging Model-based Trees as Interpretable Surrogate Models for Model Distillation (arXiv, 2023-10-04)
Surrogate models play a crucial role in retrospectively interpreting complex and powerful black-box machine learning models.
This paper focuses on using model-based trees as surrogate models which partition the feature space into interpretable regions via decision rules.
Four model-based tree algorithms, namely SLIM, GUIDE, MOB, and CTree, are compared regarding their ability to generate such surrogate models.
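A minimal sketch of that surrogate pattern, assuming stock scikit-learn parts: a shallow tree partitions the feature space, and a separate linear model is fit to the black box's predictions in each region. The SLIM, GUIDE, MOB, and CTree algorithms differ in how they choose splits; this shows only the general idea.

```python
# Model-based-tree-style surrogate: tree partition + per-leaf linear models.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(3)
X = rng.normal(size=(1000, 5))
y = np.sin(X[:, 0]) + X[:, 1] * X[:, 2] + rng.normal(scale=0.1, size=1000)

black_box = GradientBoostingRegressor().fit(X, y)
y_bb = black_box.predict(X)                 # surrogate targets

partition = DecisionTreeRegressor(max_depth=2).fit(X, y_bb)
leaves = partition.apply(X)                 # leaf id per sample
leaf_models = {
    leaf: LinearRegression().fit(X[leaves == leaf], y_bb[leaves == leaf])
    for leaf in np.unique(leaves)
}

def surrogate_predict(X_new):
    ids = partition.apply(X_new)
    return np.array([
        leaf_models[i].predict(x[None, :])[0] for i, x in zip(ids, X_new)
    ])

fidelity = np.corrcoef(surrogate_predict(X), y_bb)[0, 1]  # surrogate fidelity
```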
- Sparse Modular Activation for Efficient Sequence Modeling (arXiv, 2023-06-19)
Recent models combining Linear State Space Models with self-attention mechanisms have demonstrated impressive results across a range of sequence modeling tasks.
Current approaches apply attention modules statically and uniformly to all elements in the input sequences, leading to sub-optimal quality-efficiency trade-offs.
We introduce Sparse Modular Activation (SMA), a general mechanism enabling neural networks to sparsely activate sub-modules for sequence elements in a differentiable manner.
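A hedged illustration of the general idea (not the paper's exact mechanism): a per-token gate with a straight-through estimator keeps a hard 0/1 activation decision differentiable.

```python
# Sparse per-token activation of a sub-module via a straight-through gate.
import torch
import torch.nn as nn

class SparseGatedModule(nn.Module):
    def __init__(self, dim, sub_module):
        super().__init__()
        self.gate = nn.Linear(dim, 1)
        self.sub = sub_module

    def forward(self, x):                     # x: (batch, seq, dim)
        p = torch.sigmoid(self.gate(x))       # soft activation probability
        hard = (p > 0.5).float()              # hard 0/1 decision
        g = hard + p - p.detach()             # straight-through estimator
        # Masked residual: sub-module output is kept only where g = 1
        # (a real implementation would skip the computation entirely).
        return x + g * self.sub(x)

layer = SparseGatedModule(16, nn.Linear(16, 16))
out = layer(torch.randn(2, 8, 16))
```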
- Scaling Vision-Language Models with Sparse Mixture of Experts (arXiv, 2023-03-13)
We show that mixture-of-experts (MoE) techniques can achieve state-of-the-art performance on a range of benchmarks over dense models of equivalent computational cost.
Our research offers valuable insights into stabilizing the training of MoE models, understanding the impact of MoE on model interpretability, and balancing the trade-offs between compute and performance when scaling vision-language models.
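The routing idea can be shown with a toy top-1 MoE layer; the paper's vision-language models are vastly larger, and this sketch omits load balancing and capacity constraints.

```python
# Minimal top-1 mixture-of-experts layer: each token routes to one expert.
import torch
import torch.nn as nn

class Top1MoE(nn.Module):
    def __init__(self, dim, n_experts=4):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))

    def forward(self, x):                    # x: (tokens, dim)
        scores = self.router(x).softmax(dim=-1)
        weight, idx = scores.max(dim=-1)     # top-1 expert per token
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = idx == e
            if mask.any():                   # only the routed tokens hit expert e
                out[mask] = weight[mask, None] * expert(x[mask])
        return out

moe = Top1MoE(32)
y = moe(torch.randn(10, 32))
```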
- Batch-Ensemble Stochastic Neural Networks for Out-of-Distribution Detection (arXiv, 2022-06-26)
Out-of-distribution (OOD) detection has recently received much attention from the machine learning community due to its importance in deploying machine learning models in real-world applications.
In this paper we propose an uncertainty quantification approach by modelling the distribution of features.
We incorporate an efficient ensemble mechanism, namely batch-ensemble, to construct the batch-ensemble neural networks (BE-SNNs) and overcome the feature collapse problem.
We show that BE-SNNs yield superior performance on several OOD benchmarks, such as the Two-Moons dataset and the FashionMNIST vs MNIST dataset, among others.
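The batch-ensemble mechanism the BE-SNNs build on (a shared weight matrix plus cheap rank-1 factors per member, following Wen et al.) can be sketched as follows; shapes and initialization are illustrative.

```python
# Batch-ensemble linear layer: ensemble member m computes s_m * (W @ (r_m * x)).
import torch
import torch.nn as nn

class BatchEnsembleLinear(nn.Module):
    def __init__(self, d_in, d_out, n_members=4):
        super().__init__()
        self.shared = nn.Linear(d_in, d_out, bias=False)
        self.r = nn.Parameter(torch.randn(n_members, d_in))   # input scalers
        self.s = nn.Parameter(torch.randn(n_members, d_out))  # output scalers

    def forward(self, x):                            # x: (batch, d_in)
        xr = x[None, :, :] * self.r[:, None, :]      # (members, batch, d_in)
        return self.shared(xr) * self.s[:, None, :]  # (members, batch, d_out)

layer = BatchEnsembleLinear(8, 3)
preds = layer(torch.randn(5, 8))             # (4 members, 5 samples, 3 outputs)
variance = preds.var(dim=0)                  # member disagreement as an OOD signal
```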
- Sparse Flows: Pruning Continuous-depth Models (arXiv, 2021-06-24)
We show that pruning improves generalization for neural ODEs in generative modeling.
We also show that pruning finds minimal and efficient neural ODE representations with up to 98% fewer parameters than the original network, without loss of accuracy.
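As a rough stand-in for the paper's method, here is plain magnitude pruning with PyTorch's built-in utility on a toy vector-field MLP; the paper prunes continuous-depth (neural ODE) models, whose ODE solver this sketch omits.

```python
# L1 magnitude pruning of a toy vector-field network.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

f = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 2))  # toy vector field
for module in f:
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.9)  # drop 90%

pruned = sum((m.weight == 0).sum().item() for m in f if isinstance(m, nn.Linear))
total = sum(m.weight.numel() for m in f if isinstance(m, nn.Linear))
print(f"pruned {pruned}/{total} weights")
```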
- Scaling Hidden Markov Language Models (arXiv, 2020-11-09)
This work revisits the challenge of scaling HMMs to language modeling datasets.
We propose methods for scaling HMMs to massive state spaces while maintaining efficient exact inference, a compact parameterization, and effective regularization.
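The exact inference that such scaling must preserve is the forward algorithm; a log-space NumPy version over toy parameters shows the O(T * S^2) recursion that massive state spaces stress.

```python
# HMM forward algorithm in log space: exact sequence log-likelihood.
import numpy as np
from scipy.special import logsumexp

rng = np.random.default_rng(4)
S, V, T = 8, 20, 15                       # states, vocab size, sequence length
log_pi = np.log(np.full(S, 1.0 / S))                # initial distribution
log_A = np.log(rng.dirichlet(np.ones(S), size=S))   # transitions, rows sum to 1
log_B = np.log(rng.dirichlet(np.ones(V), size=S))   # emissions per state
tokens = rng.integers(0, V, size=T)

alpha = log_pi + log_B[:, tokens[0]]
for t in range(1, T):
    # alpha[j] = log sum_i exp(alpha[i] + log_A[i, j]) + log_B[j, token_t]
    alpha = logsumexp(alpha[:, None] + log_A, axis=0) + log_B[:, tokens[t]]

log_likelihood = logsumexp(alpha)          # exact sequence log-probability
```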