Bayesian Stress Testing of Models in a Classification Hierarchy
- URL: http://arxiv.org/abs/2005.12327v1
- Date: Mon, 25 May 2020 18:22:07 GMT
- Title: Bayesian Stress Testing of Models in a Classification Hierarchy
- Authors: Bashar Awwad Shiekh Hasan and Kate Kelly
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Building a machine learning solution in real-life applications often involves
the decomposition of the problem into multiple models of various complexity.
This has advantages in terms of overall performance, better interpretability of
the outcomes, and easier model maintenance. In this work we propose a Bayesian
framework to model the interaction amongst models in such a hierarchy. We show
that the framework can facilitate stress testing of the overall solution,
giving more confidence in its expected performance prior to active deployment.
Finally, we test the proposed framework on a toy problem and financial fraud
detection dataset to demonstrate how it can be applied for any machine learning
based solution, regardless of the underlying modelling required.
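As a minimal sketch of the idea (not the authors' exact framework), the end-to-end performance of a model hierarchy can be treated as a random quantity: each model's accuracy gets a prior, samples are propagated through the chain, and a "stress test" shifts one model's prior to see how the overall solution degrades before deployment. The two-model chain, the Beta priors, and the independence assumption below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 10_000

# Hypothetical two-model hierarchy: a triage model feeds a fraud classifier.
# Each model's accuracy is uncertain, modelled here with a Beta prior,
# so end-to-end performance is itself a distribution we can interrogate.
acc_triage = rng.beta(90, 10, n_samples)  # prior centred near 0.90
acc_fraud = rng.beta(80, 20, n_samples)   # prior centred near 0.80

# A case is handled correctly only if both models in the chain are correct
# (independence between the two models is assumed for this sketch).
end_to_end = acc_triage * acc_fraud

# Stress test: degrade the upstream model's prior and observe the effect
# on the overall solution's performance distribution.
acc_triage_stressed = rng.beta(60, 40, n_samples)  # prior centred near 0.60
end_to_end_stressed = acc_triage_stressed * acc_fraud

print(f"baseline mean={end_to_end.mean():.3f}, "
      f"5th pct={np.percentile(end_to_end, 5):.3f}")
print(f"stressed mean={end_to_end_stressed.mean():.3f}, "
      f"5th pct={np.percentile(end_to_end_stressed, 5):.3f}")
```

Reporting a lower percentile alongside the mean is the point of the exercise: the tail of the end-to-end distribution, not its average, is what builds (or undermines) confidence prior to active deployment.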
Related papers
- On the KL-Divergence-based Robust Satisficing Model [2.425685918104288]
The robust satisficing framework has attracted increasing attention from academia.
We present analytical interpretations, diverse performance guarantees, efficient and stable numerical methods, convergence analysis, and an extension tailored for hierarchical data structures.
We demonstrate the superior performance of our model compared to state-of-the-art benchmarks.
arXiv Detail & Related papers (2024-08-17T10:05:05Z)
- Two-Stage Surrogate Modeling for Data-Driven Design Optimization with Application to Composite Microstructure Generation [1.912429179274357]
This paper introduces a novel two-stage machine learning-based surrogate modeling framework to address inverse problems in scientific and engineering fields.
In the first stage, a machine learning model termed the "learner" identifies a limited set of candidates within the input design space whose predicted outputs closely align with desired outcomes.
In the second stage, a separate surrogate model, functioning as an "evaluator," is employed to assess the reduced candidate space generated in the first stage.
arXiv Detail & Related papers (2024-01-04T00:25:12Z)
- TSPP: A Unified Benchmarking Tool for Time-series Forecasting [3.5415344166235534]
We propose a unified benchmarking framework that exposes the crucial modelling and machine learning decisions involved in developing time series forecasting models.
This framework fosters seamless integration of models and datasets, aiding both practitioners and researchers in their development efforts.
We benchmark recently proposed models within this framework, demonstrating that carefully implemented deep learning models can, with minimal effort, rival gradient-boosted decision trees.
arXiv Detail & Related papers (2023-12-28T16:23:58Z)
- Leveraging World Model Disentanglement in Value-Based Multi-Agent Reinforcement Learning [18.651307543537655]
We propose a novel model-based multi-agent reinforcement learning approach named Value Decomposition Framework with Disentangled World Model.
We present experimental results in Easy, Hard, and Super-Hard StarCraft II micro-management challenges to demonstrate that our method achieves high sample efficiency and exhibits superior performance in defeating the enemy armies compared to other baselines.
arXiv Detail & Related papers (2023-09-08T22:12:43Z)
- GLUECons: A Generic Benchmark for Learning Under Constraints [102.78051169725455]
In this work, we create a benchmark that is a collection of nine tasks in the domains of natural language processing and computer vision.
We model external knowledge as constraints, specify the sources of the constraints for each task, and implement various models that use these constraints.
arXiv Detail & Related papers (2023-02-16T16:45:36Z)
- Minimal Value-Equivalent Partial Models for Scalable and Robust Planning in Lifelong Reinforcement Learning [56.50123642237106]
Common practice in model-based reinforcement learning is to learn models that model every aspect of the agent's environment.
We argue that such models are not particularly well-suited for performing scalable and robust planning in lifelong reinforcement learning scenarios.
We propose new kinds of models that only model the relevant aspects of the environment, which we call "minimal value-equivalent partial models".
arXiv Detail & Related papers (2023-01-24T16:40:01Z)
- A Novel Plug-and-Play Approach for Adversarially Robust Generalization [26.29269757430314]
We propose a robust framework that employs adversarially robust training to safeguard the machine learning models against perturbed testing data.
We achieve this by incorporating the worst-case additive adversarial error within a fixed budget for each sample during model estimation.
arXiv Detail & Related papers (2022-08-19T17:02:55Z)
- Slimmable Domain Adaptation [112.19652651687402]
We introduce a simple framework, Slimmable Domain Adaptation, to improve cross-domain generalization with a weight-sharing model bank.
Our framework surpasses other competing approaches by a very large margin on multiple benchmarks.
arXiv Detail & Related papers (2022-06-14T06:28:04Z)
- Conditional Generative Modeling via Learning the Latent Space [54.620761775441046]
We propose a novel framework for conditional generation in multimodal spaces.
It uses latent variables to model generalizable learning patterns.
At inference, the latent variables are optimized to find optimal solutions corresponding to multiple output modes.
arXiv Detail & Related papers (2020-10-07T03:11:34Z)
- Control as Hybrid Inference [62.997667081978825]
We present an implementation of control as hybrid inference (CHI) which naturally mediates the balance between iterative and amortised inference.
We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z)
- Learning Diverse Representations for Fast Adaptation to Distribution Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task.
We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
arXiv Detail & Related papers (2020-06-12T12:23:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.