Bayesian Stress Testing of Models in a Classification Hierarchy
- URL: http://arxiv.org/abs/2005.12327v1
- Date: Mon, 25 May 2020 18:22:07 GMT
- Title: Bayesian Stress Testing of Models in a Classification Hierarchy
- Authors: Bashar Awwad Shiekh Hasan and Kate Kelly
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Building a machine learning solution in real-life applications often involves
the decomposition of the problem into multiple models of various complexity.
This has advantages in terms of overall performance, better interpretability of
the outcomes, and easier model maintenance. In this work we propose a Bayesian
framework to model the interaction amongst models in such a hierarchy. We show
that the framework can facilitate stress testing of the overall solution,
giving more confidence in its expected performance prior to active deployment.
Finally, we test the proposed framework on a toy problem and financial fraud
detection dataset to demonstrate how it can be applied for any machine learning
based solution, regardless of the underlying modelling required.
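As a minimal sketch of the idea (not the authors' exact framework), the end-to-end performance of a model hierarchy can be treated as a random quantity: each model's accuracy gets a prior, samples are propagated through the chain, and a "stress test" shifts one model's prior to see how the overall solution degrades before deployment. The two-model chain, the Beta priors, and the independence assumption below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 10_000

# Hypothetical two-model hierarchy: a triage model feeds a fraud classifier.
# Each model's accuracy is uncertain, modelled here with a Beta prior,
# so end-to-end performance is itself a distribution we can interrogate.
acc_triage = rng.beta(90, 10, n_samples)  # prior centred near 0.90
acc_fraud = rng.beta(80, 20, n_samples)   # prior centred near 0.80

# A case is handled correctly only if both models in the chain are correct
# (independence between the two models is assumed for this sketch).
end_to_end = acc_triage * acc_fraud

# Stress test: degrade the upstream model's prior and observe the effect
# on the overall solution's performance distribution.
acc_triage_stressed = rng.beta(60, 40, n_samples)  # prior centred near 0.60
end_to_end_stressed = acc_triage_stressed * acc_fraud

print(f"baseline mean={end_to_end.mean():.3f}, "
      f"5th pct={np.percentile(end_to_end, 5):.3f}")
print(f"stressed mean={end_to_end_stressed.mean():.3f}, "
      f"5th pct={np.percentile(end_to_end_stressed, 5):.3f}")
```

Reporting a lower percentile alongside the mean is the point of the exercise: the tail of the end-to-end distribution, not its average, is what builds (or undermines) confidence prior to active deployment.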
Related papers
- On the KL-Divergence-based Robust Satisficing Model [2.425685918104288]
The robust satisficing framework has attracted increasing attention from academia.
We present analytical interpretations, diverse performance guarantees, efficient and stable numerical methods, convergence analysis, and an extension tailored for hierarchical data structures.
We demonstrate the superior performance of our model compared to state-of-the-art benchmarks.
arXiv Detail & Related papers (2024-08-17T10:05:05Z)
- Two-Stage Surrogate Modeling for Data-Driven Design Optimization with Application to Composite Microstructure Generation [1.912429179274357]
This paper introduces a novel two-stage machine learning-based surrogate modeling framework to address inverse problems in scientific and engineering fields.
In the first stage, a machine learning model termed the "learner" identifies a limited set of candidates within the input design space whose predicted outputs closely align with desired outcomes.
In the second stage, a separate surrogate model, functioning as an "evaluator," is employed to assess the reduced candidate space generated in the first stage.
arXiv Detail & Related papers (2024-01-04T00:25:12Z)
- TSPP: A Unified Benchmarking Tool for Time-series Forecasting [3.5415344166235534]
We propose a unified benchmarking framework that exposes the crucial modelling and machine learning decisions involved in developing time series forecasting models.
This framework fosters seamless integration of models and datasets, aiding both practitioners and researchers in their development efforts.
We benchmark recently proposed models within this framework, demonstrating that carefully implemented deep learning models can, with minimal effort, rival gradient-boosted decision trees.
arXiv Detail & Related papers (2023-12-28T16:23:58Z)
- Leveraging World Model Disentanglement in Value-Based Multi-Agent Reinforcement Learning [18.651307543537655]
We propose a novel model-based multi-agent reinforcement learning approach named Value Decomposition Framework with Disentangled World Model.
We present experimental results in Easy, Hard, and Super-Hard StarCraft II micro-management challenges to demonstrate that our method achieves high sample efficiency and exhibits superior performance in defeating the enemy armies compared to other baselines.
arXiv Detail & Related papers (2023-09-08T22:12:43Z)
- GLUECons: A Generic Benchmark for Learning Under Constraints [102.78051169725455]
In this work, we create a benchmark that is a collection of nine tasks in the domains of natural language processing and computer vision.
We model external knowledge as constraints, specify the sources of the constraints for each task, and implement various models that use these constraints.
arXiv Detail & Related papers (2023-02-16T16:45:36Z)
- Minimal Value-Equivalent Partial Models for Scalable and Robust Planning in Lifelong Reinforcement Learning [56.50123642237106]
Common practice in model-based reinforcement learning is to learn models that model every aspect of the agent's environment.
We argue that such models are not particularly well-suited for performing scalable and robust planning in lifelong reinforcement learning scenarios.
We propose new kinds of models that only model the relevant aspects of the environment, which we call "minimal value-equivalent partial models".
arXiv Detail & Related papers (2023-01-24T16:40:01Z)
- A Novel Plug-and-Play Approach for Adversarially Robust Generalization [26.29269757430314]
We propose a robust framework that employs adversarially robust training to safeguard the machine learning models against perturbed testing data.
We achieve this by incorporating the worst-case additive adversarial error within a fixed budget for each sample during model estimation.
arXiv Detail & Related papers (2022-08-19T17:02:55Z)
- Slimmable Domain Adaptation [112.19652651687402]
We introduce a simple framework, Slimmable Domain Adaptation, to improve cross-domain generalization with a weight-sharing model bank.
Our framework surpasses other competing approaches by a very large margin on multiple benchmarks.
arXiv Detail & Related papers (2022-06-14T06:28:04Z)
- Conditional Generative Modeling via Learning the Latent Space [54.620761775441046]
We propose a novel framework for conditional generation in multimodal spaces.
It uses latent variables to model generalizable learning patterns.
At inference, the latent variables are optimized to find optimal solutions corresponding to multiple output modes.
arXiv Detail & Related papers (2020-10-07T03:11:34Z)
- Control as Hybrid Inference [62.997667081978825]
We present an implementation of control as hybrid inference (CHI) which naturally mediates the balance between iterative and amortised inference.
We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z)
- Learning Diverse Representations for Fast Adaptation to Distribution Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task.
We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
arXiv Detail & Related papers (2020-06-12T12:23:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.