Redeeming Data Science by Decision Modelling
- URL: http://arxiv.org/abs/2307.00088v1
- Date: Fri, 30 Jun 2023 19:00:04 GMT
- Title: Redeeming Data Science by Decision Modelling
- Authors: John Mark Agosta and Robert Horton
- Abstract summary: We explain how Decision Modelling combines a conventional machine learning model with an explicit value model.
To give a specific example, we show how this is done by integrating a model's ROC curve with a utility model.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: With the explosion of applications of Data Science, the field has
come loose from its foundations. This article argues for a new program of
applied research in areas familiar to researchers in Bayesian methods in AI
that are needed to ground the practice of Data Science by borrowing from AI
techniques for model formulation that we term "Decision Modelling." This
article briefly reviews the formulation process as building a causal graphical
model, then discusses the process in terms of six principles that comprise
Decision Quality, a framework from the popular business literature. We claim
that any successful applied ML modelling effort must include these six
principles.
We explain how Decision Modelling combines a conventional machine learning
model with an explicit value model. To give a specific example, we show how
this is done by integrating a model's ROC curve with a utility model.
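As an illustration of the integration the abstract describes, here is a minimal sketch (not the authors' code; the utility values, prevalence, and data are placeholder assumptions) that chooses the operating point on an ROC curve by maximizing expected utility rather than using a generic default threshold:

```python
import numpy as np
from sklearn.metrics import roc_curve

def best_operating_point(y_true, y_score, utilities, prevalence):
    """Pick the ROC operating point that maximizes expected utility.

    utilities: utility of each outcome, e.g.
        {"TP": 100.0, "FN": -500.0, "FP": -20.0, "TN": 0.0}
    prevalence: P(positive class) in the deployment population.
    """
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    # Expected utility at each candidate threshold: weight each
    # outcome's utility by its probability of occurring.
    eu = (prevalence * (tpr * utilities["TP"] + (1 - tpr) * utilities["FN"])
          + (1 - prevalence) * (fpr * utilities["FP"] + (1 - fpr) * utilities["TN"]))
    i = int(np.argmax(eu))
    return thresholds[i], eu[i]

# Toy usage with synthetic scores (placeholder data).
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=1000)
scores = y * 0.5 + rng.normal(0, 0.5, size=1000)  # noisy signal
t, eu = best_operating_point(
    y, scores,
    {"TP": 100.0, "FN": -500.0, "FP": -20.0, "TN": 0.0},
    prevalence=0.1)
print(f"threshold={t:.3f}, expected utility per case={eu:.1f}")
```

The point of the exercise is that the classification threshold falls out of the value model, not out of a convention such as 0.5.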
Related papers
- Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities [89.40778301238642]
Model merging is an efficient empowerment technique in the machine learning community.
There is a significant gap in the literature regarding a systematic and thorough review of these techniques.
arXiv Detail & Related papers (2024-08-14T16:58:48Z)
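The survey above spans many merging techniques; as one concrete, minimal instance (our sketch, not necessarily treated in this form by the survey), uniform parameter averaging of same-architecture checkpoints, in the spirit of "model soups":

```python
import torch

def average_state_dicts(state_dicts):
    """Uniformly average parameters of same-architecture checkpoints."""
    avg = {}
    for key in state_dicts[0]:
        # Integer buffers (e.g. BatchNorm counters) may need special care.
        avg[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
    return avg

# Hypothetical usage: merge two fine-tuned copies of one base model.
# `Net` is a placeholder class with the shared architecture.
# merged = Net()
# merged.load_state_dict(average_state_dicts(
#     [model_a.state_dict(), model_b.state_dict()]))
```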
- Learning-based Models for Vulnerability Detection: An Extensive Study [3.1317409221921144]
We extensively and comprehensively investigate two types of state-of-the-art learning-based approaches.
We experimentally demonstrate the superiority of sequence-based models and the limited abilities of both graph-based models.
arXiv Detail & Related papers (2024-08-14T13:01:30Z)
- FIARSE: Model-Heterogeneous Federated Learning via Importance-Aware Submodel Extraction [26.26211464623954]
Federated Importance-Aware Submodel Extraction (FIARSE) is a novel approach that dynamically adjusts submodels based on the importance of model parameters.
Compared to existing works, the proposed method offers a theoretical foundation for submodel extraction.
Extensive experiments are conducted on various datasets to showcase the superior performance of the proposed FIARSE.
arXiv Detail & Related papers (2024-07-28T04:10:11Z)
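FIARSE derives its importance scores from its own objective; purely as an illustration of importance-aware submodel extraction, the sketch below keeps the largest-magnitude fraction of each layer's weights (magnitude as a stand-in importance measure is our assumption, not necessarily FIARSE's criterion):

```python
import torch

def extract_submodel_masks(state_dict, keep_ratio=0.5):
    """Per-tensor binary masks keeping the top `keep_ratio` weights by magnitude."""
    masks = {}
    for name, w in state_dict.items():
        if w.ndim < 2:  # keep biases/scalars fully
            masks[name] = torch.ones_like(w)
            continue
        k = max(1, int(keep_ratio * w.numel()))
        # Threshold = (numel - k + 1)-th smallest magnitude, so k weights survive.
        thresh = w.abs().flatten().kthvalue(w.numel() - k + 1).values
        masks[name] = (w.abs() >= thresh).float()
    return masks
```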
- A novel data generation scheme for surrogate modelling with deep operator networks [0.0]
We propose a novel methodology to alleviate the computational burden associated with training data generation for DeepONets.
Unlike existing literature, the proposed framework for data generation does not use any partial differential equation integration strategy.
The proposed methodology can be extended to other operator learning methods, making the approach widely applicable.
arXiv Detail & Related papers (2024-02-24T14:42:42Z)
- Discovering Interpretable Physical Models using Symbolic Regression and Discrete Exterior Calculus [55.2480439325792]
We propose a framework that combines Symbolic Regression (SR) and Discrete Exterior Calculus (DEC) for the automated discovery of physical models.
DEC provides building blocks for the discrete analogue of field theories, which are beyond the state-of-the-art applications of SR to physical problems.
We prove the effectiveness of our methodology by re-discovering three models of Continuum Physics from synthetic experimental data.
arXiv Detail & Related papers (2023-10-10T13:23:05Z)
- Adapting Large Language Models for Content Moderation: Pitfalls in Data Engineering and Supervised Fine-tuning [79.53130089003986]
Large Language Models (LLMs) have become a feasible solution for handling tasks in various domains.
In this paper, we introduce how to fine-tune an LLM that can be privately deployed for content moderation.
arXiv Detail & Related papers (2023-10-05T09:09:44Z)
- Model Provenance via Model DNA [23.885185988451667]
We introduce a novel concept of Model DNA which represents the unique characteristics of a machine learning model.
We develop an efficient framework for model provenance identification, which enables us to identify whether a source model is a pre-training model of a target model.
arXiv Detail & Related papers (2023-08-04T03:46:41Z)
- Towards Efficient Task-Driven Model Reprogramming with Foundation Models [52.411508216448716]
Vision foundation models exhibit impressive power, benefiting from the extremely large model capacity and broad training data.
However, in practice, downstream scenarios may only support a small model due to the limited computational resources or efficiency considerations.
This brings a critical challenge for the real-world application of foundation models: one has to transfer the knowledge of a foundation model to the downstream task.
arXiv Detail & Related papers (2023-04-05T07:28:33Z)
- Model Reprogramming: Resource-Efficient Cross-Domain Machine Learning [65.268245109828]
In data-rich domains such as vision, language, and speech, deep learning prevails to deliver high-performance task-specific models.
Deep learning in resource-limited domains still faces multiple challenges including (i) limited data, (ii) constrained model development cost, and (iii) lack of adequate pre-trained models for effective finetuning.
Model reprogramming enables resource-efficient cross-domain machine learning by repurposing a well-developed pre-trained model from a source domain to solve tasks in a target domain without model finetuning.
arXiv Detail & Related papers (2022-02-22T02:33:54Z)
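To make the reprogramming idea concrete: a minimal sketch of input reprogramming, where a trainable perturbation is added to target-domain inputs while the source model stays frozen. The label mapping and shapes here are placeholder assumptions, not the paper's exact formulation:

```python
import torch
import torch.nn as nn

class InputReprogram(nn.Module):
    """Wrap a frozen source model with a trainable input perturbation.

    Only `self.delta` is trained; the source model's weights never change.
    """
    def __init__(self, source_model, input_shape, source_to_target):
        super().__init__()
        self.source = source_model
        for p in self.source.parameters():
            p.requires_grad_(False)  # freeze the source model
        self.delta = nn.Parameter(torch.zeros(*input_shape))  # the "program"
        # Fixed many-to-one map from each source label to a target label,
        # e.g. torch.tensor([0, 0, 1, 1, ...]) of length n_source_classes.
        self.register_buffer("label_map", source_to_target)

    def forward(self, x):
        logits = self.source(x + self.delta)  # reprogrammed input
        # Aggregate source-class logits into target classes.
        n_target = int(self.label_map.max()) + 1
        out = logits.new_zeros(logits.size(0), n_target)
        return out.index_add(1, self.label_map, logits)
```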
- Kernel-Based Models for Influence Maximization on Graphs based on Gaussian Process Variance Minimization [9.357483974291899]
We introduce and investigate a novel model for influence maximization (IM) on graphs.
Data-driven approaches can be applied to determine proper kernels for this IM model.
Compared to models in this field that rely on costly Monte-Carlo simulations, our model allows for a simple and cost-efficient update strategy.
arXiv Detail & Related papers (2021-03-02T08:55:34Z)
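As a rough illustration of the variance-minimization idea (our sketch, not the paper's code): greedily pick the nodes that most reduce total GP posterior variance under a given graph kernel, using rank-one updates instead of Monte-Carlo simulation:

```python
import numpy as np

def greedy_variance_minimization(K, budget, noise=1e-6):
    """Greedily select `budget` nodes minimizing total GP posterior variance.

    K: (n, n) kernel (covariance) matrix over graph nodes.
    """
    cov = K.copy()
    selected = []
    for _ in range(budget):
        # Total variance reduction from conditioning on node j:
        #   sum_i cov[i, j]^2 / (cov[j, j] + noise)
        gains = (cov ** 2).sum(axis=0) / (np.diag(cov) + noise)
        gains[selected] = -np.inf  # do not re-select
        j = int(np.argmax(gains))
        selected.append(j)
        # Rank-one posterior covariance update after observing node j.
        cov = cov - np.outer(cov[:, j], cov[:, j]) / (cov[j, j] + noise)
    return selected
```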
- Model Reuse with Reduced Kernel Mean Embedding Specification [70.044322798187]
We present a two-phase framework for finding helpful models for a current application.
In the upload phase, when a model is uploaded into the pool, we construct a reduced kernel mean embedding (RKME) as a specification for the model.
Then in the deployment phase, the relatedness of the current task and pre-trained models will be measured based on the value of the RKME specification.
arXiv Detail & Related papers (2020-01-20T15:15:07Z)
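The deployment-phase "relatedness" measurement can be pictured as a distance between kernel mean embeddings; the sketch below computes an empirical MMD between the current task's data and a specification sample (unreduced here for simplicity, which is a simplification of RKME):

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """RBF kernel matrix between rows of X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def mmd2(X, Y, gamma=1.0):
    """Squared MMD between empirical kernel mean embeddings of X and Y."""
    return (rbf_kernel(X, X, gamma).mean()
            - 2 * rbf_kernel(X, Y, gamma).mean()
            + rbf_kernel(Y, Y, gamma).mean())

# Deployment-phase picture: rank pre-trained models by how close their
# specification sample lies to the current task's data (placeholder specs).
# best = min(model_specs, key=lambda spec: mmd2(task_X, spec))
```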
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.