Preservation of Feature Stability in Machine Learning Under Data Uncertainty for Decision Support in Critical Domains
- URL: http://arxiv.org/abs/2401.11044v2
- Date: Tue, 6 Aug 2024 11:29:06 GMT
- Title: Preservation of Feature Stability in Machine Learning Under Data Uncertainty for Decision Support in Critical Domains
- Authors: Karol CapaĆa, Paulina Tworek, Jose Sousa,
- Abstract summary: Decision-making in human activities often relies on incomplete data, even in critical domains.
This paper addresses this gap by conducting a set of experiments using traditional machine learning methods.
We found that the ML descriptive approach maintains higher classification accuracy while ensuring the stability of feature selection as data incompleteness increases.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In a world where Machine Learning (ML) is increasingly deployed to support decision-making in critical domains, providing decision-makers with explainable, stable, and relevant inputs becomes fundamental. Understanding how machine learning works under missing data and how this affects feature variability is paramount. This is even more relevant as machine learning approaches focus on standardising decision-making approaches that rely on an idealised set of features. However, decision-making in human activities often relies on incomplete data, even in critical domains. This paper addresses this gap by conducting a set of experiments using traditional machine learning methods that look for optimal decisions in comparison to a recently deployed machine learning method focused on a classification that is more descriptive and mimics human decision making, allowing for the natural integration of explainability. We found that the ML descriptive approach maintains higher classification accuracy while ensuring the stability of feature selection as data incompleteness increases. This suggests that descriptive classification methods can be helpful in uncertain decision-making scenarios.
Related papers
- Decision-Focused Uncertainty Quantification [32.93992587758183]
We develop a framework based on conformal prediction to produce prediction sets that account for a downstream decision loss function.
We present a real-world use case in healthcare diagnosis, where our method effectively incorporates the hierarchical structure of dermatological diseases.
arXiv Detail & Related papers (2024-10-02T17:22:09Z) - Enhancing Feature Selection and Interpretability in AI Regression Tasks Through Feature Attribution [38.53065398127086]
This study investigates the potential of feature attribution methods to filter out uninformative features in input data for regression problems.
We introduce a feature selection pipeline that combines Integrated Gradients with k-means clustering to select an optimal set of variables from the initial data space.
To validate the effectiveness of this approach, we apply it to a real-world industrial problem - blade vibration analysis in the development process of turbo machinery.
arXiv Detail & Related papers (2024-09-25T09:50:51Z) - Explainable Data-Driven Optimization: From Context to Decision and Back
Again [76.84947521482631]
Data-driven optimization uses contextual information and machine learning algorithms to find solutions to decision problems with uncertain parameters.
We introduce a counterfactual explanation methodology tailored to explain solutions to data-driven problems.
We demonstrate our approach by explaining key problems in operations management such as inventory management and routing.
arXiv Detail & Related papers (2023-01-24T15:25:16Z) - RISE: Robust Individualized Decision Learning with Sensitive Variables [1.5293427903448025]
A naive baseline is to ignore sensitive variables in learning decision rules, leading to significant uncertainty and bias.
We propose a decision learning framework to incorporate sensitive variables during offline training but not include them in the input of the learned decision rule during model deployment.
arXiv Detail & Related papers (2022-11-12T04:31:38Z) - Causal Fairness Analysis [68.12191782657437]
We introduce a framework for understanding, modeling, and possibly solving issues of fairness in decision-making settings.
The main insight of our approach will be to link the quantification of the disparities present on the observed data with the underlying, and often unobserved, collection of causal mechanisms.
Our effort culminates in the Fairness Map, which is the first systematic attempt to organize and explain the relationship between different criteria found in the literature.
arXiv Detail & Related papers (2022-07-23T01:06:34Z) - Learning from Heterogeneous Data Based on Social Interactions over
Graphs [58.34060409467834]
This work proposes a decentralized architecture, where individual agents aim at solving a classification problem while observing streaming features of different dimensions.
We show that the.
strategy enables the agents to learn consistently under this highly-heterogeneous setting.
We show that the.
strategy enables the agents to learn consistently under this highly-heterogeneous setting.
arXiv Detail & Related papers (2021-12-17T12:47:18Z) - Targeted Active Learning for Bayesian Decision-Making [15.491942513739676]
We argue that when acquiring samples sequentially, separating learning and decision-making is sub-optimal.
We introduce a novel active learning strategy which takes the down-the-line decision problem into account.
Specifically, we introduce a novel active learning criterion which maximizes the expected information gain on the posterior distribution of the optimal decision.
arXiv Detail & Related papers (2021-06-08T09:05:43Z) - Leveraging Expert Consistency to Improve Algorithmic Decision Support [62.61153549123407]
We explore the use of historical expert decisions as a rich source of information that can be combined with observed outcomes to narrow the construct gap.
We propose an influence function-based methodology to estimate expert consistency indirectly when each case in the data is assessed by a single expert.
Our empirical evaluation, using simulations in a clinical setting and real-world data from the child welfare domain, indicates that the proposed approach successfully narrows the construct gap.
arXiv Detail & Related papers (2021-01-24T05:40:29Z) - Feature Selection Using Reinforcement Learning [0.0]
The space of variables or features that can be used to characterize a particular predictor of interest continues to grow exponentially.
Identifying the most characterizing features that minimizes the variance without jeopardizing the bias of our models is critical to successfully training a machine learning model.
arXiv Detail & Related papers (2021-01-23T09:24:37Z) - Estimating Structural Target Functions using Machine Learning and
Influence Functions [103.47897241856603]
We propose a new framework for statistical machine learning of target functions arising as identifiable functionals from statistical models.
This framework is problem- and model-agnostic and can be used to estimate a broad variety of target parameters of interest in applied statistics.
We put particular focus on so-called coarsening at random/doubly robust problems with partially unobserved information.
arXiv Detail & Related papers (2020-08-14T16:48:29Z) - Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.