Introducing explainable supervised machine learning into interactive
feedback loops for statistical production system
- URL: http://arxiv.org/abs/2202.03212v1
- Date: Mon, 7 Feb 2022 14:17:06 GMT
- Title: Introducing explainable supervised machine learning into interactive
feedback loops for statistical production system
- Authors: Carlos Mougan, George Kanellos, Johannes Micheler, Jose Martinez,
Thomas Gottron
- Abstract summary: We develop an interactive feedback loop between data collected by the European Central Bank and data quality assurance performed by National Central Banks.
The feedback loop is based on a set of rule-based checks for raising exceptions, upon which the user either confirms the data or corrects an actual error.
In this paper we use the information received from this feedback loop to optimize the exceptions presented to the National Central Banks.
- Score: 0.13999481573773068
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Statistical production systems cover multiple steps, from the collection, aggregation, and integration of data to tasks like data quality assurance and dissemination. While data quality assurance is one of the most promising fields for applying machine learning, the lack of curated and labeled training data is often a limiting factor.
The statistical production system for the Centralised Securities Database features an interactive feedback loop between data collected by the European Central Bank and data quality assurance performed by data quality managers at National Central Banks (NCBs). The quality assurance feedback loop is based on a set of rule-based checks that raise exceptions, upon which the user either confirms the data or corrects an actual error.
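As a concrete illustration of such a rule-based check, here is a minimal sketch; all names, fields, and plausibility bounds are hypothetical assumptions, not the CSDB's actual rules. A check flags a suspicious value as an exception, and the data quality manager either confirms the reported data or records a correction.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class QualityException:
    """One exception raised by a rule-based check (hypothetical schema)."""
    record_id: str
    check_name: str
    reported_value: float
    message: str
    confirmed: Optional[bool] = None         # set by the NCB data quality manager
    corrected_value: Optional[float] = None  # set only if the value was wrong

def coupon_rate_check(record: dict) -> Optional[QualityException]:
    """Flag implausible coupon rates for manual review (bounds are illustrative)."""
    rate = record["coupon_rate"]
    if not 0.0 <= rate <= 20.0:
        return QualityException(
            record_id=record["id"],
            check_name="coupon_rate_plausibility",
            reported_value=rate,
            message=f"coupon rate {rate}% outside plausible range [0, 20]",
        )
    return None  # data passes the check; no exception is raised

def resolve(exc: QualityException, confirm: bool,
            correction: Optional[float] = None) -> None:
    """Close the feedback loop: confirm the data or correct an actual error."""
    exc.confirmed = confirm
    exc.corrected_value = None if confirm else correction
```

Each resolved exception yields a labeled example (exception features plus the confirm/correct outcome), which is exactly the kind of training data the next paragraph exploits.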
In this paper we use the information received from this feedback loop to optimize the exceptions presented to the NCBs, thereby improving the quality of the exceptions generated and reducing the time users spend on the system validating those exceptions. For this approach we use explainable supervised machine learning to (a) identify the types of exceptions and (b) prioritize which exceptions are most likely to require an intervention or correction by the NCBs. Furthermore, we provide an explainable AI taxonomy that identifies the different explainability needs that arose during the project.
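The following is a minimal sketch of the prioritization step (b), assuming scikit-learn and synthetic placeholder data; the features, model choice, and explanation method are illustrative assumptions, not the authors' actual pipeline. Resolved exceptions serve as labels (1 = a correction was needed), a supervised classifier scores open exceptions, and permutation importance gives a simple global explanation of what drives the priorities.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder data: one row per historical exception (e.g. size of the
# deviation, instrument type, reporting source), labeled by the NCB outcome.
X = rng.normal(size=(1000, 4))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# (b) Prioritize: rank open exceptions by the predicted probability that
# they require an intervention, so users review likely true errors first.
priority = model.predict_proba(X_test)[:, 1]
review_order = np.argsort(-priority)

# Explainability: a simple global view of which features drive the ranking.
result = permutation_importance(model, X_test, y_test, n_repeats=10,
                                random_state=0)
for i in np.argsort(-result.importances_mean):
    print(f"feature_{i}: importance {result.importances_mean[i]:.3f}")
```

In a production setting the synthetic matrix would be replaced by features describing each exception, with the confirm/correct outcomes from the feedback loop as labels; per-exception explanations (e.g. Shapley values) could complement the global importances.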
Related papers
- Privacy for Free: Leveraging Local Differential Privacy Perturbed Data from Multiple Services [10.822843258077997]
Local Differential Privacy (LDP) has emerged as a widely adopted privacy-preserving technique in modern data analytics.
This paper proposes a framework for collecting and aggregating data based on perturbed information from multiple services.
arXiv Detail & Related papers (2025-03-11T11:10:03Z) - Generalization Error Bounds for Learning under Censored Feedback [15.367801388932145]
Generalization error bounds from learning theory provide statistical guarantees on how well an algorithm will perform on previously unseen data.
We characterize the impacts of data non-IIDness due to censored feedback on such bounds.
We show that existing generalization error bounds fail to correctly capture the model's generalization guarantees.
arXiv Detail & Related papers (2024-04-14T13:17:32Z) - AAA: an Adaptive Mechanism for Locally Differential Private Mean Estimation [42.95927712062214]
Local differential privacy (LDP) is a strong privacy standard that has been adopted by popular software systems.
We propose the advanced adaptive additive (AAA) mechanism, a distribution-aware approach that optimizes average utility.
We provide rigorous privacy proofs, utility analyses, and extensive experiments comparing AAA with state-of-the-art mechanisms.
arXiv Detail & Related papers (2024-04-02T04:22:07Z) - Collect, Measure, Repeat: Reliability Factors for Responsible AI Data
Collection [8.12993269922936]
We argue that data collection for AI should be performed in a responsible manner.
We propose a Responsible AI (RAI) methodology designed to guide the data collection with a set of metrics.
arXiv Detail & Related papers (2023-08-22T18:01:27Z) - PAC-Based Formal Verification for Out-of-Distribution Data Detection [4.406331747636832]
This study places probably approximately correct (PAC) guarantees on out-of-distribution (OOD) detection using the encoding process within variational autoencoders (VAEs). The guarantee bounds the detection error on unfamiliar instances with user-defined confidence.
arXiv Detail & Related papers (2023-04-04T07:33:02Z) - D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling
Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z) - Desiderata for Explainable AI in statistical production systems of the
European Central Bank [0.537133760455631]
We aim to state user-centric desiderata for explainable AI reflecting common explainability needs experienced in statistical production systems of the European Central Bank.
We provide two concrete use cases from the domain of statistical data production in central banks: the detection of outliers in the Centralised Securities Database and the data-driven identification of data quality checks for the Supervisory Banking data system.
arXiv Detail & Related papers (2021-07-18T05:58:11Z) - Privacy Preservation in Federated Learning: An insightful survey from
the GDPR Perspective [10.901568085406753]
This article surveys state-of-the-art privacy techniques that can be employed in Federated Learning (FL). Recent research has demonstrated that keeping data and computation local in FL is not enough to guarantee privacy: the ML model parameters exchanged between parties in an FL system can themselves be exploited in privacy attacks.
arXiv Detail & Related papers (2020-11-10T21:41:25Z) - Trustworthy AI [75.99046162669997]
Brittleness to minor adversarial changes in the input data, the inability to explain decisions, and bias in the training data are some of the most prominent limitations. We propose a tutorial on Trustworthy AI to address six critical issues in enhancing user and public trust in AI systems.
arXiv Detail & Related papers (2020-11-02T20:04:18Z) - Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z) - How Training Data Impacts Performance in Learning-based Control [67.7875109298865]
This paper derives an analytical relationship between the density of the training data and the control performance.
We formulate a quality measure for the data set, which we refer to as the $\rho$-gap.
We show how the $\rho$-gap can be applied to a feedback linearizing control law.
arXiv Detail & Related papers (2020-05-25T12:13:49Z) - Super-App Behavioral Patterns in Credit Risk Models: Financial,
Statistical and Regulatory Implications [110.54266632357673]
We present the impact of alternative data that originates from an app-based marketplace, in contrast to traditional bureau data, upon credit scoring models.
Our results, validated across two countries, show that these new sources of data are particularly useful for predicting financial behavior in low-wealth and young individuals.
arXiv Detail & Related papers (2020-05-09T01:32:03Z) - Leveraging Semi-Supervised Learning for Fairness using Neural Networks [49.604038072384995]
There has been a growing concern about the fairness of decision-making systems based on machine learning.
In this paper, we propose a semi-supervised algorithm using neural networks benefiting from unlabeled data.
The proposed model, called SSFair, exploits the information in the unlabeled data to mitigate the bias in the training data.
arXiv Detail & Related papers (2019-12-31T09:11:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.