Introducing explainable supervised machine learning into interactive
feedback loops for statistical production system
- URL: http://arxiv.org/abs/2202.03212v1
- Date: Mon, 7 Feb 2022 14:17:06 GMT
- Title: Introducing explainable supervised machine learning into interactive
feedback loops for statistical production system
- Authors: Carlos Mougan, George Kanellos, Johannes Micheler, Jose Martinez,
Thomas Gottron
- Abstract summary: We develop an interactive feedback loop between data collected by the European Central Bank and data quality assurance performed by National Central Banks.
The feedback loop is based on a set of rule-based checks for raising exceptions, upon which the user either confirms the data or corrects an actual error.
In this paper we use the information received from this feedback loop to optimize the exceptions presented to the National Central Banks.
- Score: 0.13999481573773068
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Statistical production systems cover multiple steps, from the collection, aggregation, and integration of data to tasks like data quality assurance and dissemination. While data quality assurance is one of the most promising fields for applying machine learning, the lack of curated and labeled training data is often a limiting factor.
The statistical production system for the Centralised Securities Database features an interactive feedback loop between data collected by the European Central Bank and data quality assurance performed by data quality managers at National Central Banks (NCBs). The quality assurance feedback loop is based on a set of rule-based checks that raise exceptions, upon which the user either confirms the data or corrects an actual error.
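As a concrete illustration of such a rule-based check, here is a minimal sketch; all names, fields, and plausibility bounds are hypothetical assumptions, not the CSDB's actual rules. A check flags a suspicious value as an exception, and the data quality manager either confirms the reported data or records a correction.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class QualityException:
    """One exception raised by a rule-based check (hypothetical schema)."""
    record_id: str
    check_name: str
    reported_value: float
    message: str
    confirmed: Optional[bool] = None         # set by the NCB data quality manager
    corrected_value: Optional[float] = None  # set only if the value was wrong

def coupon_rate_check(record: dict) -> Optional[QualityException]:
    """Flag implausible coupon rates for manual review (bounds are illustrative)."""
    rate = record["coupon_rate"]
    if not 0.0 <= rate <= 20.0:
        return QualityException(
            record_id=record["id"],
            check_name="coupon_rate_plausibility",
            reported_value=rate,
            message=f"coupon rate {rate}% outside plausible range [0, 20]",
        )
    return None  # data passes the check; no exception is raised

def resolve(exc: QualityException, confirm: bool,
            correction: Optional[float] = None) -> None:
    """Close the feedback loop: confirm the data or correct an actual error."""
    exc.confirmed = confirm
    exc.corrected_value = None if confirm else correction
```

Each resolved exception yields a labeled example (exception features plus the confirm/correct outcome), which is exactly the kind of training data the next paragraph exploits.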
In this paper we use the information received from this feedback loop to optimize the exceptions presented to the NCBs, thereby improving the quality of the exceptions generated and reducing the time users spend on the system validating those exceptions. For this approach we use explainable supervised machine learning to (a) identify the types of exceptions and (b) prioritize which exceptions are most likely to require an intervention or correction by the NCBs. Furthermore, we provide an explainable AI taxonomy that identifies the different explainability needs that arose during the project.
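The following is a minimal sketch of the prioritization step (b), assuming scikit-learn and synthetic placeholder data; the features, model choice, and explanation method are illustrative assumptions, not the authors' actual pipeline. Resolved exceptions serve as labels (1 = a correction was needed), a supervised classifier scores open exceptions, and permutation importance gives a simple global explanation of what drives the priorities.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder data: one row per historical exception (e.g. size of the
# deviation, instrument type, reporting source), labeled by the NCB outcome.
X = rng.normal(size=(1000, 4))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# (b) Prioritize: rank open exceptions by the predicted probability that
# they require an intervention, so users review likely true errors first.
priority = model.predict_proba(X_test)[:, 1]
review_order = np.argsort(-priority)

# Explainability: a simple global view of which features drive the ranking.
result = permutation_importance(model, X_test, y_test, n_repeats=10,
                                random_state=0)
for i in np.argsort(-result.importances_mean):
    print(f"feature_{i}: importance {result.importances_mean[i]:.3f}")
```

In a production setting the synthetic matrix would be replaced by features describing each exception, with the confirm/correct outcomes from the feedback loop as labels; per-exception explanations (e.g. Shapley values) could complement the global importances.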
Related papers
- Privacy for Free: Leveraging Local Differential Privacy Perturbed Data from Multiple Services [10.822843258077997]
Local Differential Privacy (LDP) has emerged as a widely adopted privacy-preserving technique in modern data analytics.
This paper proposes a framework for collecting and aggregating data based on perturbed information from multiple services.
arXiv Detail & Related papers (2025-03-11T11:10:03Z) - Generalization Error Bounds for Learning under Censored Feedback [15.367801388932145]
Generalization error bounds from learning theory provide statistical guarantees on how well an algorithm will perform on previously unseen data.
We characterize the impacts of data non-IIDness due to censored feedback on such bounds.
We show that existing generalization error bounds fail to correctly capture the model's generalization guarantees.
arXiv Detail & Related papers (2024-04-14T13:17:32Z) - AAA: an Adaptive Mechanism for Locally Differential Private Mean Estimation [42.95927712062214]
Local differential privacy (LDP) is a strong privacy standard that has been adopted by popular software systems.
We propose the advanced adaptive additive (AAA) mechanism, a distribution-aware approach that optimizes average utility.
We provide rigorous privacy proofs, utility analyses, and extensive experiments comparing AAA with state-of-the-art mechanisms.
arXiv Detail & Related papers (2024-04-02T04:22:07Z) - Collect, Measure, Repeat: Reliability Factors for Responsible AI Data
Collection [8.12993269922936]
We argue that data collection for AI should be performed in a responsible manner.
We propose a Responsible AI (RAI) methodology designed to guide the data collection with a set of metrics.
arXiv Detail & Related papers (2023-08-22T18:01:27Z) - PAC-Based Formal Verification for Out-of-Distribution Data Detection [4.406331747636832]
This study places probably approximately correct (PAC) guarantees on out-of-distribution (OOD) detection using the encoding process within variational autoencoders (VAEs). The guarantee bounds the detection error on unfamiliar instances with user-defined confidence.
arXiv Detail & Related papers (2023-04-04T07:33:02Z) - D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling
Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z) - Desiderata for Explainable AI in statistical production systems of the
European Central Bank [0.537133760455631]
We aim to state user-centric desiderata for explainable AI reflecting common explainability needs experienced in statistical production systems of the European Central Bank.
We provide two concrete use cases from the domain of statistical data production in central banks: the detection of outliers in the Centralised Securities Database and the data-driven identification of data quality checks for the Supervisory Banking data system.
arXiv Detail & Related papers (2021-07-18T05:58:11Z) - Privacy Preservation in Federated Learning: An insightful survey from
the GDPR Perspective [10.901568085406753]
This article surveys state-of-the-art privacy techniques that can be employed in Federated Learning (FL). Recent research has demonstrated that keeping data and computation local in FL is not enough to guarantee privacy: the ML model parameters exchanged between parties in an FL system can themselves be exploited in privacy attacks.
arXiv Detail & Related papers (2020-11-10T21:41:25Z) - Trustworthy AI [75.99046162669997]
Brittleness to minor adversarial changes in the input data, the inability to explain decisions, and bias in the training data are some of the most prominent limitations. We propose a tutorial on Trustworthy AI to address six critical issues in enhancing user and public trust in AI systems.
arXiv Detail & Related papers (2020-11-02T20:04:18Z) - Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z) - How Training Data Impacts Performance in Learning-based Control [67.7875109298865]
This paper derives an analytical relationship between the density of the training data and the control performance.
We formulate a quality measure for the data set, which we refer to as the $\rho$-gap.
We show how the $\rho$-gap can be applied to a feedback linearizing control law.
arXiv Detail & Related papers (2020-05-25T12:13:49Z) - Super-App Behavioral Patterns in Credit Risk Models: Financial,
Statistical and Regulatory Implications [110.54266632357673]
We present the impact of alternative data that originates from an app-based marketplace, in contrast to traditional bureau data, upon credit scoring models.
Our results, validated across two countries, show that these new sources of data are particularly useful for predicting financial behavior in low-wealth and young individuals.
arXiv Detail & Related papers (2020-05-09T01:32:03Z) - Leveraging Semi-Supervised Learning for Fairness using Neural Networks [49.604038072384995]
There has been a growing concern about the fairness of decision-making systems based on machine learning.
In this paper, we propose a semi-supervised algorithm using neural networks benefiting from unlabeled data.
The proposed model, called SSFair, exploits the information in the unlabeled data to mitigate the bias in the training data.
arXiv Detail & Related papers (2019-12-31T09:11:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.