From Tea Leaves to System Maps: Context-awareness in Monitoring Operational Machine Learning Models
- URL: http://arxiv.org/abs/2506.10770v2
- Date: Mon, 30 Jun 2025 07:43:41 GMT
- Title: From Tea Leaves to System Maps: Context-awareness in Monitoring Operational Machine Learning Models
- Authors: Joran Leest, Claudia Raibulet, Patricia Lago, Ilias Gerostathopoulos
- Abstract summary: This paper presents a systematic review to characterize and structure the various types of contextual information in this domain. We introduce the Contextual System--Aspect--Representation (C-SAR) framework, a conceptual model that synthesizes our findings. We also identify 20 recurring and potentially reusable patterns of specific system, aspect, and representation triplets, and map them to the monitoring activities they support.
- Score: 10.17792666432021
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning (ML) models in production do not fail due to statistical anomalies in their input data; they fail due to contextual misalignment -- when their environment deviates from training assumptions, leading to unreliable predictions. Effective ML monitoring requires rich contextual information to move beyond detecting statistical shifts toward meaningful alerts and systematic root-cause analysis. Surprisingly, despite extensive research in ML monitoring and related areas (drift detection, data validation, out-of-distribution detection), there is no shared understanding of how to use contextual information -- a striking gap, given that monitoring fundamentally involves interpreting information in context. In response, this paper presents a systematic review to characterize and structure the various types of contextual information in this domain. Our analysis examines 94 primary studies across data mining, databases, software engineering, and ML. We introduce the Contextual System--Aspect--Representation (C-SAR) framework, a conceptual model that synthesizes our findings. We also identify 20 recurring and potentially reusable patterns of specific system, aspect, and representation triplets, and map them to the monitoring activities they support. This study provides a new perspective on ML monitoring: from interpreting "tea leaves" (i.e., isolated data and performance statistics) to constructing and managing "system maps" (i.e., end-to-end views that connect data, models, and operating context). This way, we aim to enable systematic ML monitoring practices.
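To make the C-SAR idea concrete, the sketch below shows one possible way to encode a System-Aspect-Representation triplet together with the monitoring activities it supports. This is a minimal, hypothetical sketch: the field names and example values are illustrative assumptions made for this page, not definitions taken from the paper.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical sketch of a C-SAR (System--Aspect--Representation) pattern entry.
# Field names and example values are assumptions, not taken from the paper.
@dataclass(frozen=True)
class CSARPattern:
    system: str                   # which part of the ML system the context describes
    aspect: str                   # which property of that system is being monitored
    representation: str           # how the contextual information is represented
    activities: Tuple[str, ...]   # monitoring activities the pattern supports

# One hypothetical entry in a "system map" of reusable monitoring patterns.
pattern = CSARPattern(
    system="upstream feature pipeline",
    aspect="data schema",
    representation="declarative expectations (column types, value ranges)",
    activities=("data validation", "root-cause analysis"),
)

if __name__ == "__main__":
    print(pattern)
```

A registry of such triplets, keyed by the monitoring activity they support, is one way to move from isolated drift statistics toward the end-to-end "system maps" the abstract describes.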
Related papers
- How do Large Language Models Understand Relevance? A Mechanistic Interpretability Perspective [64.00022624183781]
Large language models (LLMs) can assess relevance and support information retrieval (IR) tasks. We investigate how different LLM modules contribute to relevance judgment through the lens of mechanistic interpretability.
arXiv Detail & Related papers (2025-04-10T16:14:55Z)
- Meta-Statistical Learning: Supervised Learning of Statistical Inference [59.463430294611626]
This work demonstrates that the tools and principles driving the success of large language models (LLMs) can be repurposed to tackle distribution-level tasks. We propose meta-statistical learning, a framework inspired by multi-instance learning that reformulates statistical inference tasks as supervised learning problems.
arXiv Detail & Related papers (2025-02-17T18:04:39Z)
- Out-of-Distribution Detection on Graphs: A Survey [58.47395497985277]
Graph out-of-distribution (GOOD) detection focuses on identifying graph data that deviates from the distribution seen during training. We categorize existing methods into four types: enhancement-based, reconstruction-based, information propagation-based, and classification-based approaches. We discuss practical applications and theoretical foundations, highlighting the unique challenges posed by graph data.
arXiv Detail & Related papers (2025-02-12T04:07:12Z)
- Model Monitoring in the Absence of Labeled Data via Feature Attributions Distributions [5.167069404528051]
This thesis explores monitoring of machine learning (ML) models before their predictions impact real-world decisions or users. The thesis is structured around two main themes: (i) AI alignment, measuring whether AI models behave in a manner consistent with human values, and (ii) performance monitoring, measuring whether the models achieve specific accuracy goals.
arXiv Detail & Related papers (2025-01-18T14:07:37Z)
- A Review of Physics-Informed Machine Learning Methods with Applications to Condition Monitoring and Anomaly Detection [1.124958340749622]
PIML is the incorporation of known physical laws and constraints into machine learning algorithms.
This study presents a comprehensive overview of PIML techniques in the context of condition monitoring.
arXiv Detail & Related papers (2024-01-22T11:29:44Z)
- Revisiting Self-supervised Learning of Speech Representation from a Mutual Information Perspective [68.20531518525273]
We take a closer look into existing self-supervised methods of speech from an information-theoretic perspective.
We use linear probes to estimate the mutual information between the target information and learned representations.
We explore the potential of evaluating representations in a self-supervised fashion, where we estimate the mutual information between different parts of the data without using any labels.
arXiv Detail & Related papers (2024-01-16T21:13:22Z)
- Designing monitoring strategies for deployed machine learning algorithms: navigating performativity through a causal lens [6.329470650220206]
The aim of this work is to highlight the relatively under-appreciated complexity of designing a monitoring strategy.
We consider an ML-based risk prediction algorithm for predicting unplanned readmissions.
Results from this case study emphasize the seemingly simple (and obvious) fact that not all monitoring systems are created equal.
arXiv Detail & Related papers (2023-11-20T00:15:16Z)
- Towards Better Modeling with Missing Data: A Contrastive Learning-based Visual Analytics Perspective [7.577040836988683]
Missing data can pose a challenge for machine learning (ML) modeling.
Current approaches are categorized into feature imputation and label prediction.
This study proposes a Contrastive Learning framework to model observed data with missing values.
arXiv Detail & Related papers (2023-09-18T13:16:24Z)
- SoK: Machine Learning for Misinformation Detection [0.8057006406834466]
We examine the disconnect between scholarship and practice in applying machine learning to trust and safety problems. We survey literature on automated detection of misinformation across a corpus of 248 well-cited papers in the field. We conclude that the current state-of-the-art in fully-automated detection has limited efficacy in detecting human-generated misinformation.
arXiv Detail & Related papers (2023-08-23T15:52:20Z)
- A Mechanistic Interpretation of Arithmetic Reasoning in Language Models using Causal Mediation Analysis [128.0532113800092]
We present a mechanistic interpretation of Transformer-based LMs on arithmetic questions.
This provides insights into how information related to arithmetic is processed by LMs.
arXiv Detail & Related papers (2023-05-24T11:43:47Z)
- Lightweight Automated Feature Monitoring for Data Streams [1.4658400971135652]
We propose a flexible system, Feature Monitoring (FM), that detects data drifts in such data streams.
It monitors all features used by the system and provides an interpretable feature ranking whenever an alarm occurs.
This illustrates how FM eliminates the need to add custom signals for detecting specific types of problems, and that monitoring the available feature space is often enough (a generic per-feature drift check in this spirit is sketched after this list).
arXiv Detail & Related papers (2022-07-18T14:38:11Z)
- How Training Data Impacts Performance in Learning-based Control [67.7875109298865]
This paper derives an analytical relationship between the density of the training data and the control performance.
We formulate a quality measure for the data set, which we refer to as $\rho$-gap.
We show how the $\rho$-gap can be applied to a feedback linearizing control law.
arXiv Detail & Related papers (2020-05-25T12:13:49Z)
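As a companion to the Feature Monitoring (FM) entry above, the sketch below shows a generic per-feature drift check with an interpretable feature ranking. It is not the algorithm from that paper: it uses a plain two-sample Kolmogorov-Smirnov test per feature, and the alarm threshold is an arbitrary assumption.

```python
import numpy as np
from scipy.stats import ks_2samp

# Generic per-feature drift check with a feature ranking, in the spirit of the
# Feature Monitoring (FM) entry above. NOT the algorithm from that paper:
# a plain two-sample Kolmogorov-Smirnov test per feature; the alpha threshold
# is an arbitrary assumption.
def rank_drifting_features(reference, current, alpha=0.01):
    """Return (alarm, ranking) where ranking orders features by drift evidence."""
    results = []
    for j in range(reference.shape[1]):
        stat, p_value = ks_2samp(reference[:, j], current[:, j])
        results.append((j, stat, p_value))
    results.sort(key=lambda r: r[2])  # smallest p-value (strongest evidence) first
    alarm = results[0][2] < alpha
    return alarm, results

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ref = rng.normal(size=(1000, 5))   # reference window (e.g., training data)
    cur = rng.normal(size=(500, 5))    # current window from the stream
    cur[:, 2] += 0.8                   # inject a mean shift in feature 2
    alarm, ranking = rank_drifting_features(ref, cur)
    print("alarm:", alarm)
    print("features ranked by drift evidence:", [j for j, _, _ in ranking])
```

Ranking features by the strength of the drift evidence is what makes such an alarm actionable: the top-ranked features point at where root-cause analysis should start.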