An Empirical Analysis of Backward Compatibility in Machine Learning Systems
- URL: http://arxiv.org/abs/2008.04572v1
- Date: Tue, 11 Aug 2020 08:10:58 GMT
- Title: An Empirical Analysis of Backward Compatibility in Machine Learning Systems
- Authors: Megha Srivastava, Besmira Nushi, Ece Kamar, Shital Shah, Eric Horvitz
- Abstract summary: We consider how updates, intended to improve ML models, can introduce new errors that can significantly affect downstream systems and users.
For example, updates in models used in cloud-based classification services, such as image recognition, can cause unexpected erroneous behavior.
- Score: 47.04803977692586
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In many applications of machine learning (ML), updates are performed with the
goal of enhancing model performance. However, current practices for updating
models rely solely on isolated, aggregate performance analyses, overlooking
important dependencies, expectations, and needs in real-world deployments. We
consider how updates, intended to improve ML models, can introduce new errors
that can significantly affect downstream systems and users. For example,
updates in models used in cloud-based classification services, such as image
recognition, can cause unexpected erroneous behavior in systems that make calls
to the services. Prior work has shown the importance of "backward
compatibility" for maintaining human trust. We study challenges with backward
compatibility across different ML architectures and datasets, focusing on
common settings including data shifts with structured noise and ML employed in
inferential pipelines. Our results show that (i) compatibility issues arise
even without data shift due to optimization stochasticity, (ii) training on
large-scale noisy datasets often results in significant decreases in backward
compatibility even when model accuracy increases, and (iii) distributions of
incompatible points align with noise bias, motivating the need for
compatibility-aware de-noising and robustness methods.
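To make the notion of backward compatibility concrete, the sketch below scores an updated classifier against its predecessor in the spirit of the backward trust compatibility (BTC) and backward error compatibility (BEC) metrics from prior work; the function names, exact formulations, and toy data are illustrative assumptions, not the paper's released code.

```python
import numpy as np

def backward_trust_compatibility(y_true, pred_old, pred_new):
    """Fraction of examples the old model got right that the new model
    also gets right (1.0 means the update introduces no new errors there)."""
    old_correct = pred_old == y_true
    if old_correct.sum() == 0:
        return 1.0
    return float((pred_new[old_correct] == y_true[old_correct]).mean())

def backward_error_compatibility(y_true, pred_old, pred_new):
    """Fraction of examples the new model gets wrong that the old model
    also got wrong (1.0 means every new error was already an old error)."""
    new_wrong = pred_new != y_true
    if new_wrong.sum() == 0:
        return 1.0
    return float((pred_old[new_wrong] != y_true[new_wrong]).mean())

# Toy update: accuracy rises from 3/6 to 5/6, yet the new model regresses
# on index 0, which the old model handled correctly.
y   = np.array([0, 1, 1, 0, 1, 0])
old = np.array([0, 1, 0, 1, 0, 0])   # 3/6 correct
new = np.array([1, 1, 1, 0, 1, 0])   # 5/6 correct
print(backward_trust_compatibility(y, old, new))   # ~0.667: one previously correct point regressed
print(backward_error_compatibility(y, old, new))   # 0.0: the new error is a fresh one
```

Tracking scores like these alongside aggregate accuracy is one way downstream systems can detect updates that improve overall performance while still breaking previously reliable behavior.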
Related papers
- Scale-Invariant Learning-to-Rank [0.0]
At Expedia, learning-to-rank models play a key role in sorting and presenting information more relevant to users.
A major challenge in deploying these models is ensuring consistent feature scaling between training and production data.
We introduce a scale-invariant LTR framework which combines a deep and a wide neural network to mathematically guarantee scale-invariance in the model at both training and prediction time.
We evaluate our framework in simulated real-world scenarios with injected feature-scale issues by perturbing the test set at prediction time, and show that even with inconsistent train-test scaling, using the framework achieves better performance than models without it.
arXiv Detail & Related papers (2024-10-02T19:05:12Z) - SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction.
SMILE allows for the upscaling of source models into an MoE model without extra data or further training.
We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z) - CMamba: Channel Correlation Enhanced State Space Models for Multivariate Time Series Forecasting [18.50360049235537]
Mamba, a state space model, has emerged with robust sequence and feature mixing capabilities.
Capturing cross-channel dependencies is critical for enhancing the performance of time series prediction.
We introduce a refined Mamba variant tailored for time series forecasting.
arXiv Detail & Related papers (2024-06-08T01:32:44Z) - Root Causing Prediction Anomalies Using Explainable AI [3.970146574042422]
We present a novel application of explainable AI (XAI) for root-causing performance degradation in machine learning models.
A single feature corruption can cause cascading feature, label and concept drifts.
We have successfully applied this technique to improve the reliability of models used in personalized advertising.
arXiv Detail & Related papers (2024-03-04T19:38:50Z) - Towards Continually Learning Application Performance Models [1.2278517240988065]
Machine learning-based performance models are increasingly being used to make critical job scheduling and application optimization decisions.
Traditionally, these models assume that the data distribution does not change as more samples are collected over time.
We develop continually learning performance models that account for the distribution drift, alleviate catastrophic forgetting, and improve generalizability.
arXiv Detail & Related papers (2023-10-25T20:48:46Z) - Robustness and Generalization Performance of Deep Learning Models on Cyber-Physical Systems: A Comparative Study [71.84852429039881]
The investigation focuses on the models' ability to handle a range of perturbations, such as sensor faults and noise.
We test the generalization and transfer learning capabilities of these models by exposing them to out-of-distribution (OOD) samples.
arXiv Detail & Related papers (2023-06-13T12:43:59Z) - How robust are pre-trained models to distribution shift? [82.08946007821184]
We show how spurious correlations affect the performance of popular self-supervised learning (SSL) and auto-encoder (AE) based models.
We develop a novel evaluation scheme with the linear head trained on out-of-distribution (OOD) data, to isolate the performance of the pre-trained models from a potential bias of the linear head used for evaluation.
arXiv Detail & Related papers (2022-06-17T16:18:28Z) - Switchable Representation Learning Framework with Self-compatibility [50.48336074436792]
We propose a Switchable representation learning Framework with Self-Compatibility (SFSC).
SFSC generates a series of compatible sub-models with different capacities through one training process.
SFSC achieves state-of-the-art performance on the evaluated datasets.
arXiv Detail & Related papers (2022-06-16T16:46:32Z) - AI Total: Analyzing Security ML Models with Imperfect Data in Production [2.629585075202626]
Development of new machine learning models is typically done on manually curated data sets.
We develop a web-based visualization system that allows the users to quickly gather headline performance numbers.
It also enables the users to immediately observe the root cause of an issue when something goes wrong.
arXiv Detail & Related papers (2021-10-13T20:56:05Z) - On Robustness and Transferability of Convolutional Neural Networks [147.71743081671508]
Modern deep convolutional networks (CNNs) are often criticized for not generalizing under distributional shifts.
We study the interplay between out-of-distribution and transfer performance of modern image classification CNNs for the first time.
We find that increasing both the training set and model sizes significantly improves distributional shift robustness.
arXiv Detail & Related papers (2020-07-16T18:39:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.