Interpretability, Then What? Editing Machine Learning Models to Reflect
Human Knowledge and Values
- URL: http://arxiv.org/abs/2206.15465v1
- Date: Thu, 30 Jun 2022 17:57:12 GMT
- Title: Interpretability, Then What? Editing Machine Learning Models to Reflect
Human Knowledge and Values
- Authors: Zijie J. Wang, Alex Kale, Harsha Nori, Peter Stella, Mark E. Nunnally,
Duen Horng Chau, Mihaela Vorvoreanu, Jennifer Wortman Vaughan, Rich Caruana
- Abstract summary: We develop GAM Changer, the first interactive system to help data scientists and domain experts edit Generalized Additive Models (GAMs) and fix problematic patterns.
With novel interaction techniques, our tool puts interpretability into action--empowering users to analyze, validate, and align model behaviors with their knowledge and values.
- Score: 27.333641578187887
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning (ML) interpretability techniques can reveal undesirable
patterns in data that models exploit to make predictions--potentially causing
harms once deployed. However, how to take action to address these patterns is
not always clear. In a collaboration between ML and human-computer interaction
researchers, physicians, and data scientists, we develop GAM Changer, the first
interactive system to help domain experts and data scientists easily and
responsibly edit Generalized Additive Models (GAMs) and fix problematic
patterns. With novel interaction techniques, our tool puts interpretability
into action--empowering users to analyze, validate, and align model behaviors
with their knowledge and values. Physicians have started to use our tool to
investigate and fix pneumonia and sepsis risk prediction models, and an
evaluation with 7 data scientists working in diverse domains highlights that
our tool is easy to use, meets their model editing needs, and fits into their
current workflows. Built with modern web technologies, our tool runs locally in
users' web browsers or computational notebooks, lowering the barrier to use.
GAM Changer is available at the following public demo link:
https://interpret.ml/gam-changer.
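Because a GAM's prediction is an additive sum of independent per-feature shape functions, "editing" the model amounts to rewriting one feature's curve without retraining, which is the operation GAM Changer exposes interactively. The sketch below is a minimal NumPy illustration of that idea, not GAM Changer's actual API: the bin edges, scores, and the hypothetical monotonicity edit are invented for the example.

```python
# Minimal sketch, not GAM Changer's API: a GAM adds up independent per-feature
# shape functions, so one feature's curve can be edited without retraining.
import numpy as np

# Hypothetical shape function for "age" in a risk model, stored as per-bin
# log-odds contributions (values invented for illustration).
age_bin_edges = np.array([0, 30, 50, 65, 80, 120])
age_bin_scores = np.array([0.4, -0.2, 0.1, 0.6, 1.1])
intercept = -2.0

def risk_score(age: float) -> float:
    """Additive prediction: intercept + shape_age(age); other terms omitted."""
    idx = np.clip(np.searchsorted(age_bin_edges, age, side="right") - 1,
                  0, len(age_bin_scores) - 1)
    return float(intercept + age_bin_scores[idx])

print(risk_score(40.0))  # before the edit: uses the learned bin value -0.2

# A hypothetical edit: clinicians decide risk should never decrease with age,
# so the curve is made monotone by carrying forward the running maximum.
age_bin_scores = np.maximum.accumulate(age_bin_scores)

print(risk_score(40.0))  # after the edit: the dip between ages 30 and 50 is gone
```

GAM Changer wraps this kind of term-level edit in interactive visualizations so users can analyze and validate a change before committing to it, which is what the abstract means by putting interpretability into action.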
Related papers
- Modulating Language Model Experiences through Frictions [56.17593192325438]
Over-consumption of language model outputs risks propagating unchecked errors in the short-term and damaging human capabilities for critical thinking in the long-term.
We propose selective frictions for language model experiences, inspired by behavioral science interventions, to dampen misuse.
arXiv Detail & Related papers (2024-06-24T16:31:11Z)
- A Multimodal Automated Interpretability Agent [63.8551718480664]
MAIA is a system that uses neural models to automate neural model understanding tasks.
We first characterize MAIA's ability to describe (neuron-level) features in learned representations of images.
We then show that MAIA can aid in two additional interpretability tasks: reducing sensitivity to spurious features, and automatically identifying inputs likely to be mis-classified.
arXiv Detail & Related papers (2024-04-22T17:55:11Z)
- User Friendly and Adaptable Discriminative AI: Using the Lessons from the Success of LLMs and Image Generation Models [0.6926105253992517]
We develop a new system architecture that enables users to work with discriminative models.
Our approach has implications for improving the trust, user-friendliness, and adaptability of these versatile but traditional prediction models.
arXiv Detail & Related papers (2023-12-11T20:37:58Z)
- Eliciting Model Steering Interactions from Users via Data and Visual Design Probes [8.45602005745865]
Domain experts increasingly use automated data science tools to incorporate machine learning (ML) models in their work but struggle to "codify" these models when they are incorrect.
For these experts, semantic interactions can provide an accessible avenue to guide and refine ML models without having to dive into their technical details.
This study examines how experts with a spectrum of ML expertise use semantic interactions to update a simple classification model.
arXiv Detail & Related papers (2023-10-12T20:34:02Z)
- Self-Destructing Models: Increasing the Costs of Harmful Dual Uses of Foundation Models [103.71308117592963]
We present an algorithm for training self-destructing models leveraging techniques from meta-learning and adversarial learning.
In a small-scale experiment, we show MLAC can largely prevent a BERT-style model from being re-purposed to perform gender identification.
arXiv Detail & Related papers (2022-11-27T21:43:45Z)
- Synthetic Model Combination: An Instance-wise Approach to Unsupervised Ensemble Learning [92.89846887298852]
Consider making a prediction over new test data without any opportunity to learn from a training set of labelled data.
Instead, you are given access to a set of expert models and their predictions, alongside some limited information about the dataset used to train them.
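As a rough illustration of this instance-wise idea (not the paper's Synthetic Model Combination algorithm), the sketch below weights each expert's prediction by how close the test point lies to a per-expert summary of its training data; the softmax-style weighting, the mean-vector summaries, and the constant toy experts are all assumptions made for the example.

```python
# Illustrative sketch only, not the paper's algorithm: combine pre-trained
# expert models per test instance, trusting experts whose training data
# (summarised here by a mean vector) lies closest to that instance.
import numpy as np

def instance_wise_ensemble(x, experts, train_means, temperature=1.0):
    """Weighted combination of expert predictions for a single test point."""
    dists = np.array([np.linalg.norm(x - m) for m in train_means])
    weights = np.exp(-dists / temperature)   # closer training data -> more trust
    weights /= weights.sum()
    preds = np.array([predict(x) for predict in experts])
    return float(weights @ preds)

# Toy usage: two hypothetical experts trained on very different regions.
experts = [lambda x: 0.9, lambda x: 0.2]                 # constant stand-ins
train_means = [np.array([0.0, 0.0]), np.array([5.0, 5.0])]
print(instance_wise_ensemble(np.array([0.5, 0.2]), experts, train_means))
```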
arXiv Detail & Related papers (2022-10-11T10:20:31Z)
- DIME: Fine-grained Interpretations of Multimodal Models via Disentangled Local Explanations [119.1953397679783]
We focus on advancing the state-of-the-art in interpreting multimodal models.
Our proposed approach, DIME, enables accurate and fine-grained analysis of multimodal models.
arXiv Detail & Related papers (2022-03-03T20:52:47Z)
- GAM Changer: Editing Generalized Additive Models with Interactive Visualization [28.77745864749409]
We present our work, GAM Changer, an open-source interactive system to help data scientists easily and responsibly edit their Generalized Additive Models (GAMs).
With novel visualization techniques, our tool puts interpretability into action -- empowering human users to analyze, validate, and align model behaviors with their knowledge and values.
arXiv Detail & Related papers (2021-12-06T18:51:49Z)
- Towards Model-informed Precision Dosing with Expert-in-the-loop Machine Learning [0.0]
We consider an ML framework that may accelerate model learning and improve its interpretability by incorporating human experts into the model learning loop.
We propose a novel human-in-the-loop ML framework aimed at learning problems where the cost of data annotation is high.
With an application to precision dosing, our experimental results show that the approach can learn interpretable rules from data and may potentially lower experts' workload.
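The loop described above can be pictured as an uncertainty-driven query cycle. The sketch below is a generic stand-in under assumed choices (a scikit-learn decision tree as the interpretable rule learner, synthetic data, and a placeholder function playing the expert), not the framework proposed in the paper.

```python
# Generic expert-in-the-loop sketch under assumed choices; not the paper's
# framework. The model is refit on a small labelled set and the "expert" is
# asked to annotate only the case it is least certain about in each round.
import numpy as np
from sklearn.tree import DecisionTreeClassifier  # stand-in interpretable rule learner

rng = np.random.default_rng(0)
X_pool = rng.normal(size=(200, 3))                    # unlabelled candidate cases
expert_label = lambda x: int(x[0] + 0.5 * x[1] > 0)   # placeholder for the human expert

# Small seed set annotated up front (chosen so both classes are present).
labelled_X = [np.array([1.0, 1.0, 0.0]), np.array([-1.0, -1.0, 0.0])]
labelled_y = [expert_label(x) for x in labelled_X]
queried = set()

model = DecisionTreeClassifier(max_depth=3, random_state=0)
for _ in range(10):                                   # each round costs one annotation
    model.fit(np.array(labelled_X), np.array(labelled_y))
    proba = model.predict_proba(X_pool)[:, 1]
    uncertainty = -np.abs(proba - 0.5)                # closest to 0.5 = least certain
    if queried:
        uncertainty[list(queried)] = -np.inf          # never ask about the same case twice
    idx = int(np.argmax(uncertainty))
    queried.add(idx)
    labelled_X.append(X_pool[idx])
    labelled_y.append(expert_label(X_pool[idx]))      # the costly expert query

print(f"Expert annotations used: {len(labelled_y)}")
```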
arXiv Detail & Related papers (2021-06-28T03:45:09Z)
- Intuitively Assessing ML Model Reliability through Example-Based Explanations and Editing Model Inputs [19.09848738521126]
Interpretability methods aim to help users build trust in and understand the capabilities of machine learning models.
We present two interface modules to facilitate a more intuitive assessment of model reliability.
arXiv Detail & Related papers (2021-02-17T02:41:32Z)
- Learning Predictive Models From Observation and Interaction [137.77887825854768]
Learning predictive models from interaction with the world allows an agent, such as a robot, to learn about how the world works.
However, learning a model that captures the dynamics of complex skills represents a major challenge.
We propose a method to augment the training set with observational data of other agents, such as humans.
arXiv Detail & Related papers (2019-12-30T01:10:41Z)