Variational Information Pursuit for Interpretable Predictions
- URL: http://arxiv.org/abs/2302.02876v1
- Date: Mon, 6 Feb 2023 15:43:48 GMT
- Title: Variational Information Pursuit for Interpretable Predictions
- Authors: Aditya Chattopadhyay, Kwan Ho Ryan Chan, Benjamin D. Haeffele, Donald Geman, René Vidal
- Abstract summary: Variational Information Pursuit (V-IP) is a variational characterization of IP which bypasses the need for learning generative models.
V-IP finds much shorter query chains than reinforcement learning, which is typically used in sequential decision-making problems.
We demonstrate the utility of V-IP on challenging tasks like medical diagnosis where the performance is far superior to the generative modelling approach.
- Score: 8.894670614193677
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There is a growing interest in the machine learning community in developing
predictive algorithms that are "interpretable by design". Towards this end,
recent work proposes to make interpretable decisions by sequentially asking
interpretable queries about data until a prediction can be made with high
confidence based on the answers obtained (the history). To promote short
query-answer chains, a greedy procedure called Information Pursuit (IP) is
used, which adaptively chooses queries in order of information gain. Generative
models are employed to learn the distribution of query-answers and labels,
which is in turn used to estimate the most informative query. However, learning
and inference with a full generative model of the data is often intractable for
complex tasks. In this work, we propose Variational Information Pursuit (V-IP),
a variational characterization of IP which bypasses the need for learning
generative models. V-IP is based on finding a query selection strategy and a
classifier that minimizes the expected cross-entropy between true and predicted
labels. We then demonstrate that the IP strategy is the optimal solution to
this problem. Therefore, instead of learning generative models, we can use our
optimal strategy to directly pick the most informative query given any history.
We then develop a practical algorithm by defining a finite-dimensional
parameterization of our strategy and classifier using deep networks and train
them end-to-end using our objective. Empirically, V-IP is 10-100x faster than
IP on different Vision and NLP tasks with competitive performance. Moreover,
V-IP finds much shorter query chains than reinforcement learning, which is
typically used in sequential decision-making problems. Finally, we
demonstrate the utility of V-IP on challenging tasks like medical diagnosis
where the performance is far superior to the generative modelling approach.
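To make the greedy selection rule underlying IP concrete (the rule whose optimal solution V-IP learns to approximate without a generative model), here is a minimal sketch on a toy discrete task with deterministic binary query answers. The query-answer table, uniform class prior, and function names are illustrative assumptions for this sketch, not the paper's implementation, which parameterizes the strategy and classifier with deep networks.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (bits) of a discrete distribution."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Toy task: 4 classes, 3 binary queries with known answer tables.
# answers[q][y] is the deterministic answer to query q for class y.
answers = np.array([
    [0, 0, 1, 1],   # query 0 splits {0,1} vs {2,3}
    [0, 1, 0, 1],   # query 1 splits classes by parity
    [0, 0, 0, 1],   # query 2 isolates class 3
])

def information_gain(posterior, q):
    """Expected reduction in label entropy from asking query q."""
    gain = entropy(posterior)
    for a in (0, 1):
        mask = answers[q] == a
        p_a = posterior[mask].sum()
        if p_a > 0:
            cond = np.where(mask, posterior, 0.0) / p_a
            gain -= p_a * entropy(cond)
    return gain

def information_pursuit(prior, true_class):
    """Greedily ask the most informative query until the label is certain."""
    history, asked = [], set()
    post = prior.copy()
    while entropy(post) > 0:
        # Pick the unasked query with the highest information gain.
        q = max((q for q in range(len(answers)) if q not in asked),
                key=lambda q: information_gain(post, q))
        asked.add(q)
        a = answers[q][true_class]          # observe the answer
        post = np.where(answers[q] == a, post, 0.0)
        post /= post.sum()                  # Bayesian update of the posterior
        history.append((q, a))
    return history, post

prior = np.full(4, 0.25)
chain, post = information_pursuit(prior, true_class=2)
print(chain, post)   # two queries suffice to isolate class 2
```

On this toy task the chain terminates after two queries, mirroring the paper's goal of short query-answer chains; V-IP replaces the explicit posterior computation above (which requires a generative model of query answers) with learned networks trained on the cross-entropy objective.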
Related papers
- Switchable Decision: Dynamic Neural Generation Networks [98.61113699324429]
We propose a switchable decision to accelerate inference by dynamically assigning resources for each data instance.
Our method benefits from less cost during inference while keeping the same accuracy.
arXiv Detail & Related papers (2024-05-07T17:44:54Z)
- IGANN Sparse: Bridging Sparsity and Interpretability with Non-linear Insight [4.010646933005848]
IGANN Sparse is a novel machine learning model from the family of generalized additive models.
It promotes sparsity through a non-linear feature selection process during training.
This ensures interpretability through improved model sparsity without sacrificing predictive performance.
arXiv Detail & Related papers (2024-03-17T22:44:36Z)
- Type-based Neural Link Prediction Adapter for Complex Query Answering [2.1098688291287475]
We propose TypE-based Neural Link Prediction Adapter (TENLPA), a novel model that constructs type-based entity-relation graphs.
In order to effectively combine type information with complex logical queries, an adaptive learning mechanism is introduced.
Experiments on 3 standard datasets show that TENLPA model achieves state-of-the-art performance on complex query answering.
arXiv Detail & Related papers (2024-01-29T10:54:28Z)
- Online Network Source Optimization with Graph-Kernel MAB [62.6067511147939]
We propose Grab-UCB, a graph-kernel multi-armed bandit algorithm to learn online the optimal source placement in large-scale networks.
We describe the network processes with an adaptive graph dictionary model, which typically leads to sparse spectral representations.
We derive the performance guarantees that depend on network parameters, which further influence the learning curve of the sequential decision strategy.
arXiv Detail & Related papers (2023-07-07T15:03:42Z)
- NEVIS'22: A Stream of 100 Tasks Sampled from 30 Years of Computer Vision Research [96.53307645791179]
We introduce the Never-Ending VIsual-classification Stream (NEVIS'22), a benchmark consisting of a stream of over 100 visual classification tasks.
Despite being limited to classification, the resulting stream has a rich diversity of tasks from OCR, to texture analysis, scene recognition, and so forth.
Overall, NEVIS'22 poses an unprecedented challenge for current sequential learning approaches due to the scale and diversity of tasks.
arXiv Detail & Related papers (2022-11-15T18:57:46Z)
- A Survey of Learning on Small Data: Generalization, Optimization, and Challenge [101.27154181792567]
Learning on small data that approximates the generalization ability of big data is one of the ultimate purposes of AI.
This survey follows the active sampling theory under a PAC framework to analyze the generalization error and label complexity of learning on small data.
Multiple data applications that may benefit from efficient small data representation are surveyed.
arXiv Detail & Related papers (2022-07-29T02:34:19Z)
- Interpretable by Design: Learning Predictors by Composing Interpretable Queries [8.054701719767293]
We argue that machine learning algorithms should be interpretable by design.
We minimize the expected number of queries needed for accurate prediction.
Experiments on vision and NLP tasks demonstrate the efficacy of our approach.
arXiv Detail & Related papers (2022-07-03T02:40:34Z)
- Near-optimal Offline Reinforcement Learning with Linear Representation: Leveraging Variance Information with Pessimism [65.46524775457928]
Offline reinforcement learning seeks to utilize offline/historical data to optimize sequential decision-making strategies.
We study the statistical limits of offline reinforcement learning with linear model representations.
arXiv Detail & Related papers (2022-03-11T09:00:12Z)
- Deep invariant networks with differentiable augmentation layers [87.22033101185201]
Methods for learning data augmentation policies require held-out data and are based on bilevel optimization problems.
We show that our approach is easier and faster to train than modern automatic data augmentation techniques.
arXiv Detail & Related papers (2022-02-04T14:12:31Z)
- Network Support for High-performance Distributed Machine Learning [17.919773898228716]
We propose a system model that captures both learning nodes (that perform computations) and information nodes (that provide data).
We then formulate the problem of selecting (i) which learning and information nodes should cooperate to complete the learning task, and (ii) the number of iterations to perform.
We devise an algorithm, named DoubleClimb, that can find a 1+1/|I|-competitive solution with cubic worst-case complexity.
arXiv Detail & Related papers (2021-02-05T19:38:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.