Reproducibility in Multiple Instance Learning: A Case For Algorithmic
Unit Tests
- URL: http://arxiv.org/abs/2310.17867v1
- Date: Fri, 27 Oct 2023 03:05:11 GMT
- Title: Reproducibility in Multiple Instance Learning: A Case For Algorithmic
Unit Tests
- Authors: Edward Raff, James Holt
- Abstract summary: Multiple Instance Learning (MIL) is a sub-domain of classification problems with positive and negative labels and a "bag" of inputs.
In this work, we examine five of the most prominent deep-MIL models and find that none of them respects the standard MIL assumption.
We identify and demonstrate this problem via a proposed "algorithmic unit test", where we create synthetic datasets that can be solved by a MIL-respecting model.
- Score: 59.623267208433255
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multiple Instance Learning (MIL) is a sub-domain of classification problems
with positive and negative labels and a "bag" of inputs, where the label is
positive if and only if a positive element is contained within the bag, and
otherwise is negative. Training in this context requires associating the
bag-wide label to instance-level information, and implicitly contains a causal
assumption and asymmetry to the task (i.e., you can't swap the labels without
changing the semantics). MIL problems occur in healthcare (one malignant cell
indicates cancer), cyber security (one malicious executable makes an infected
computer), and many other tasks. In this work, we examine five of the most
prominent deep-MIL models and find that none of them respects the standard MIL
assumption. They are able to learn anti-correlated instances, i.e., defaulting
to "positive" labels until seeing a negative counter-example, which should not
be possible for a correct MIL model. We suspect that enhancements and other
works derived from these models will share the same issue. In any context in
which these models are being used, this creates the potential for learning
incorrect models, which creates risk of operational failure. We identify and
demonstrate this problem via a proposed "algorithmic unit test", where we
create synthetic datasets that can be solved by a MIL-respecting model, and
which clearly reveal learning that violates MIL assumptions. The five evaluated
methods each fail one or more of these tests. This provides a model-agnostic
way to identify violations of modeling assumptions, which we hope will be
useful for future development and evaluation of MIL models.
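The paper's "algorithmic unit test" idea can be illustrated with a toy sketch (not the authors' code; function names, the scalar-instance representation, and the threshold are illustrative assumptions): construct synthetic bags whose ground-truth labels follow the standard MIL rule (a bag is positive iff it contains at least one positive instance), and check whether a candidate bag-scoring model reproduces them. A max-pooling model respects the MIL assumption; a mean-pooling model, which lets negative instances outvote a single positive one, fails the test.

```python
def bag_label(bag, threshold=0.5):
    """Ground-truth standard MIL label: positive iff any instance is positive."""
    return int(any(x > threshold for x in bag))

def max_pool_model(bag, threshold=0.5):
    """A MIL-respecting model: the bag score is driven by its strongest instance."""
    return int(max(bag) > threshold)

def mean_pool_model(bag, threshold=0.5):
    """A MIL-violating model: averaging lets many weak negatives
    drown out a single strong positive instance."""
    return int(sum(bag) / len(bag) > threshold)

def algorithmic_unit_test(model):
    """Synthetic bags that any MIL-respecting model can label correctly."""
    test_bags = [
        [0.9, 0.1, 0.1],       # positive: one strong instance among noise
        [0.1, 0.1, 0.1, 0.1],  # negative: all instances weak
        [0.6, 0.6, 0.6],       # positive: every instance is positive
        [0.4, 0.4, 0.4, 0.4],  # negative: many near-threshold instances
    ]
    return all(model(bag) == bag_label(bag) for bag in test_bags)

print(algorithmic_unit_test(max_pool_model))   # True
print(algorithmic_unit_test(mean_pool_model))  # False: misses [0.9, 0.1, 0.1]
```

The test is model-agnostic in the same spirit as the paper: it only queries the model's bag-level predictions on constructed data, without inspecting the model's internals.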
Related papers
- Mitigating Reversal Curse in Large Language Models via Semantic-aware Permutation Training [57.771940716189114]
We show that large language models (LLMs) suffer from the "reversal curse"
The root cause of the reversal curse lies in the different word order between the training and inference stage.
We propose Semantic-aware Permutation Training (SPT) to address this issue.
arXiv Detail & Related papers (2024-03-01T18:55:20Z)
- Small Effect Sizes in Malware Detection? Make Harder Train/Test Splits! [51.668411293817464]
Industry practitioners care about small improvements in malware detection accuracy because their models are deployed to hundreds of millions of machines.
Academic research is often restricted to public datasets on the order of ten thousand samples.
We devise an approach to generate a benchmark of difficulty from a pool of available samples.
arXiv Detail & Related papers (2023-12-25T21:25:55Z)
- Multiple Instance Learning Framework with Masked Hard Instance Mining for Whole Slide Image Classification [11.996318969699296]
Masked hard instance mining (MHIM-MIL) is presented.
MHIM-MIL uses a Siamese structure (Teacher-Student) with a consistency constraint to explore potential hard instances.
Experimental results on the CAMELYON-16 and TCGA Lung Cancer datasets demonstrate that MHIM-MIL outperforms other latest methods in terms of performance and training cost.
arXiv Detail & Related papers (2023-07-28T01:40:04Z)
- ProMIL: Probabilistic Multiple Instance Learning for Medical Imaging [13.355864185650745]
Multiple Instance Learning (MIL) is a weakly-supervised problem in which one label is assigned to the whole bag of instances.
We introduce a dedicated instance-based method called ProMIL, based on deep neural networks and Bernstein estimation.
We show that ProMIL outperforms standard instance-based MIL in real-world medical applications.
arXiv Detail & Related papers (2023-06-18T11:56:52Z)
- Unbiased Multiple Instance Learning for Weakly Supervised Video Anomaly Detection [74.80595632328094]
Multiple Instance Learning (MIL) is prevalent in Weakly Supervised Video Anomaly Detection (WSVAD).
We propose a new MIL framework: Unbiased MIL (UMIL), to learn unbiased anomaly features that improve WSVAD.
arXiv Detail & Related papers (2023-03-22T08:11:22Z)
- Feature Re-calibration based MIL for Whole Slide Image Classification [7.92885032436243]
Whole slide image (WSI) classification is a fundamental task for the diagnosis and treatment of diseases.
We propose to re-calibrate the distribution of a WSI bag (instances) by using the statistics of the max-instance (critical) feature.
We employ a position encoding module (PEM) to model spatial/morphological information, and perform pooling by multi-head self-attention (PSMA) with a Transformer encoder.
arXiv Detail & Related papers (2022-06-22T07:00:39Z)
- Model Agnostic Interpretability for Multiple Instance Learning [7.412445894287708]
In Multiple Instance Learning (MIL), models are trained using bags of instances, where only a single label is provided for each bag.
In this work, we establish the key requirements for interpreting MIL models.
We then go on to develop several model-agnostic approaches that meet these requirements.
arXiv Detail & Related papers (2022-01-27T17:55:32Z)
- CIL: Contrastive Instance Learning Framework for Distantly Supervised Relation Extraction [52.94486705393062]
We go beyond typical multi-instance learning (MIL) framework and propose a novel contrastive instance learning (CIL) framework.
Specifically, we regard the initial MIL as the relational triple encoder and constrain positive pairs against negative pairs for each instance.
Experiments demonstrate the effectiveness of our proposed framework, with significant improvements over the previous methods on NYT10, GDS and KBP.
arXiv Detail & Related papers (2021-06-21T04:51:59Z)
- Dual-stream Multiple Instance Learning Network for Whole Slide Image Classification with Self-supervised Contrastive Learning [16.84711797934138]
We address the challenging problem of whole slide image (WSI) classification.
WSI classification can be cast as a multiple instance learning (MIL) problem when only slide-level labels are available.
We propose a MIL-based method for WSI classification and tumor detection that does not require localized annotations.
arXiv Detail & Related papers (2020-11-17T20:51:15Z)
- Weakly-Supervised Action Localization with Expectation-Maximization Multi-Instance Learning [82.41415008107502]
Weakly-supervised action localization requires training a model to localize the action segments in the video given only video level action label.
It can be solved under the Multiple Instance Learning (MIL) framework, where a bag (video) contains multiple instances (action segments).
We show that our EM-MIL approach more accurately models both the learning objective and the MIL assumptions.
arXiv Detail & Related papers (2020-03-31T23:36:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.