Multiple Instance Verification
- URL: http://arxiv.org/abs/2407.06544v2
- Date: Wed, 17 Sep 2025 02:08:53 GMT
- Title: Multiple Instance Verification
- Authors: Xin Xu, Eibe Frank, Geoffrey Holmes,
- Abstract summary: We show that naive adaptations of attention-based multiple instance learning methods and standard verification methods are unsuitable for this setting.<n>We introduce a new pooling approach named "cross-attention pooling" (CAP)<n>Under the CAP framework, we propose two novel attention functions to address the challenge of distinguishing between highly similar instances in a target bag.
- Score: 8.726475130582314
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We explore multiple instance verification, a problem setting in which a query instance is verified against a bag of target instances with heterogeneous, unknown relevancy. We show that naive adaptations of attention-based multiple instance learning (MIL) methods and standard verification methods like Siamese neural networks are unsuitable for this setting: directly combining state-of-the-art (SOTA) MIL methods and Siamese networks is shown to be no better, and sometimes significantly worse, than a simple baseline model. Postulating that this may be caused by the failure of the representation of the target bag to incorporate the query instance, we introduce a new pooling approach named "cross-attention pooling" (CAP). Under the CAP framework, we propose two novel attention functions to address the challenge of distinguishing between highly similar instances in a target bag. Through empirical studies on three different verification tasks, we demonstrate that CAP outperforms adaptations of SOTA MIL methods and the baseline by substantial margins, in terms of both classification accuracy and the ability to detect key instances. The superior ability to identify key instances is attributed to the new attention functions by ablation studies. We share our code at https://github.com/xxweka/MIV.
Related papers
- Generalized Semantic Contrastive Learning via Embedding Side Information for Few-Shot Object Detection [52.490375806093745]
The objective of few-shot object detection (FSOD) is to detect novel objects with few training samples.
We introduce the side information to alleviate the negative influences derived from the feature space and sample viewpoints.
Our model outperforms the previous state-of-the-art methods, significantly improving the ability of FSOD in most shots/splits.
arXiv Detail & Related papers (2025-04-09T17:24:05Z) - Attention Is Not What You Need: Revisiting Multi-Instance Learning for Whole Slide Image Classification [51.95824566163554]
We argue that synergizing the standard MIL assumption with variational inference encourages the model to focus on tumour morphology instead of spurious correlations.
Our method also achieves better classification boundaries for identifying hard instances and mitigates the effect of spurious correlations between bags and labels.
arXiv Detail & Related papers (2024-08-18T12:15:22Z) - cDP-MIL: Robust Multiple Instance Learning via Cascaded Dirichlet Process [23.266122629592807]
Multiple instance learning (MIL) has been extensively applied to whole slide histoparametric image (WSI) analysis.
The existing aggregation strategy in MIL, which primarily relies on the first-order distance between instances, fails to accurately approximate the true feature distribution of each instance.
We propose a new Bayesian nonparametric framework for multiple instance learning, which adopts a cascade of Dirichlet processes (cDP) to incorporate the instance-to-bag characteristic of the WSIs.
arXiv Detail & Related papers (2024-07-16T07:28:39Z) - Slot-Mixup with Subsampling: A Simple Regularization for WSI
Classification [13.286360560353936]
Whole slide image (WSI) classification requires repetitive zoom-in and out for pathologists, as only small portions of the slide may be relevant to detecting cancer.
Due to the lack of patch-level labels, multiple instance learning (MIL) is a common practice for training a WSI classifier.
One of the challenges in MIL for WSIs is the weak supervision coming only from the slide-level labels, often resulting in severe overfitting.
Our approach augments the training dataset by sampling a subset of patches in the WSI without significantly altering the underlying semantics of the original slides.
arXiv Detail & Related papers (2023-11-29T09:18:39Z) - Attention-Challenging Multiple Instance Learning for Whole Slide Image Classification [12.424186320807888]
We present Attention-Challenging MIL (ACMIL) to mitigate overfitting.
ACMIL combines two techniques based on separate analyses for attention value concentration.
This paper extensively illustrates ACMIL's effectiveness in suppressing attention value concentration and overcoming the overfitting challenge.
arXiv Detail & Related papers (2023-11-13T07:34:53Z) - Small Object Detection via Coarse-to-fine Proposal Generation and
Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z) - Generalizable Metric Network for Cross-domain Person Re-identification [55.71632958027289]
Cross-domain (i.e., domain generalization) scene presents a challenge in Re-ID tasks.
Most existing methods aim to learn domain-invariant or robust features for all domains.
We propose a Generalizable Metric Network (GMN) to explore sample similarity in the sample-pair space.
arXiv Detail & Related papers (2023-06-21T03:05:25Z) - Dual Adaptive Representation Alignment for Cross-domain Few-shot
Learning [58.837146720228226]
Few-shot learning aims to recognize novel queries with limited support samples by learning from base knowledge.
Recent progress in this setting assumes that the base knowledge and novel query samples are distributed in the same domains.
We propose to address the cross-domain few-shot learning problem where only extremely few samples are available in target domains.
arXiv Detail & Related papers (2023-06-18T09:52:16Z) - Learning Classifiers of Prototypes and Reciprocal Points for Universal
Domain Adaptation [79.62038105814658]
Universal Domain aims to transfer the knowledge between datasets by handling two shifts: domain-shift and categoryshift.
Main challenge is correctly distinguishing the unknown target samples while adapting the distribution of known class knowledge from source to target.
Most existing methods approach this problem by first training the target adapted known and then relying on the single threshold to distinguish unknown target samples.
arXiv Detail & Related papers (2022-12-16T09:01:57Z) - Rethinking Clustering-Based Pseudo-Labeling for Unsupervised
Meta-Learning [146.11600461034746]
Method for unsupervised meta-learning, CACTUs, is a clustering-based approach with pseudo-labeling.
This approach is model-agnostic and can be combined with supervised algorithms to learn from unlabeled data.
We prove that the core reason for this is lack of a clustering-friendly property in the embedding space.
arXiv Detail & Related papers (2022-09-27T19:04:36Z) - Attention Awareness Multiple Instance Neural Network [4.061135251278187]
We propose an attention awareness multiple instance neural network framework.
It consists of an instance-level classifier, a trainable MIL pooling operator based on spatial attention and a bag-level classification layer.
Exhaustive experiments on a series of pattern recognition tasks demonstrate that our framework outperforms many state-of-the-art MIL methods.
arXiv Detail & Related papers (2022-05-27T03:29:17Z) - Few-shot Forgery Detection via Guided Adversarial Interpolation [56.59499187594308]
Existing forgery detection methods suffer from significant performance drops when applied to unseen novel forgery approaches.
We propose Guided Adversarial Interpolation (GAI) to overcome the few-shot forgery detection problem.
Our method is validated to be robust to choices of majority and minority forgery approaches.
arXiv Detail & Related papers (2022-04-12T16:05:10Z) - Attend to Who You Are: Supervising Self-Attention for Keypoint Detection
and Instance-Aware Association [40.78849763751773]
This paper presents a new method to solve keypoint detection and instance association by using Transformer.
We propose a novel approach of supervising self-attention for multi-person keypoint detection and instance association.
arXiv Detail & Related papers (2021-11-25T03:41:41Z) - Learning to Detect Instance-level Salient Objects Using Complementary
Image Labels [55.049347205603304]
We present the first weakly-supervised approach to the salient instance detection problem.
We propose a novel weakly-supervised network with three branches: a Saliency Detection Branch leveraging class consistency information to locate candidate objects; a Boundary Detection Branch exploiting class discrepancy information to delineate object boundaries; and a Centroid Detection Branch using subitizing information to detect salient instance centroids.
arXiv Detail & Related papers (2021-11-19T10:15:22Z) - Sparse Network Inversion for Key Instance Detection in Multiple Instance
Learning [24.66638752977373]
Multiple Instance Learning (MIL) involves predicting a single label for a bag of instances, given positive or negative labels at bag-level.
The attention-based deep MIL model is a recent advance in both bag-level classification and key instance detection.
We present a method to improve the attention-based deep MIL model in the task of KID.
arXiv Detail & Related papers (2020-09-07T07:01:59Z) - Target Consistency for Domain Adaptation: when Robustness meets
Transferability [8.189696720657247]
Learning Invariant Representations has been successfully applied for reconciling a source and a target domain for Unsupervised Domain Adaptation.
We show that the cluster assumption is violated in the target domain despite being maintained in the source domain.
Our new approach results in a significant improvement, on both image classification and segmentation benchmarks.
arXiv Detail & Related papers (2020-06-25T09:13:00Z) - Dual-stream Maximum Self-attention Multi-instance Learning [11.685285490589981]
Multi-instance learning (MIL) is a form of weakly supervised learning where a single class label is assigned to a bag of instances while the instance-level labels are not available.
We propose a dual-stream maximum self-attention MIL model (DSMIL) parameterized by neural networks.
Our method achieves superior performance compared to the best MIL methods and demonstrates state-of-the-art performance on benchmark MIL datasets.
arXiv Detail & Related papers (2020-06-09T22:44:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.