Beyond Linearity: Squeeze-and-Recalibrate Blocks for Few-Shot Whole Slide Image Classification
- URL: http://arxiv.org/abs/2505.15504v1
- Date: Wed, 21 May 2025 13:24:47 GMT
- Title: Beyond Linearity: Squeeze-and-Recalibrate Blocks for Few-Shot Whole Slide Image Classification
- Authors: Conghao Xiong, Zhengrui Guo, Zhe Xu, Yifei Zhang, Raymond Kai-Yu Tong, Si Yong Yeo, Hao Chen, Joseph J. Y. Sung, Irwin King
- Abstract summary: We propose a Squeeze-and-Recalibrate (SR) block, a drop-in replacement for linear layers in deep learning models. We provide theoretical guarantees that the SR block can approximate any linear mapping to arbitrary precision. Our SR-MIL models consistently outperform prior methods while requiring significantly fewer parameters and no architectural changes.
- Score: 35.6247241174615
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning has advanced computational pathology but expert annotations remain scarce. Few-shot learning mitigates annotation burdens yet suffers from overfitting and discriminative feature mischaracterization. In addition, the current few-shot multiple instance learning (MIL) approaches leverage pretrained vision-language models to alleviate these issues, but at the cost of complex preprocessing and high computational cost. We propose a Squeeze-and-Recalibrate (SR) block, a drop-in replacement for linear layers in MIL models to address these challenges. The SR block comprises two core components: a pair of low-rank trainable matrices (squeeze pathway, SP) that reduces parameter count and imposes a bottleneck to prevent spurious feature learning, and a frozen random recalibration matrix that preserves geometric structure, diversifies feature directions, and redefines the optimization objective for the SP. We provide theoretical guarantees that the SR block can approximate any linear mapping to arbitrary precision, thereby ensuring that the performance of a standard MIL model serves as a lower bound for its SR-enhanced counterpart. Extensive experiments demonstrate that our SR-MIL models consistently outperform prior methods while requiring significantly fewer parameters and no architectural changes.
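For concreteness, below is a minimal PyTorch sketch of an SR-style drop-in replacement for `nn.Linear`, following only what the abstract describes: a trainable low-rank squeeze pathway (SP) plus a frozen random recalibration matrix. The class name, rank, scaling, and the additive combination of the two pathways are illustrative assumptions, not the paper's exact construction.

```python
import torch
import torch.nn as nn

class SRBlock(nn.Module):
    """Sketch of a Squeeze-and-Recalibrate block replacing nn.Linear.

    Trainable low-rank squeeze pathway (SP) + frozen random
    recalibration matrix. Rank, scaling, and initialization here are
    illustrative assumptions, not the paper's exact recipe.
    """

    def __init__(self, in_features: int, out_features: int, rank: int = 16):
        super().__init__()
        # Squeeze pathway: a pair of low-rank trainable matrices that
        # bottleneck the mapping and cut the trainable parameter count.
        self.squeeze = nn.Linear(in_features, rank, bias=False)
        self.expand = nn.Linear(rank, out_features, bias=False)
        # Frozen random recalibration matrix: registered as a buffer so
        # it receives no gradient updates and keeps its random directions.
        recalibrate = torch.randn(in_features, out_features) / in_features**0.5
        self.register_buffer("recalibrate", recalibrate)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Trainable low-rank term plus the fixed random mapping; the SP
        # only has to learn a correction on top of the frozen matrix.
        return self.expand(self.squeeze(x)) + x @ self.recalibrate
```

Under this layout, a layer that would train `in_features * out_features` weights instead trains `rank * (in_features + out_features)`, while the frozen random matrix keeps the output full-dimensional; the SP then learns a correction relative to it, consistent with the abstract's claim that the recalibration matrix redefines the SP's optimization objective.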
Related papers
- UC-MOA: Utility-Conditioned Multi-Objective Alignment for Distributional Pareto-Optimality [52.49062565901046]
Reinforcement Learning from Human Feedback (RLHF) has become a cornerstone for aligning large language models with human values. Existing approaches struggle to capture the multi-dimensional, distributional nuances of human preferences. We introduce Utility-Conditioned Multi-Objective Alignment (UC-MOA), a novel framework that overcomes these limitations.
arXiv Detail & Related papers (2025-03-10T09:52:42Z)
- Pixel to Gaussian: Ultra-Fast Continuous Super-Resolution with 2D Gaussian Modeling [50.34513854725803]
Arbitrary-scale super-resolution (ASSR) aims to reconstruct high-resolution (HR) images from low-resolution (LR) inputs with arbitrary upsampling factors. We propose a novel ContinuousSR framework with a Pixel-to-Gaussian paradigm, which explicitly reconstructs 2D continuous HR signals from LR images using Gaussian Splatting.
arXiv Detail & Related papers (2025-03-09T13:43:57Z)
- New Lower Bounds for Stochastic Non-Convex Optimization through Divergence Decomposition [11.530542389959347]
We present fundamental limits of first-order stochastic optimization in a range of non-convex settings, including Quasar-Convexity (QC), Quadratic Growth (QG), and Restricted Secant Inequalities (RSI).
arXiv Detail & Related papers (2025-02-19T19:21:00Z)
- Automatic Debiased Machine Learning for Smooth Functionals of Nonparametric M-Estimands [34.30497962430375]
We propose a unified framework for automatic debiased machine learning (autoDML) to perform inference on smooth functionals of infinite-dimensional M-estimands. We introduce three autoDML estimators based on one-step estimation, targeted minimum loss-based estimation, and the method of sieves. For data-driven model selection, we derive a novel decomposition of model approximation error for smooth functionals of M-estimands.
arXiv Detail & Related papers (2025-01-21T03:50:51Z)
- Zeroth-Order Fine-Tuning of LLMs in Random Subspaces [66.27334633749734]
As language models grow in size, memory demands for backpropagation increase.
Zeroth-order (ZO) optimization methods offer a memory-efficient alternative.
We show that SubZero enhances fine-tuning and achieves faster convergence than standard ZO approaches; a minimal ZO gradient estimator is sketched after this list.
arXiv Detail & Related papers (2024-10-11T17:01:43Z)
- A Universal Class of Sharpness-Aware Minimization Algorithms [57.29207151446387]
We introduce a new class of sharpness measures, leading to new sharpness-aware objective functions.
We prove that these measures are universally expressive, allowing any function of the training loss Hessian matrix to be represented by appropriate hyperparameters; the standard sharpness-aware update that this class generalizes is sketched after this list.
arXiv Detail & Related papers (2024-06-06T01:52:09Z)
- Data-free Weight Compress and Denoise for Large Language Models [96.68582094536032]
We propose a novel approach termed Data-free Joint Rank-k Approximation for compressing the parameter matrices; a minimal truncated-SVD rank-k sketch appears after this list. We achieve a model pruning of 80% parameters while retaining 93.43% of the original performance without any calibration data.
arXiv Detail & Related papers (2024-02-26T05:51:47Z)
- Meta-Learning Adversarial Bandit Algorithms [55.72892209124227]
We study online meta-learning with bandit feedback.
We learn to initialize and tune online mirror descent (OMD) with self-concordant barrier regularizers.
arXiv Detail & Related papers (2023-07-05T13:52:10Z)
- Reduction of the Number of Variables in Parametric Constrained Least-Squares Problems [0.20305676256390928]
This paper proposes techniques for reducing the number of involved optimization variables.
We show the good performance of the proposed techniques in numerical tests and in a linearized MPC problem of a nonlinear benchmark process.
arXiv Detail & Related papers (2020-12-18T18:26:40Z)
- Deep Low-rank plus Sparse Network for Dynamic MR Imaging [18.09395940969876]
We propose a model-based low-rank plus sparse network, dubbed L+S-Net, for dynamic MR reconstruction.
Experiments on retrospective and prospective cardiac cine datasets show that the proposed model outperforms state-of-the-art CS and existing deep learning methods.
arXiv Detail & Related papers (2020-10-26T15:55:24Z)
- Revisiting minimum description length complexity in overparameterized models [38.21167656112762]
We provide an extensive theoretical characterization of MDL-COMP for linear models and kernel methods.
For kernel methods, we show that MDL-COMP informs minimax in-sample error, and can decrease as the dimensionality of the input increases.
We also prove that MDL-COMP bounds the in-sample mean squared error (MSE).
arXiv Detail & Related papers (2020-06-17T22:45:14Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
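As referenced in the SubZero entry above, zeroth-order methods estimate gradients from forward passes alone, avoiding backpropagation memory. Below is a minimal two-point (SPSA-style) estimator of the kind such methods build on; SubZero's random-subspace projections are not reproduced here, and the function names and constants are illustrative.

```python
import torch

def zo_gradient(loss_fn, params: torch.Tensor, mu: float = 1e-3) -> torch.Tensor:
    """Two-point zeroth-order gradient estimate (SPSA-style).

    Uses only forward evaluations of loss_fn, so no backpropagation
    graph (and none of its activation memory) is needed.
    """
    u = torch.randn_like(params)              # random probe direction
    loss_plus = loss_fn(params + mu * u)      # forward pass only
    loss_minus = loss_fn(params - mu * u)     # forward pass only
    return (loss_plus - loss_minus) / (2 * mu) * u

# Illustrative usage: gradient-free steps on a toy quadratic.
target = torch.tensor([1.0, -2.0, 0.5])
theta = torch.zeros(3)
for _ in range(500):
    grad_est = zo_gradient(lambda p: ((p - target) ** 2).sum(), theta)
    theta -= 0.05 * grad_est
```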
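The sharpness-aware minimization entry builds a universal class around updates of the kind sketched below: the standard SAM step (ascend to a nearby worst-case point, take the gradient there, then step the original weights). This is the base algorithm, not the paper's generalized class; the `loss_fn(model, batch)` signature and `rho` value are assumptions, and all parameters are assumed to receive gradients.

```python
import torch

def sam_step(model, loss_fn, batch, optimizer, rho: float = 0.05):
    """One standard SAM update: perturb weights toward the locally
    worst-case direction, then descend using the gradient taken there."""
    # First pass: gradient of the loss at the current weights.
    loss_fn(model, batch).backward()
    grads = [p.grad.detach().clone() for p in model.parameters()]
    grad_norm = torch.sqrt(sum((g ** 2).sum() for g in grads)) + 1e-12

    # Ascent: move to the worst-case point in an L2 ball of radius rho.
    with torch.no_grad():
        for p, g in zip(model.parameters(), grads):
            p.add_(rho * g / grad_norm)

    # Second pass: gradient at the perturbed weights.
    optimizer.zero_grad()
    loss_fn(model, batch).backward()

    # Undo the perturbation, then step with the sharpness-aware gradient.
    with torch.no_grad():
        for p, g in zip(model.parameters(), grads):
            p.sub_(rho * g / grad_norm)
    optimizer.step()
    optimizer.zero_grad()
```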
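Finally, the data-free compression entry rests on rank-k approximation of weight matrices. Below is a minimal truncated-SVD sketch of that core step (the Eckart-Young optimal rank-k factorization); the joint approximation and denoising aspects of the paper's method are not shown.

```python
import torch

def rank_k_approx(weight: torch.Tensor, k: int) -> tuple[torch.Tensor, torch.Tensor]:
    """Best rank-k approximation of a weight matrix via truncated SVD.

    Storing the two factors replaces m*n parameters with k*(m + n).
    """
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    A = U[:, :k] * S[:k]          # (m, k), singular values folded in
    B = Vh[:k, :]                 # (k, n)
    return A, B

# Illustrative check: the factor product approximates the original,
# with relative error set by the discarded singular values.
W = torch.randn(256, 512)
A, B = rank_k_approx(W, k=64)
print(torch.linalg.norm(W - A @ B) / torch.linalg.norm(W))
```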