Scalable Set Encoding with Universal Mini-Batch Consistency and Unbiased
Full Set Gradient Approximation
- URL: http://arxiv.org/abs/2208.12401v5
- Date: Thu, 8 Jun 2023 05:50:50 GMT
- Title: Scalable Set Encoding with Universal Mini-Batch Consistency and Unbiased
Full Set Gradient Approximation
- Authors: Jeffrey Willette, Seanie Lee, Bruno Andreis, Kenji Kawaguchi, Juho
Lee, Sung Ju Hwang
- Abstract summary: We propose a Universally MBC class of set functions which can be used in conjunction with arbitrary non-MBC components.
We also propose an efficient MBC training algorithm which gives an unbiased approximation of the full set gradient.
- Score: 74.43046004300507
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent work on mini-batch consistency (MBC) for set functions has brought
attention to the need for sequentially processing and aggregating chunks of a
partitioned set while guaranteeing the same output for all partitions. However,
existing constraints on MBC architectures lead to models with limited
expressive power. Additionally, prior work has not addressed how to deal with
large sets during training when the full set gradient is required. To address
these issues, we propose a Universally MBC (UMBC) class of set functions which
can be used in conjunction with arbitrary non-MBC components while still
satisfying MBC, enabling a wider range of function classes to be used in MBC
settings. Furthermore, we propose an efficient MBC training algorithm which
gives an unbiased approximation of the full set gradient and has constant
memory overhead for any set size at both training and test time. We conduct
extensive experiments including image completion, text classification,
unsupervised clustering, and cancer detection on high-resolution images to
verify the efficiency and efficacy of our scalable set encoding framework. Our
code is available at github.com/jeffwillette/umbc.
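To make the MBC property and the chunk-sampled gradient idea concrete, here is a minimal PyTorch sketch. It is an illustration only, not the authors' UMBC architecture or exact training algorithm (UMBC uses a learned slot-based aggregator; see the repository above for the real implementation). The encoder, the sampled_chunk_step helper, and all hyperparameters below are hypothetical, and plain sum pooling stands in for the paper's aggregation.

```python
import torch
import torch.nn as nn

class SumPoolSetEncoder(nn.Module):
    """Toy MBC set encoder: per-element MLP, sum pooling, small decoder.

    Sum pooling is MBC because partial sums over any partition of a set
    compose exactly to the full-set sum.
    """

    def __init__(self, dim_in: int, dim: int, dim_out: int):
        super().__init__()
        self.phi = nn.Sequential(
            nn.Linear(dim_in, dim), nn.ReLU(), nn.Linear(dim, dim)
        )
        self.rho = nn.Linear(dim, dim_out)

    def encode_chunk(self, x: torch.Tensor) -> torch.Tensor:
        # x: (chunk_size, dim_in) -> partial pooled state of shape (dim,)
        return self.phi(x).sum(dim=0)

    def forward(self, chunks) -> torch.Tensor:
        pooled = sum(self.encode_chunk(c) for c in chunks)
        return self.rho(pooled)

torch.manual_seed(0)
enc = SumPoolSetEncoder(dim_in=8, dim=32, dim_out=4)
x = torch.randn(100, 8)

# MBC check: every partition of the set yields the same encoding.
assert torch.allclose(enc([x]), enc(torch.split(x, 13)), atol=1e-4)

def sampled_chunk_step(enc, chunks, target, opt):
    """One training step with a chunk-sampled gradient estimate.

    All chunks are pooled without grad, then one random chunk is
    recomputed with grad, so backward stores only that chunk's
    activations (constant memory in the set size). Scaling the sampled
    chunk's contribution by the number of chunks makes the encoder
    gradient an unbiased estimate of the full-set gradient.
    """
    with torch.no_grad():
        total = torch.stack([enc.encode_chunk(c) for c in chunks]).sum(0)
    n = len(chunks)
    i = int(torch.randint(n, (1,)))
    s_i = enc.encode_chunk(chunks[i])  # recomputed with grad enabled
    # Forward value equals the full-set pooled state; gradient flows
    # only through chunk i, scaled by n.
    pooled = total + n * (s_i - s_i.detach())
    loss = ((enc.rho(pooled) - target) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

opt = torch.optim.Adam(enc.parameters(), lr=1e-3)
sampled_chunk_step(enc, list(torch.split(x, 13)), torch.zeros(4), opt)
```

The unbiasedness in this sketch is with respect to the uniform choice of chunk: in expectation over the sampled index, the encoder gradient equals the full-set gradient, while only one chunk's activations are ever held in memory. The paper's actual algorithm and guarantees are in the linked repository.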
Related papers
- Revisiting the Integration of Convolution and Attention for Vision Backbone [59.50256661158862]
Convolutions and multi-head self-attentions (MHSAs) are typically considered alternatives to each other for building vision backbones.
In this work, we propose to use MHSAs and Convs in parallel at different granularity levels instead.
We empirically verify the potential of the proposed integration scheme, named GLMix: by offloading the burden of fine-grained features to lightweight Convs, it is sufficient to use MHSAs in a few semantic slots.
arXiv Detail & Related papers (2024-11-21T18:59:08Z) - SCHEME: Scalable Channel Mixer for Vision Transformers [52.605868919281086]
Vision Transformers have achieved impressive performance in many vision tasks.
Much less research has been devoted to the channel mixer or feature mixing block (FFN or MLP).
We show that the dense connections can be replaced with a diagonal block structure that supports larger expansion ratios.
arXiv Detail & Related papers (2023-12-01T08:22:34Z) - Mini-Batch Optimization of Contrastive Loss [13.730030395850358]
We show that mini-batch optimization is equivalent to full-batch optimization if and only if all $\binom{N}{B}$ mini-batches are selected.
We then demonstrate that utilizing high-loss mini-batches can speed up SGD convergence and propose a spectral clustering-based approach for identifying these high-loss mini-batches (a toy illustration of the mini-batch vs. full-batch gap appears after this list).
arXiv Detail & Related papers (2023-07-12T04:23:26Z) - MA-BBOB: Many-Affine Combinations of BBOB Functions for Evaluating
AutoML Approaches in Noiseless Numerical Black-Box Optimization Contexts [0.8258451067861933]
(MA-)BBOB is built on the publicly available IOHprofiler platform.
It provides access to the interactive IOHanalyzer module for performance analysis and visualization, and enables comparisons with the rich and growing data collection available for the (MA-)BBOB functions.
arXiv Detail & Related papers (2023-06-18T19:32:12Z) - Binarized Spectral Compressive Imaging [59.18636040850608]
Existing deep learning models for hyperspectral image (HSI) reconstruction achieve good performance but require powerful hardware with enormous memory and computational resources.
We propose a novel method, Binarized Spectral-Redistribution Network (BiSRNet).
BiSRNet is derived by using the proposed techniques to binarize the base model.
arXiv Detail & Related papers (2023-05-17T15:36:08Z) - Efficient Image Super-Resolution with Feature Interaction Weighted Hybrid Network [101.53907377000445]
Lightweight image super-resolution aims to reconstruct high-resolution images from low-resolution images using low computational costs.
Existing methods result in the loss of middle-layer features due to activation functions.
We propose a Feature Interaction Weighted Hybrid Network (FIWHN) to minimize the impact of intermediate feature loss on reconstruction quality.
arXiv Detail & Related papers (2022-12-29T05:57:29Z) - BiMLP: Compact Binary Architectures for Vision Multi-Layer Perceptrons [37.28828605119602]
This paper studies the problem of designing compact binary architectures for vision multi-layer perceptrons (MLPs).
We find that previous binarization methods perform poorly due to limited capacity of binary samplings.
We propose to improve the performance of the binary mixing and channel mixing (BiMLP) model by enriching the representation ability of binary FC layers.
arXiv Detail & Related papers (2022-12-29T02:43:41Z) - Mini-Batch Consistent Slot Set Encoder for Scalable Set Encoding [50.61114177411961]
We introduce a new property termed Mini-Batch Consistency that is required for large-scale mini-batch set encoding.
We present a scalable and efficient set encoding mechanism that is amenable to mini-batch processing with respect to set elements and capable of updating set representations as more data arrives (a minimal sketch of this slot-based aggregation appears after this list).
arXiv Detail & Related papers (2021-03-02T10:10:41Z)
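For the Mini-Batch Optimization of Contrastive Loss entry above, here is a toy sketch (our own hypothetical info_nce helper, not the paper's code) of why one fixed partition into mini-batches does not reproduce the full-batch contrastive loss: each example sees fewer negatives, which is why equivalence requires considering all $\binom{N}{B}$ mini-batches.

```python
import torch
import torch.nn.functional as F

def info_nce(z1: torch.Tensor, z2: torch.Tensor, temp: float = 0.1) -> torch.Tensor:
    # Symmetric InfoNCE-style contrastive loss over a batch of paired embeddings:
    # each row of z1 should match the same row of z2 against all other rows.
    logits = z1 @ z2.T / temp
    labels = torch.arange(z1.size(0))
    return (F.cross_entropy(logits, labels) + F.cross_entropy(logits.T, labels)) / 2

torch.manual_seed(0)
z1 = F.normalize(torch.randn(8, 16), dim=1)
z2 = F.normalize(torch.randn(8, 16), dim=1)

full = info_nce(z1, z2)
halves = (info_nce(z1[:4], z2[:4]) + info_nce(z1[4:], z2[4:])) / 2
print(full.item(), halves.item())  # differ: each example sees fewer negatives per mini-batch
```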
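And for the Mini-Batch Consistent Slot Set Encoder entry, a minimal sketch of the slot-based MBC aggregation idea, under our own naming and simplifications (SlotSetPool is hypothetical, not the paper's code): normalizing attention over the slot axis instead of over set elements makes each element's contribution independent of every other element, so partial aggregations over chunks sum exactly to the full-set result.

```python
import torch
import torch.nn as nn

class SlotSetPool(nn.Module):
    """Minimal slot-based MBC aggregator (illustrative simplification).

    Softmax is taken over the slot axis, not over set elements, so each
    element's contribution is independent and the aggregation is a plain
    sum over elements, which is chunk-additive.
    """

    def __init__(self, dim: int, num_slots: int):
        super().__init__()
        self.slots = nn.Parameter(torch.randn(num_slots, dim))
        self.q = nn.Linear(dim, dim, bias=False)
        self.k = nn.Linear(dim, dim, bias=False)
        self.v = nn.Linear(dim, dim, bias=False)
        self.scale = dim ** -0.5

    def forward_chunk(self, x: torch.Tensor) -> torch.Tensor:
        # x: (chunk_size, dim) -> partial slot state: (num_slots, dim)
        logits = self.q(self.slots) @ self.k(x).T * self.scale
        attn = logits.softmax(dim=0)  # normalize over slots, not elements
        return attn @ self.v(x)       # sum over elements: chunk-additive

    def forward(self, chunks) -> torch.Tensor:
        return sum(self.forward_chunk(c) for c in chunks)

torch.manual_seed(0)
pool = SlotSetPool(dim=16, num_slots=4)
x = torch.randn(64, 16)
# MBC holds for any partition of the set.
assert torch.allclose(pool([x]), pool(torch.split(x, 10)), atol=1e-4)
```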
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated content (including all information) and is not responsible for any consequences of its use.