Mini-Batch Consistent Slot Set Encoder for Scalable Set Encoding
- URL: http://arxiv.org/abs/2103.01615v1
- Date: Tue, 2 Mar 2021 10:10:41 GMT
- Authors: Bruno Andreis, Jeffrey Willette, Juho Lee, Sung Ju Hwang
- Abstract summary: We introduce a new property termed Mini-Batch Consistency that is required for large-scale mini-batch set encoding.
We present a scalable and efficient set encoding mechanism that is amenable to mini-batch processing with respect to set elements and capable of updating set representations as more data arrives.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most existing set encoding algorithms operate under the assumption that all
the elements of the set are accessible during training and inference.
Additionally, it is assumed that there are enough computational resources
available for concurrently processing sets of large cardinality. However, both
assumptions fail when the cardinality of the set is prohibitively large such
that we cannot even load the set into memory. In more extreme cases, the set
size could be potentially unlimited, and the elements of the set could be given
in a streaming manner, where the model receives subsets of the full set data at
irregular intervals. To tackle such practical challenges in large-scale set
encoding, we go beyond the usual constraints of invariance and equivariance and
introduce a new property termed Mini-Batch Consistency that is required for
large-scale mini-batch set encoding. We present a scalable and efficient set
encoding mechanism that is amenable to mini-batch processing with respect to
set elements and capable of updating set representations as more data arrives.
The proposed method respects the required symmetries of invariance and
equivariance as well as being Mini-Batch Consistent for random partitions of
the input set. We perform extensive experiments and show that our method is
computationally efficient and results in rich set encoding representations for
set-structured data.
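As an illustration of the Mini-Batch Consistency property described in the abstract, the following is a minimal sketch and not the paper's Slot Set Encoder. It assumes a simple mean-pooling encoder over a hypothetical per-element feature map `embed`; the names `StreamingMeanEncoder`, `encode_full`, and `encode_in_batches` are invented for this example. The point it shows is only that the encoding of a set can be updated chunk by chunk, and that the result matches the full-set encoding for any random partition of the elements.

```python
import math
import random


def embed(x):
    """Hypothetical per-element feature map (stand-in for a learned network)."""
    return [x, x * x, math.tanh(x)]


class StreamingMeanEncoder:
    """Keeps a running sum and count so the set encoding can be updated
    as new mini-batches of elements arrive (assumed design, not the paper's)."""

    def __init__(self, dim=3):
        self.total = [0.0] * dim
        self.count = 0

    def update(self, batch):
        # Fold one mini-batch of set elements into the running state.
        for x in batch:
            for i, v in enumerate(embed(x)):
                self.total[i] += v
            self.count += 1
        return self

    def encode(self):
        # Mean pooling: invariant to element order and to the partition used.
        if self.count == 0:
            raise ValueError("cannot encode an empty set")
        return [t / self.count for t in self.total]


def encode_full(xs):
    """Encode the whole set in a single pass."""
    return StreamingMeanEncoder().update(xs).encode()


def encode_in_batches(xs, batch_size, seed=0):
    """Encode the set from a random partition into mini-batches."""
    rng = random.Random(seed)
    shuffled = list(xs)
    rng.shuffle(shuffled)
    enc = StreamingMeanEncoder()
    for start in range(0, len(shuffled), batch_size):
        enc.update(shuffled[start:start + batch_size])
    return enc.encode()


if __name__ == "__main__":
    data = [0.5 * i - 3.0 for i in range(25)]
    print(encode_full(data))
    print(encode_in_batches(data, batch_size=4))
```

The two printed encodings should agree up to floating-point rounding. Plain sum or mean pooling satisfies the property trivially; the paper's contribution is a richer encoder that keeps the same guarantee, which this sketch does not attempt to reproduce.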
Related papers
- Tokenize Image as a Set [17.142970970610616]
We introduce an unordered token set representation to dynamically allocate coding capacity based on regional semantic complexity.
To address the challenge of modeling discrete sets, we devise a dual transformation mechanism that transforms sets into fixed-length integer sequences.
Experiments demonstrate our method's superiority in semantic-aware representation and generation quality.
(2025-03-20T17:59:51Z)
- Outfit Completion via Conditional Set Transformation [10.075094678260625]
We formulate the outfit completion problem as a set retrieval task and propose a novel framework for solving this problem.
The proposal includes a conditional set transformation architecture with deep neural networks and a compatibility-based regularization method.
Experimental results on real data reveal that the proposed method outperforms existing approaches in terms of accuracy of the outfit completion task, condition satisfaction, and compatibility of completion results.
(2023-11-28T09:30:52Z)
- Efficient Controllable Multi-Task Architectures [85.76598445904374]
We propose a multi-task model consisting of a shared encoder and task-specific decoders where both encoder and decoder channel widths are slimmable.
Our key idea is to control the task importance by varying the capacities of task-specific decoders, while controlling the total computational cost.
This improves overall accuracy by allowing a stronger encoder for a given budget, increases control over computational cost, and delivers high-quality slimmed sub-architectures.
(2023-08-22T19:09:56Z)
- Achieving Long-term Fairness in Submodular Maximization through Randomization [16.33001220320682]
It is important to implement fairness-aware algorithms when dealing with data items that may contain sensitive attributes like race or gender.
We investigate the problem of maximizing a monotone submodular function while meeting group fairness constraints.
(2023-04-10T16:39:19Z)
- Towards Practical Few-Shot Query Sets: Transductive Minimum Description Length Inference [0.0]
We introduce a PrimAl Dual Minimum Description LEngth (PADDLE) formulation, which balances data-fitting accuracy and model complexity for a given few-shot task.
Our constrained MDL-like objective promotes competition among a large set of possible classes, preserving only effective classes that befit better the data of a few-shot task.
(2022-10-26T08:06:57Z)
- Set2Box: Similarity Preserving Representation Learning of Sets [18.85308805841525]
We propose Set2Box, a learning-based approach for compressed representations of sets.
We also design Set2Box+, which yields more concise but more accurate box representations of sets.
Through experiments on 8 real-world datasets, we show that Set2Box+ is (a) Accurate: achieving up to 40.8X smaller estimation error while requiring 60% fewer bits to encode sets, (b) Concise: yielding up to 96.8X more concise representations with similar estimation error, and (c) Versatile: enabling the estimation of four set-similarity measures from a single representation of each set.
(2022-10-07T02:11:12Z)
- Scalable Set Encoding with Universal Mini-Batch Consistency and Unbiased Full Set Gradient Approximation [74.43046004300507]
We propose a Universally MBC class of set functions which can be used in conjunction with arbitrary non-MBC components.
We also propose an efficient MBC training algorithm which gives an unbiased approximation of the full set gradient.
(2022-08-26T02:13:38Z)
- Automatic Mixed-Precision Quantization Search of BERT [62.65905462141319]
Pre-trained language models such as BERT have shown remarkable effectiveness in various natural language processing tasks.
These models usually contain millions of parameters, which prevents them from practical deployment on resource-constrained devices.
We propose an automatic mixed-precision quantization framework designed for BERT that can simultaneously conduct quantization and pruning in a subgroup-wise level.
(2021-12-30T06:32:47Z)
- An Efficient Diagnosis Algorithm for Inconsistent Constraint Sets [68.8204255655161]
We introduce a divide-and-conquer based diagnosis algorithm (FastDiag) which identifies minimal sets of faulty constraints in an over-constrained problem.
We compare FastDiag with the conflict-directed calculation of hitting sets and present an in-depth performance analysis.
(2021-02-17T19:55:42Z)
- Learn to Predict Sets Using Feed-Forward Neural Networks [63.91494644881925]
This paper addresses the task of set prediction using deep feed-forward neural networks.
We present a novel approach for learning to predict sets with unknown permutation and cardinality.
We demonstrate the validity of our set formulations on relevant vision problems.
(2020-01-30T01:52:07Z)