Representing Unordered Data Using Complex-Weighted Multiset Automata
- URL: http://arxiv.org/abs/2001.00610v3
- Date: Fri, 28 Aug 2020 14:11:26 GMT
- Title: Representing Unordered Data Using Complex-Weighted Multiset Automata
- Authors: Justin DeBenedetto, David Chiang
- Abstract summary: We show how the multiset representations of certain existing neural architectures can be viewed as special cases of ours.
Namely, we provide a new theoretical and intuitive justification for the Transformer model's representation of positions using sinusoidal functions.
We extend the DeepSets model to use complex numbers, enabling it to outperform the existing model on an extension of one of their tasks.
- Score: 23.68657135308002
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unordered, variable-sized inputs arise in many settings across multiple
fields. The ability for set- and multiset-oriented neural networks to handle
this type of input has been the focus of much work in recent years. We propose
to represent multisets using complex-weighted multiset automata and show how
the multiset representations of certain existing neural architectures can be
viewed as special cases of ours. Namely, (1) we provide a new theoretical and
intuitive justification for the Transformer model's representation of positions
using sinusoidal functions, and (2) we extend the DeepSets model to use complex
numbers, enabling it to outperform the existing model on an extension of one of
their tasks.
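Claim (1) can be made concrete: the Transformer's sinusoidal position encoding is exactly what a diagonal complex-weighted automaton computes, since reaching position t multiplies unit complex weights t times, and the real/imaginary parts of the resulting state are the cos/sin components. The sketch below is illustrative (not code from the paper; function names and the d_model=16 choice are our own):

```python
import numpy as np

def sinusoidal_encoding(pos, d_model):
    """Standard Transformer position encoding (sin/cos at geometric frequencies)."""
    i = np.arange(d_model // 2)
    freqs = 1.0 / (10000 ** (2 * i / d_model))
    enc = np.empty(d_model)
    enc[0::2] = np.sin(pos * freqs)
    enc[1::2] = np.cos(pos * freqs)
    return enc

def complex_automaton_encoding(pos, d_model):
    """Same values viewed as a diagonal complex-weighted automaton:
    position pos is reached by multiplying unit weights e^{i*freq} pos times."""
    i = np.arange(d_model // 2)
    freqs = 1.0 / (10000 ** (2 * i / d_model))
    weights = np.exp(1j * freqs)   # diagonal transition weights on the unit circle
    state = weights ** pos         # pos-fold multiplication starting from state 1
    enc = np.empty(d_model)
    enc[0::2] = state.imag         # sin component
    enc[1::2] = state.real         # cos component
    return enc

# The two views agree at every position.
assert np.allclose(sinusoidal_encoding(7, 16), complex_automaton_encoding(7, 16))
```

Because the automaton's weights are unit complex numbers, advancing one position is a fixed rotation of the state, which is the standard intuition for why relative positions are linearly recoverable from sinusoidal encodings.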
Related papers
- SM3Det: A Unified Model for Multi-Modal Remote Sensing Object Detection [73.49799596304418]
This paper introduces a new task, Multi-Modal Datasets and Multi-Task Object Detection (M2Det), for remote sensing.
It is designed to accurately detect horizontal or oriented objects from any sensor modality.
This task poses challenges due to 1) the trade-offs involved in managing multi-modal modelling and 2) the complexities of multi-task optimization.
arXiv Detail & Related papers (2024-12-30T02:47:51Z)
- MambaPro: Multi-Modal Object Re-Identification with Mamba Aggregation and Synergistic Prompt [60.10555128510744]
Multi-modal object Re-IDentification (ReID) aims to retrieve specific objects by utilizing complementary image information from different modalities.
Recently, large-scale pre-trained models like CLIP have demonstrated impressive performance in traditional single-modal object ReID tasks.
We introduce a novel framework called MambaPro for multi-modal object ReID.
arXiv Detail & Related papers (2024-12-14T06:33:53Z)
- Multiset Transformer: Advancing Representation Learning in Persistence Diagrams [11.512742322405906]
Multiset Transformer is a neural network that utilizes attention mechanisms specifically designed for multisets as inputs.
The architecture integrates multiset-enhanced attentions with a pool-decomposition scheme, allowing multiplicities to be preserved across equivariant layers.
Experimental results demonstrate that the Multiset Transformer outperforms existing neural network methods in the realm of persistence diagram representation learning.
arXiv Detail & Related papers (2024-11-22T01:38:47Z)
- U3M: Unbiased Multiscale Modal Fusion Model for Multimodal Semantic Segmentation [63.31007867379312]
We introduce U3M: An Unbiased Multiscale Modal Fusion Model for Multimodal Semantics.
We employ feature fusion at multiple scales to ensure the effective extraction and integration of both global and local features.
Experimental results demonstrate that our approach achieves superior performance across multiple datasets.
arXiv Detail & Related papers (2024-05-24T08:58:48Z)
- Generative Multimodal Models are In-Context Learners [60.50927925426832]
We introduce Emu2, a generative multimodal model with 37 billion parameters, trained on large-scale multimodal sequences.
Emu2 exhibits strong multimodal in-context learning abilities, even solving emergent tasks that require on-the-fly reasoning.
arXiv Detail & Related papers (2023-12-20T18:59:58Z)
- Bi-directional Adapter for Multi-modal Tracking [67.01179868400229]
We propose a novel multi-modal visual prompt tracking model based on a universal bi-directional adapter.
We develop a simple but effective light feature adapter to transfer modality-specific information from one modality to another.
Our model achieves superior tracking performance in comparison with both the full fine-tuning methods and the prompt learning-based methods.
arXiv Detail & Related papers (2023-12-17T05:27:31Z)
- Modular Blended Attention Network for Video Question Answering [1.131316248570352]
We present an approach that processes the question with a reusable and composable neural unit.
We have conducted experiments on three commonly used datasets.
arXiv Detail & Related papers (2023-11-02T14:22:17Z)
- Unsupervised Multimodal Language Representations using Convolutional Autoencoders [5.464072883537924]
We propose extracting unsupervised Multimodal Language representations that are universal and can be applied to different tasks.
We map the word-level aligned multimodal sequences to 2-D matrices and then use Convolutional Autoencoders to learn embeddings by combining multiple datasets.
It is also shown that our method is extremely lightweight and can be easily generalized to other tasks and unseen data with a small performance drop and almost the same number of parameters.
arXiv Detail & Related papers (2021-10-06T18:28:07Z)
- Abelian Neural Networks [48.52497085313911]
We first construct a neural network architecture for Abelian group operations and derive a universal approximation property.
We extend it to Abelian semigroup operations using the characterization of associative symmetric polynomials.
We train our models over fixed word embeddings and demonstrate improved performance over the original word2vec.
arXiv Detail & Related papers (2021-02-24T11:52:21Z)
- DynE: Dynamic Ensemble Decoding for Multi-Document Summarization [5.197307534263253]
We propose a simple decoding methodology which ensembles the output of multiple instances of the same model on different inputs.
We obtain state-of-the-art results on several multi-document summarization datasets.
arXiv Detail & Related papers (2020-06-15T20:40:06Z)
- Deep Multi-Modal Sets [29.983311598563542]
Deep Multi-Modal Sets is a technique that represents a collection of features as an unordered set rather than one long ever-growing fixed-size vector.
We demonstrate a scalable, multi-modal framework that reasons over different modalities to learn various types of tasks.
arXiv Detail & Related papers (2020-03-03T15:48:44Z)
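Several of the works above (Deep Multi-Modal Sets, and the DeepSets model discussed in the main abstract) rest on the same primitive: pooling an unordered collection of feature vectors with a permutation-invariant operation, so the representation does not depend on input order. A minimal, generic sketch (not code from any of the listed papers; names are illustrative):

```python
import numpy as np

def set_pool(features, op="sum"):
    """Permutation-invariant pooling over an unordered collection of
    feature vectors: the result is identical under any reordering."""
    stack = np.stack(features)
    return stack.sum(axis=0) if op == "sum" else stack.max(axis=0)

feats = [np.array([1.0, 2.0]), np.array([3.0, 0.0]), np.array([0.5, 0.5])]
shuffled = [feats[2], feats[0], feats[1]]
assert np.allclose(set_pool(feats), set_pool(shuffled))  # order does not matter
```

The main paper's contribution (2) can be read as replacing the real-valued vectors fed into such a pool with complex-valued ones, which the authors report lets the model track information that real-valued sums discard.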
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.