Related papers: Using Holographically Compressed Embeddings in Question Answering

Using Holographically Compressed Embeddings in Question Answering

URL: http://arxiv.org/abs/2007.07287v1
Date: Tue, 14 Jul 2020 18:29:49 GMT
Title: Using Holographically Compressed Embeddings in Question Answering
Authors: Salvador E. Barbosa
Abstract summary: This research employs holographic compression of pre-trained embeddings to represent a token, its part-of-speech, and named entity type. The implementation, in a modified question answering recurrent deep learning network, shows that semantic relationships are preserved, and yields strong performance.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Word vector representations are central to deep learning natural language processing models. Many forms of these vectors, known as embeddings, exist, including word2vec and GloVe. Embeddings are trained on large corpora and learn the word's usage in context, capturing the semantic relationship between words. However, the semantics from such training are at the level of distinct words (known as word types), and can be ambiguous when, for example, a word type can be either a noun or a verb. In question answering, parts-of-speech and named entity types are important, but encoding these attributes in neural models expands the size of the input. This research employs holographic compression of pre-trained embeddings, to represent a token, its part-of-speech, and named entity type, in the same dimension as representing only the token. The implementation, in a modified question answering recurrent deep learning network, shows that semantic relationships are preserved, and yields strong performance.

Related papers

Conversational Semantic Parsing using Dynamic Context Graphs [68.72121830563906]
We consider the task of conversational semantic parsing over general purpose knowledge graphs (KGs) with millions of entities, and thousands of relation-types. We focus on models which are capable of interactively mapping user utterances into executable logical forms.
arXiv Detail & Related papers (2023-05-04T16:04:41Z)
Tsetlin Machine Embedding: Representing Words Using Logical Expressions [10.825099126920028]
We introduce a Tsetlin Machine-based autoencoder that learns logical clauses self-supervised. The clauses consist of contextual words like "black," "cup," and "hot" to define other words like "coffee" We evaluate our embedding approach on several intrinsic and extrinsic benchmarks, outperforming GLoVe on six classification tasks.
arXiv Detail & Related papers (2023-01-02T15:02:45Z)
Charformer: Fast Character Transformers via Gradient-based Subword Tokenization [50.16128796194463]
We propose a new model inductive bias that learns a subword tokenization end-to-end as part of the model. We introduce a soft gradient-based subword tokenization module (GBST) that automatically learns latent subword representations from characters. We additionally introduce Charformer, a deep Transformer model that integrates GBST and operates on the byte level.
arXiv Detail & Related papers (2021-06-23T22:24:14Z)
Deriving Word Vectors from Contextualized Language Models using Topic-Aware Mention Selection [46.97185212695267]
We propose a method for learning word representations that follows this basic strategy. We take advantage of contextualized language models (CLMs) rather than bags of word vectors to encode contexts. We show that this simple strategy leads to high-quality word vectors, which are more predictive of semantic properties than word embeddings and existing CLM-based strategies.
arXiv Detail & Related papers (2021-06-15T08:02:42Z)
Can a Fruit Fly Learn Word Embeddings? [16.280120177501733]
The fruit fly brain is one of the best studied systems in neuroscience. We show that a network motif can learn semantic representations of words and can generate both static and context-dependent word embeddings. It is shown that not only can the fruit fly network motif achieve performance comparable to existing methods in NLP, but, additionally, it uses only a fraction of the computational resources.
arXiv Detail & Related papers (2021-01-18T05:41:50Z)
R$^2$-Net: Relation of Relation Learning Network for Sentence Semantic Matching [58.72111690643359]
We propose a Relation of Relation Learning Network (R2-Net) for sentence semantic matching. We first employ BERT to encode the input sentences from a global perspective. Then a CNN-based encoder is designed to capture keywords and phrase information from a local perspective. To fully leverage labels for better relation information extraction, we introduce a self-supervised relation of relation classification task.
arXiv Detail & Related papers (2020-12-16T13:11:30Z)
Unsupervised Distillation of Syntactic Information from Contextualized Word Representations [62.230491683411536]
We tackle the task of unsupervised disentanglement between semantics and structure in neural language representations. To this end, we automatically generate groups of sentences which are structurally similar but semantically different. We demonstrate that our transformation clusters vectors in space by structural properties, rather than by lexical semantics.
arXiv Detail & Related papers (2020-10-11T15:13:18Z)
Morphological Skip-Gram: Using morphological knowledge to improve word representation [2.0129974477913457]
We propose a new method for training word embeddings by replacing the FastText bag of character n-grams for a bag of word morphemes. The results show a competitive performance compared to FastText.
arXiv Detail & Related papers (2020-07-20T12:47:36Z)
Supervised Understanding of Word Embeddings [1.160208922584163]
We have obtained supervised projections in the form of the linear keyword-level classifiers on word embeddings. We have shown that the method creates interpretable projections of original embedding dimensions.
arXiv Detail & Related papers (2020-06-23T20:13:42Z)
Text Classification with Few Examples using Controlled Generalization [58.971750512415134]
Current practice relies on pre-trained word embeddings to map words unseen in training to similar seen ones. Our alternative begins with sparse pre-trained representations derived from unlabeled parsed corpora. We show that a feed-forward network over these vectors is especially effective in low-data scenarios.
arXiv Detail & Related papers (2020-05-18T06:04:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.