Distributed Word Representation in Tsetlin Machine
- URL: http://arxiv.org/abs/2104.06901v1
- Date: Wed, 14 Apr 2021 14:48:41 GMT
- Title: Distributed Word Representation in Tsetlin Machine
- Authors: Rohan Kumar Yadav, Lei Jiao, Ole-Christoffer Granmo, and Morten
Goodwin
- Abstract summary: Tsetlin Machine (TM) is an interpretable pattern recognition algorithm based on propositional logic.
We propose a novel way of using pre-trained word representations for TM.
The approach significantly enhances TM performance and maintains interpretability at the same time.
- Score: 14.62945824459286
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tsetlin Machine (TM) is an interpretable pattern recognition algorithm based
on propositional logic. The algorithm has demonstrated competitive performance
in many Natural Language Processing (NLP) tasks, including sentiment analysis,
text classification, and Word Sense Disambiguation (WSD). To obtain human-level
interpretability, legacy TM employs Boolean input features such as bag-of-words
(BOW). However, the BOW representation makes it difficult to use any
pre-trained information, for instance, word2vec and GloVe word representations.
This restriction has constrained the performance of TM compared to deep neural
networks (DNNs) in NLP. To reduce the performance gap, in this paper, we
propose a novel way of using pre-trained word representations for TM. The
approach significantly enhances the TM performance and maintains
interpretability at the same time. We achieve this by extracting semantically
related words from pre-trained word representations as input features to the
TM. Our experiments show that the accuracy of the proposed approach is
significantly higher than the previous BOW-based TM, reaching the level of
DNN-based models.
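The core step described in the abstract, expanding the Boolean bag-of-words with semantically related words taken from pre-trained embeddings before feeding it to the TM, can be pictured with the minimal sketch below. It assumes gensim's downloadable GloVe vectors and the pyTsetlinMachine package; the neighbour count k, the TM hyperparameters, and the toy documents are illustrative placeholders rather than the authors' settings.

```python
# Sketch: expand Boolean bag-of-words features with semantically related
# words drawn from pre-trained GloVe vectors, then train a Tsetlin Machine.
# Assumes gensim (with the gensim-data "glove-wiki-gigaword-100" vectors) and
# pyTsetlinMachine; k and the TM hyperparameters are illustrative only.
import numpy as np
import gensim.downloader as api
from pyTsetlinMachine.tm import MultiClassTsetlinMachine

glove = api.load("glove-wiki-gigaword-100")  # pre-trained word vectors

def expand_tokens(tokens, k=3):
    """Add the k nearest neighbours of each token as extra Boolean features."""
    expanded = set(tokens)
    for tok in tokens:
        if tok in glove:
            expanded.update(w for w, _ in glove.most_similar(tok, topn=k))
    return expanded

def booleanize(docs, vocab):
    """Map each document to a 0/1 vector over the expanded vocabulary."""
    index = {w: i for i, w in enumerate(vocab)}
    X = np.zeros((len(docs), len(vocab)), dtype=np.uint32)
    for row, doc in enumerate(docs):
        for w in expand_tokens(doc):
            if w in index:
                X[row, index[w]] = 1
    return X

# Toy usage (documents and labels are placeholders, not the paper's data).
docs = [["great", "movie"], ["terrible", "plot"]]
labels = np.array([1, 0], dtype=np.uint32)
vocab = sorted({w for d in docs for w in expand_tokens(d)})
X = booleanize(docs, vocab)

tm = MultiClassTsetlinMachine(100, 15, 3.9)  # clauses, T, s (illustrative)
tm.fit(X, labels, epochs=50)
print(tm.predict(X))
```

The only change relative to a plain BOW pipeline is the expand_tokens step; everything downstream of the Boolean feature matrix remains standard, interpretable TM machinery.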
Related papers
- Bit Cipher -- A Simple yet Powerful Word Representation System that Integrates Efficiently with Language Models [4.807347156077897]
Bit-cipher is a word representation system that eliminates the need for backpropagation and hyper-efficient dimensionality reduction techniques.
We perform probing experiments on part-of-speech (POS) tagging and named entity recognition (NER) to assess bit-cipher's competitiveness with classic embeddings.
By replacing embedding layers with cipher embeddings, our experiments illustrate the notable efficiency of cipher in accelerating the training process and attaining better optima.
arXiv Detail & Related papers (2023-11-18T08:47:35Z)
- Scalable Learning of Latent Language Structure With Logical Offline Cycle Consistency [71.42261918225773]
Conceptually, LOCCO can be viewed as a form of self-learning where the semantic parser being trained is used to generate annotations for unlabeled text.
As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model.
arXiv Detail & Related papers (2023-05-31T16:47:20Z)
- Evaluating Pretrained Transformer Models for Entity Linking in Task-Oriented Dialog [1.4524096882720263]
We evaluate different pretrained transformer models (PTMs) for understanding short phrases of text.
Several of the PTMs produce sub-par results when compared to traditional techniques.
We find that some of their shortcomings can be addressed by using PTMs fine-tuned for text-similarity tasks.
arXiv Detail & Related papers (2021-12-15T18:20:12Z)
- Contextualized Semantic Distance between Highly Overlapped Texts [85.1541170468617]
Overlapping frequently occurs in paired texts in natural language processing tasks like text editing and semantic similarity evaluation.
This paper aims to address the issue with a mask-and-predict strategy.
We take the words in the longest common sequence as neighboring words and use masked language modeling (MLM) to predict the distributions on their positions.
Experiments on Semantic Textual Similarity show the resulting neighboring distribution divergence (NDD) measure to be more sensitive to various semantic differences, especially on highly overlapped paired texts.
arXiv Detail & Related papers (2021-10-04T03:59:15Z)
- Comparing Text Representations: A Theory-Driven Approach [2.893558866535708]
We adapt general tools from computational learning theory to fit the specific characteristics of text datasets.
We present a method to evaluate the compatibility between representations and tasks.
This method provides a calibrated, quantitative measure of the difficulty of a classification-based NLP task.
arXiv Detail & Related papers (2021-09-15T17:48:19Z)
- Human Interpretable AI: Enhancing Tsetlin Machine Stochasticity with Drop Clause [15.981632159103183]
We introduce a novel variant of the Tsetlin machine (TM) that randomly drops clauses, the key learning elements of a TM; a minimal sketch of the clause-dropping idea appears after this list.
We observe a +2% to +4% increase in accuracy and 2x to 4x faster learning.
This is the first time an interpretable machine learning algorithm has been used to produce pixel-level human-interpretable results.
arXiv Detail & Related papers (2021-05-30T11:29:49Z)
- Dependency Parsing based Semantic Representation Learning with Graph Neural Network for Enhancing Expressiveness of Text-to-Speech [49.05471750563229]
We propose a semantic representation learning method based on graph neural network, considering dependency relations of a sentence.
We show that our proposed method outperforms the baseline using vanilla BERT features on both the LJSpeech and Blizzard Challenge 2013 datasets.
arXiv Detail & Related papers (2021-04-14T13:09:51Z)
- Improving Text Generation with Student-Forcing Optimal Transport [122.11881937642401]
We propose using optimal transport (OT) to match the sequences generated in training and testing modes.
An extension is also proposed to improve the OT learning, based on the structural and contextual information of the text sequences.
The effectiveness of the proposed method is validated on machine translation, text summarization, and text generation tasks.
arXiv Detail & Related papers (2020-10-12T19:42:25Z)
- Learning Variational Word Masks to Improve the Interpretability of Neural Text Classifiers [21.594361495948316]
A new line of work on improving model interpretability has just started, and many existing methods require either prior information or human annotations as additional inputs in training.
We propose the variational word mask (VMASK) method to automatically learn task-specific important words and reduce irrelevant information on classification, which ultimately improves the interpretability of model predictions.
arXiv Detail & Related papers (2020-10-01T20:02:43Z)
- BURT: BERT-inspired Universal Representation from Twin Structure [89.82415322763475]
BURT (BERT inspired Universal Representation from Twin Structure) is capable of generating universal, fixed-size representations for input sequences of any granularity.
Our proposed BURT adopts a Siamese network, learning sentence-level representations from a natural language inference dataset and word/phrase-level representations from a paraphrasing dataset.
We evaluate BURT across different granularities of text similarity tasks, including STS tasks, SemEval2013 Task 5(a) and some commonly used word similarity tasks.
arXiv Detail & Related papers (2020-04-29T04:01:52Z)
- Towards Accurate Scene Text Recognition with Semantic Reasoning Networks [52.86058031919856]
We propose a novel end-to-end trainable framework named semantic reasoning network (SRN) for accurate scene text recognition.
A global semantic reasoning module (GSRM) is introduced to capture global semantic context through multi-way parallel transmission.
Results on 7 public benchmarks, including regular text, irregular text and non-Latin long text, verify the effectiveness and robustness of the proposed method.
arXiv Detail & Related papers (2020-03-27T09:19:25Z)
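As referenced in the drop-clause entry above, that TM variant's stochasticity comes from ignoring a random subset of clauses while learning. The fragment below is a minimal NumPy sketch of only that masking step applied to clause votes; the clause outputs, polarities, and drop probability are illustrative placeholders, not the authors' implementation.

```python
# Sketch of the "drop clause" idea: a random subset of clauses is ignored
# when the class sum (the TM's vote tally) is computed for an epoch.
# Clause outputs and polarities are illustrative placeholders; a real TM
# would compute clause outputs as conjunctions of included literals.
import numpy as np

rng = np.random.default_rng(0)

n_clauses = 10
clause_outputs = rng.integers(0, 2, size=n_clauses)          # 0/1 firing pattern
polarities = np.where(np.arange(n_clauses) % 2 == 0, 1, -1)  # +/- voting clauses

def class_sum(outputs, polarities, drop_p=0.0):
    """Sum of clause votes, with each clause dropped independently with prob drop_p."""
    keep = rng.random(len(outputs)) >= drop_p
    return int(np.sum(outputs * polarities * keep))

print("full vote   :", class_sum(clause_outputs, polarities, drop_p=0.0))
print("with dropout:", class_sum(clause_outputs, polarities, drop_p=0.25))
```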
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.