Human Interpretable AI: Enhancing Tsetlin Machine Stochasticity with Drop Clause
- URL: http://arxiv.org/abs/2105.14506v1
- Date: Sun, 30 May 2021 11:29:49 GMT
- Title: Human Interpretable AI: Enhancing Tsetlin Machine Stochasticity with Drop Clause
- Authors: Jivitesh Sharma, Rohan Yadav, Ole-Christoffer Granmo and Lei Jiao
- Abstract summary: We introduce a novel variant of the Tsetlin machine (TM) that randomly drops clauses, the key learning elements of a TM.
We observe a +2% to +4% increase in accuracy and 2x to 4x faster learning.
This is the first time an interpretable machine learning algorithm has been used to produce pixel-level human-interpretable results on CIFAR10.
- Score: 15.981632159103183
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In this article, we introduce a novel variant of the Tsetlin machine (TM)
that randomly drops clauses, the key learning elements of a TM. In effect, TM
with drop clause ignores a random selection of the clauses in each epoch,
selected according to a predefined probability. In this way, additional
stochasticity is introduced in the learning phase of TM. Along with producing
more distinct and well-structured patterns that improve the performance, we
also show that dropping clauses increases learning robustness. To explore the
effects clause dropping has on accuracy, training time, and interpretability,
we conduct extensive experiments on various benchmark datasets in natural
language processing (NLP) (IMDb and SST2) as well as computer vision (MNIST and
CIFAR10). In brief, we observe a +2% to +4% increase in accuracy and 2x to
4x faster learning. We further employ the Convolutional TM to document
interpretable results on the CIFAR10 dataset. To the best of our knowledge,
this is the first time an interpretable machine learning algorithm has been
used to produce pixel-level human-interpretable results on CIFAR10. Also,
unlike previous interpretable methods that focus on attention visualisation or
gradient interpretability, we show that the TM is a more general interpretable
method. That is, by producing rule-based propositional logic expressions that
are human-interpretable, the TM can explain how it classifies a
particular instance at the pixel level for computer vision and at the word
level for NLP.
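To make the mechanism concrete, below is a minimal sketch of per-epoch clause dropping. The simplified training-loop interface (`tm_update`) and all names are illustrative assumptions, not the authors' implementation; the point is only that each epoch samples a Bernoulli mask over clauses with a predefined drop probability and restricts feedback to the surviving clauses.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_epoch(tm_update, X, y, n_clauses, drop_p=0.25):
    """One training epoch with drop clause.

    tm_update(x, target, active) is a hypothetical stand-in for the
    TM feedback step, restricted to clauses where active[j] is True.
    drop_p is the predefined probability of ignoring a clause.
    """
    # Each clause is dropped independently with probability drop_p,
    # so a fresh random subset of clauses is ignored every epoch.
    active = rng.random(n_clauses) >= drop_p
    for x, target in zip(X, y):
        tm_update(x, target, active)
    return active

# Demo with a no-op update, just to show the per-epoch masking.
mask = train_epoch(lambda x, t, active: None, X=[[0, 1]], y=[1],
                   n_clauses=10, drop_p=0.3)
print(mask.sum(), "of 10 clauses active this epoch")
```

Per the abstract, the extra stochasticity applies in the learning phase; at evaluation the full set of clauses would vote as usual.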
Related papers
- Pruning Literals for Highly Efficient Explainability at Word Level [13.249876381579158]
The Tsetlin Machine (TM) is promising because of its capability of providing word-level explanations using propositional logic.
In this paper, we design a post-hoc pruning of clauses that eliminates randomly placed literals in the clauses (sketched below).
Experiments on the publicly available YELP-HAT dataset demonstrate that the pruned TM's attention map aligns more closely with the human attention map than the vanilla TM's.
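A minimal sketch of what such post-hoc literal pruning could look like. The clause representation and the score-based pruning criterion are assumptions for illustration; the paper's actual criterion may differ.

```python
def prune_clause(literals, importance, threshold=0.1):
    """Return the clause with low-importance literals removed.

    literals: word literals in one learned clause.
    importance: hypothetical per-literal scores (e.g., how often the
    literal is decisive for the clause output on training data).
    """
    return [lit for lit, score in zip(literals, importance)
            if score >= threshold]

# Example: literals that rarely matter ("the", "a") are pruned away.
clause = ["good", "the", "excellent", "a"]
scores = [0.9, 0.02, 0.8, 0.01]
print(prune_clause(clause, scores))  # ['good', 'excellent']
```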
arXiv Detail & Related papers (2024-11-07T09:28:38Z)
- Exploring Simple Open-Vocabulary Semantic Segmentation [7.245983878396646]
Open-vocabulary semantic segmentation models aim to accurately assign a semantic label to each pixel in an image from a set of arbitrary open-vocabulary texts.
In this paper, we introduce S-Seg, a novel model that can achieve surprisingly strong performance without depending on the components that prior approaches rely on.
arXiv Detail & Related papers (2024-01-22T18:59:29Z)
- Probabilistic Transformer: A Probabilistic Dependency Model for Contextual Word Representation [52.270712965271656]
We propose a new model of contextual word representation, not from a neural perspective, but from a purely syntactic and probabilistic perspective.
We find that the graph of our model resembles transformers, with correspondences between dependencies and self-attention.
Experiments show that our model performs competitively with transformers on small to medium-sized datasets.
arXiv Detail & Related papers (2023-11-26T06:56:02Z)
- Scalable Learning of Latent Language Structure With Logical Offline Cycle Consistency [71.42261918225773]
Conceptually, LOCCO can be viewed as a form of self-learning where the semantic parser being trained is used to generate annotations for unlabeled text.
As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model.
arXiv Detail & Related papers (2023-05-31T16:47:20Z)
- Interpretable Sentence Representation with Variational Autoencoders and Attention [0.685316573653194]
We develop methods to enhance the interpretability of recent representation learning techniques in natural language processing (NLP).
We leverage Variational Autoencoders (VAEs) due to their efficiency in relating observations to latent generative factors.
We build two models with inductive bias to separate information in latent representations into understandable concepts without annotated data.
arXiv Detail & Related papers (2023-05-04T13:16:15Z)
- TagCLIP: Improving Discrimination Ability of Open-Vocabulary Semantic Segmentation [53.974228542090046]
Contrastive Language-Image Pre-training (CLIP) has recently shown great promise in pixel-level zero-shot learning tasks.
Existing approaches utilizing CLIP's text and patch embeddings to generate semantic masks often misidentify input pixels from unseen classes.
We propose TagCLIP (Trusty-aware guided CLIP) to address this issue.
arXiv Detail & Related papers (2023-04-15T12:52:23Z)
- Prompt Consistency for Zero-Shot Task Generalization [118.81196556175797]
In this paper, we explore methods to utilize unlabeled data to improve zero-shot performance.
Specifically, we take advantage of the fact that multiple prompts can be used to specify a single task, and propose to regularize prompt consistency.
Our approach outperforms the state-of-the-art zero-shot learner, T0, on 9 out of 11 datasets across 4 NLP tasks by up to 10.6 absolute points in terms of accuracy.
arXiv Detail & Related papers (2022-04-29T19:18:37Z)
- Revisiting Self-Training for Few-Shot Learning of Language Model [61.173976954360334]
Unlabeled data carry rich task-relevant information and have proven useful for few-shot learning of language models.
In this work, we revisit the self-training technique for language model fine-tuning and present a state-of-the-art prompt-based few-shot learner, SFLM.
arXiv Detail & Related papers (2021-10-04T08:51:36Z)
- Distributed Word Representation in Tsetlin Machine [14.62945824459286]
Tsetlin Machine (TM) is an interpretable pattern recognition algorithm based on propositional logic.
We propose a novel way of using pre-trained word representations for TM.
The approach significantly enhances TM performance and maintains interpretability at the same time.
arXiv Detail & Related papers (2021-04-14T14:48:41Z)
- Syntactic Structure Distillation Pretraining For Bidirectional Encoders [49.483357228441434]
We introduce a knowledge distillation strategy for injecting syntactic biases into BERT pretraining.
We distill the approximate marginal distribution over words in context from the syntactic LM.
Our findings demonstrate the benefits of syntactic biases, even in representation learners that exploit large amounts of data.
arXiv Detail & Related papers (2020-05-27T16:44:01Z)