Application of Hierarchical Temporal Memory Theory for Document Categorization
- URL: http://arxiv.org/abs/2112.14820v1
- Date: Wed, 29 Dec 2021 20:34:03 GMT
- Title: Application of Hierarchical Temporal Memory Theory for Document Categorization
- Authors: Deven Shah, Pinak Ghate, Manali Paranjape, Amit Kumar
- Abstract summary: The study aims to provide an alternative framework for document categorization using the Spatial Pooler learning algorithm in the HTM Theory.
The results show that HTM theory, although still in its nascent stages, performs on par with most popular machine learning based classifiers.
- Score: 4.506399699599753
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The current work studies the performance of the Hierarchical Temporal Memory (HTM) theory for automated classification of text as well as documents. HTM is a biologically inspired theory based on the working principles of the human neocortex. The study provides an alternative framework for document categorization using the Spatial Pooler learning algorithm of the HTM theory. As HTM accepts only a stream of binary data as input, the Latent Semantic Indexing (LSI) technique is used to extract the top features from the input and convert them into binary format. The Spatial Pooler algorithm then converts the binary input into sparse patterns, with similar input text producing overlapping spatial patterns, which makes it easy to classify the patterns into categories. The results obtained show that HTM theory, although still in its nascent stages, performs on par with most popular machine learning based classifiers.
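A minimal sketch of this pipeline is given below, assuming scikit-learn for the LSI step (TF-IDF followed by truncated SVD). The overlap-based classifier is a toy stand-in for the HTM Spatial Pooler, and all documents, labels, and parameters are illustrative placeholders rather than the paper's setup.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

train_docs = ["stock markets fell sharply", "the team won the final match"]
train_labels = ["finance", "sports"]
test_docs = ["markets rallied after the crash"]

# LSI: TF-IDF followed by truncated SVD extracts the top latent features.
tfidf = TfidfVectorizer().fit(train_docs + test_docs)
lsi = TruncatedSVD(n_components=2).fit(tfidf.transform(train_docs + test_docs))

def binarize(docs, top_k=1):
    # Keep only the top-k LSI features per document as active bits,
    # mimicking the binary input stream that HTM expects.
    dense = lsi.transform(tfidf.transform(docs))
    bits = np.zeros_like(dense, dtype=bool)
    top = np.argsort(-np.abs(dense), axis=1)[:, :top_k]
    np.put_along_axis(bits, top, True, axis=1)
    return bits

train_bits, test_bits = binarize(train_docs), binarize(test_docs)

# Classify by overlap: similar inputs share active bits, just as similar
# text yields overlapping sparse patterns under the Spatial Pooler.
for row in test_bits:
    overlaps = (train_bits & row).sum(axis=1)
    print(train_labels[int(np.argmax(overlaps))])
```

The property being mimicked is that similar inputs share active bits, so category assignment reduces to picking the training pattern with maximum overlap.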
Related papers
- Prototypical Hash Encoding for On-the-Fly Fine-Grained Category Discovery [65.16724941038052]
Category-aware Prototype Generation (CPG) and Discriminative Category Encoding (DCE) are proposed.
CPG enables the model to fully capture the intra-category diversity by representing each category with multiple prototypes.
DCE boosts the discrimination ability of hash code with the guidance of the generated category prototypes.
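A toy illustration of the multi-prototype idea follows; k-means per class is an assumed stand-in for CPG's actual prototype-generation procedure, and the embeddings are random placeholders.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
feats = {  # hypothetical embedded samples for two fine-grained categories
    "sparrow": rng.normal(0.0, 1.0, size=(40, 8)),
    "finch": rng.normal(3.0, 1.0, size=(40, 8)),
}

# Several prototypes per category capture intra-category diversity.
prototypes = {c: KMeans(n_clusters=3, n_init=10).fit(x).cluster_centers_
              for c, x in feats.items()}

def classify(z):
    # Assign to the category whose nearest prototype is closest.
    return min(prototypes, key=lambda c: np.min(
        np.linalg.norm(prototypes[c] - z, axis=1)))

print(classify(rng.normal(3.0, 1.0, size=8)))  # -> 'finch' (very likely)
```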
arXiv Detail & Related papers (2024-10-24T23:51:40Z)
- Quantum-inspired classification via efficient simulation of Helstrom measurement [0.3749861135832073]
The Helstrom measurement (HM) is known to be the optimal strategy for distinguishing non-orthogonal quantum states with minimum error.
We present an efficient simulation method for an arbitrary number of copies by utilizing the relationship between HM and state fidelity.
Our method reveals that the classification performance does not improve monotonically with the number of data copies.
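For intuition, a numeric sketch of the textbook Helstrom bound for two equiprobable pure states is shown below; the n-copy overlap F**n and the closed-form error are standard results, while the paper's simulation method is more general than this.

```python
import numpy as np

def helstrom_error(fidelity, n_copies):
    # Minimum error probability for distinguishing two equiprobable
    # pure states with overlap `fidelity`, given n_copies copies.
    f_n = fidelity ** n_copies              # overlap of the n-copy states
    return 0.5 * (1.0 - np.sqrt(1.0 - f_n ** 2))

for n in range(1, 6):
    print(n, helstrom_error(0.9, n))
```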
arXiv Detail & Related papers (2024-03-22T15:59:21Z)
- ConTextual Mask Auto-Encoder for Dense Passage Retrieval [49.49460769701308]
CoT-MAE is a simple yet effective generative pre-training method for dense passage retrieval.
It learns to compress the sentence semantics into a dense vector through self-supervised and context-supervised masked auto-encoding.
We conduct experiments on large-scale passage retrieval benchmarks and show considerable improvements over strong baselines.
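The scoring side of dense retrieval is simple once texts are encoded; a bare-bones sketch with placeholder vectors is shown below (the CoT-MAE encoder itself is omitted).

```python
import numpy as np

rng = np.random.default_rng(1)
passage_vecs = rng.normal(size=(1000, 768))  # hypothetical encoded corpus
query_vec = rng.normal(size=768)             # hypothetical encoded query

scores = passage_vecs @ query_vec            # inner-product relevance
print(np.argsort(-scores)[:10])              # top-10 passage indices
```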
arXiv Detail & Related papers (2022-08-16T11:17:22Z)
- Linear Temporal Logic Modulo Theories over Finite Traces (Extended Version) [72.38188258853155]
This paper studies Linear Temporal Logic over Finite Traces (LTLf) in which proposition letters are replaced with first-order formulas interpreted over arbitrary theories.
The resulting logic, called LTLf Modulo Theories (LTLfMT), is semi-decidable.
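One way to see the semi-decidability is that satisfiability can be searched for by enumerating trace lengths and handing each bounded encoding to an SMT solver; unsatisfiable formulas may make the search run forever. The Z3-based sketch below is an assumed illustration, not the paper's actual procedure.

```python
from z3 import Int, Solver, Or, sat

def satisfiable_at_length(k):
    # One integer variable per trace step.
    x = [Int(f"x_{i}") for i in range(k)]
    s = Solver()
    s.add(x[0] == 0)                                  # initial condition
    for i in range(k - 1):                            # G: bounded increase
        s.add(x[i + 1] > x[i], x[i + 1] <= x[i] + 3)
    s.add(Or([xi == 10 for xi in x]))                 # F(x = 10)
    return s.check() == sat

k = 1
while not satisfiable_at_length(k):                   # enumerate lengths
    k += 1
print("satisfiable with a trace of length", k)        # -> 5
```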
arXiv Detail & Related papers (2022-04-28T17:57:33Z)
- Autoregressive Search Engines: Generating Substrings as Document Identifiers [53.0729058170278]
Autoregressive language models are emerging as the de facto standard for generating answers.
Previous work has explored ways to partition the search space into hierarchical structures.
In this work we propose an alternative that does not force any structure on the search space: using all n-grams in a passage as its possible identifiers.
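A toy sketch of that identifier idea: index every n-gram of every passage, so any n-gram points back to the passages containing it. The lookup n-gram here is hand-picked, whereas the paper generates it autoregressively.

```python
from collections import defaultdict

passages = {
    0: "the spatial pooler converts binary input into sparse patterns",
    1: "autoregressive models generate answers token by token",
}

def ngrams(text, n=3):
    toks = text.split()
    return [" ".join(toks[i:i + n]) for i in range(len(toks) - n + 1)]

index = defaultdict(set)                  # n-gram -> passage ids
for pid, text in passages.items():
    for gram in ngrams(text):
        index[gram].add(pid)

print(index["generate answers token"])    # -> {1}
```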
arXiv Detail & Related papers (2022-04-22T10:45:01Z)
- HFT-ONLSTM: Hierarchical and Fine-Tuning Multi-label Text Classification [7.176984223240199]
Hierarchical multi-label text classification (HMTC) with higher accuracy over large sets of closely related categories has become a challenging problem.
We present a hierarchical and fine-tuning approach based on the Ordered Neurons LSTM (ONLSTM) network, abbreviated as HFT-ONLSTM, for more accurate level-by-level HMTC.
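A hedged sketch of the level-by-level control flow is given below: a parent-level classifier routes each document to a child-level classifier. Scikit-learn models stand in for the ONLSTM encoder, and all data are placeholders; the actual HFT-ONLSTM is far richer.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

docs = ["stocks fell", "bonds rose", "batsman scored runs", "keeper saved the shot"]
parents = ["finance", "finance", "sports", "sports"]
children = ["equities", "fixed-income", "cricket", "football"]

vec = TfidfVectorizer().fit(docs)
X = vec.transform(docs)

parent_clf = LogisticRegression().fit(X, parents)
child_clf = {  # one child-level classifier per parent category
    p: LogisticRegression().fit(
        X[[i for i, q in enumerate(parents) if q == p]],
        [c for c, q in zip(children, parents) if q == p])
    for p in set(parents)
}

x = vec.transform(["the keeper saved again"])
p = parent_clf.predict(x)[0]               # level 1: parent category
print(p, child_clf[p].predict(x)[0])       # level 2: child category
```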
arXiv Detail & Related papers (2022-04-18T00:57:46Z)
- Categorical Representation Learning and RG flow operators for algorithmic classifiers [0.7519268719195278]
We construct a new natural language processing architecture, called the RG-flow categorifier (RG categorifier for short), which is capable of data classification and generation in all layers.
In particular, we apply the RG categorifier to genomic sequences of flu viruses and show that it can extract information from the given sequences.
arXiv Detail & Related papers (2022-03-15T15:04:51Z)
- Comparative Study of Long Document Classification [0.0]
We revisit long document classification using standard machine learning approaches.
We benchmark approaches ranging from simple Naive Bayes to complex BERT on six standard text classification datasets.
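As a concrete reference point, the simplest end of that range looks like the scikit-learn pipeline below; the documents and labels are placeholders, not one of the six benchmark datasets.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_docs = ["long report on quarterly earnings", "match summary and final scores"]
train_labels = ["business", "sports"]

clf = make_pipeline(TfidfVectorizer(), MultinomialNB())  # Naive Bayes baseline
clf.fit(train_docs, train_labels)
print(clf.predict(["earnings rose this quarter"]))       # -> ['business']
```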
arXiv Detail & Related papers (2021-11-01T04:51:51Z)
- Improving Pretrained Models for Zero-shot Multi-label Text Classification through Reinforced Label Hierarchy Reasoning [18.531022315325583]
Exploiting label hierarchies has become a promising approach to tackling the zero-shot multi-label text classification problem.
We propose a Reinforced Label Hierarchy Reasoning (RLHR) approach to encourage interdependence among labels in the hierarchies during training.
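RLHR itself is a reinforcement-learning method; as a much simpler illustration of label interdependence in a hierarchy, the sketch below merely propagates any predicted label up to its ancestors so predictions respect the hierarchy.

```python
parent = {"jazz": "music", "music": "arts", "arts": None}  # toy hierarchy

def enforce_hierarchy(predicted):
    labels = set(predicted)
    for lbl in predicted:
        while parent.get(lbl) is not None:  # walk up to the root
            lbl = parent[lbl]
            labels.add(lbl)
    return labels

print(enforce_hierarchy({"jazz"}))          # -> {'jazz', 'music', 'arts'}
```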
arXiv Detail & Related papers (2021-04-04T19:14:09Z)
- Train your classifier first: Cascade Neural Networks Training from upper layers to lower layers [54.47911829539919]
We develop a novel top-down training method which can be viewed as an algorithm for searching for high-quality classifiers.
We tested this method on automatic speech recognition (ASR) tasks and language modelling tasks.
The proposed method consistently improves recurrent neural network ASR models on Wall Street Journal, self-attention ASR models on Switchboard, and AWD-LSTM language models on WikiText-2.
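A hedged PyTorch sketch of the top-down schedule follows: fit the upper (classifier) layers first with the lower layers frozen, then unfreeze and continue. This shows the training order only, not the paper's exact cascade algorithm, and the model and data are toys.

```python
import torch
import torch.nn as nn

lower = nn.Sequential(nn.Linear(16, 32), nn.ReLU())   # lower layers
upper = nn.Linear(32, 4)                              # classifier on top
model = nn.Sequential(lower, upper)
x, y = torch.randn(64, 16), torch.randint(0, 4, (64,))
loss_fn = nn.CrossEntropyLoss()

def train(params, steps):
    opt = torch.optim.Adam(params, lr=1e-2)
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

for p in lower.parameters():        # phase 1: train the classifier only
    p.requires_grad = False
train(upper.parameters(), steps=50)

for p in lower.parameters():        # phase 2: train the whole network
    p.requires_grad = True
train(model.parameters(), steps=50)
```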
arXiv Detail & Related papers (2021-02-09T08:19:49Z)
- Revisiting LSTM Networks for Semi-Supervised Text Classification via Mixed Objective Function [106.69643619725652]
We develop a training strategy that allows even a simple BiLSTM model, when trained with cross-entropy loss, to achieve competitive results.
We report state-of-the-art results for text classification task on several benchmark datasets.
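A compact sketch of the kind of BiLSTM classifier being revisited, trained with plain cross-entropy; tokenization, data, and hyperparameters are placeholders, and the paper's mixed objective adds more than this.

```python
import torch
import torch.nn as nn

class BiLSTMClassifier(nn.Module):
    def __init__(self, vocab=1000, dim=64, classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.lstm = nn.LSTM(dim, dim, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * dim, classes)

    def forward(self, tokens):                 # tokens: (batch, seq_len)
        h, _ = self.lstm(self.emb(tokens))     # (batch, seq, 2*dim)
        return self.head(h.mean(dim=1))        # mean-pool, then classify

model = BiLSTMClassifier()
tokens = torch.randint(0, 1000, (8, 20))       # a fake batch of token ids
labels = torch.randint(0, 2, (8,))
loss = nn.CrossEntropyLoss()(model(tokens), labels)
loss.backward()
print(float(loss))
```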
arXiv Detail & Related papers (2020-09-08T21:55:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content above (including all information) and is not responsible for any consequences.