Augmentation of base classifier performance via HMMs on a handwritten
character data set
- URL: http://arxiv.org/abs/2111.10204v1
- Date: Wed, 17 Nov 2021 15:22:47 GMT
- Title: Augmentation of base classifier performance via HMMs on a handwritten
character data set
- Authors: Hélder Campos and Nuno Paulino
- Abstract summary: This paper presents results of a study of the performance of several base classifiers for recognition of handwritten characters of the modern Latin alphabet.
The best classification performance after correction was 89.8%, and the average was 68.1%.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents results of a study of the performance of several base
classifiers for recognition of handwritten characters of the modern Latin
alphabet. Base classification performance is further enhanced by utilizing
Viterbi error correction, i.e. by determining the most likely (Viterbi) sequence
of characters. Hidden Markov Models (HMMs) exploit the relationships between
letters within a word to determine that sequence. Four base classifiers are
studied along with eight feature sets extracted from the handwritten dataset.
The best classification performance after correction was 89.8%, and the average
was 68.1%.
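The correction step described above can be sketched as a standard Viterbi decode: the base classifier's per-position probabilities act as HMM emissions, and letter-bigram probabilities act as transitions. The letter set, probabilities, and example word below are hypothetical illustrations, not the paper's actual model:

```python
import math

def viterbi(obs_probs, states, init_p, trans_p):
    """Return the most likely letter sequence given per-position
    classifier probabilities (emissions) and letter-bigram
    transition probabilities (the HMM)."""
    # log-probability of the best path ending in each state, per position
    V = [{s: math.log(init_p[s] * obs_probs[0][s]) for s in states}]
    back = [{}]
    for t in range(1, len(obs_probs)):
        V.append({})
        back.append({})
        for s in states:
            prev = max(states, key=lambda p: V[t - 1][p] + math.log(trans_p[p][s]))
            V[t][s] = V[t - 1][prev] + math.log(trans_p[prev][s] * obs_probs[t][s])
            back[t][s] = prev
    # trace the best path backwards from the best final state
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for t in range(len(obs_probs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return list(reversed(path))

states = ["c", "a", "t"]
init_p = {"c": 0.4, "a": 0.3, "t": 0.3}
trans_p = {  # letter-bigram probabilities, e.g. estimated from a word list
    "c": {"c": 0.1, "a": 0.8, "t": 0.1},
    "a": {"c": 0.1, "a": 0.1, "t": 0.8},
    "t": {"c": 0.4, "a": 0.3, "t": 0.3},
}
# Base-classifier softmax outputs; position 2 misreads "t" as "c"
obs = [
    {"c": 0.7, "a": 0.2, "t": 0.1},
    {"c": 0.3, "a": 0.6, "t": 0.1},
    {"c": 0.5, "a": 0.1, "t": 0.4},
]
corrected = viterbi(obs, states, init_p, trans_p)
# per-position argmax would read "cac"; the bigram transitions correct it to "cat"
```

The point of the example is that the transition model overrules a weakly confident emission: the classifier alone prefers "c" at the last position, but "a" is rarely followed by "c" in the bigram table, so the decoded word becomes "cat".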
Related papers
- Self-Supervised Learning Based Handwriting Verification [23.983430206133793]
We show that ResNet based Variational Auto-Encoder (VAE) outperforms other generative approaches achieving 76.3% accuracy.
Using a pre-trained VAE and VICReg for the downstream task of writer verification, we observed relative accuracy improvements of 6.7% and 9%, respectively, over a ResNet-18 supervised baseline with 10% writer labels.
arXiv Detail & Related papers (2024-05-28T16:11:11Z)
- Retrieval is Accurate Generation [99.24267226311157]
We introduce a novel method that selects context-aware phrases from a collection of supporting documents.
Our model achieves the best performance and the lowest latency among several retrieval-augmented baselines.
arXiv Detail & Related papers (2024-02-27T14:16:19Z)
- Mitigating Word Bias in Zero-shot Prompt-based Classifiers [55.60306377044225]
We show that matching class priors correlates strongly with the oracle upper bound performance.
We also demonstrate large consistent performance gains for prompt settings over a range of NLP tasks.
arXiv Detail & Related papers (2023-09-10T10:57:41Z)
- Prompt Algebra for Task Composition [131.97623832435812]
We consider Visual Language Models with prompt tuning as our base classifier.
We propose constrained prompt tuning to improve performance of the composite classifier.
On UTZappos it improves classification accuracy over the best base model by 8.45% on average.
arXiv Detail & Related papers (2023-06-01T03:20:54Z)
- AttriCLIP: A Non-Incremental Learner for Incremental Knowledge Learning [53.32576252950481]
Continual learning aims to enable a model to incrementally learn knowledge from sequentially arrived data.
In this paper, we propose a non-incremental learner, named AttriCLIP, to incrementally extract knowledge of new classes or tasks.
arXiv Detail & Related papers (2023-05-19T07:39:17Z)
- Comparison Study Between Token Classification and Sequence Classification In Text Classification [0.45687771576879593]
Unsupervised machine learning techniques have been applied to Natural Language Processing tasks and have surpassed benchmarks such as GLUE with great success.
A language model that achieves good results in one language can be applied out of the box to multiple NLP tasks, such as classification, summarization, and generation.
arXiv Detail & Related papers (2022-11-25T05:14:58Z)
- Language Model Classifier Aligns Better with Physician Word Sensitivity than XGBoost on Readmission Prediction [86.15787587540132]
We introduce sensitivity score, a metric that scrutinizes models' behaviors at the vocabulary level.
Our experiments compare the decision-making logic of clinicians and classifiers based on rank correlations of sensitivity scores.
arXiv Detail & Related papers (2022-11-13T23:59:11Z)
- Layer or Representation Space: What makes BERT-based Evaluation Metrics Robust? [29.859455320349866]
In this paper, we examine the robustness of BERTScore, one of the most popular embedding-based metrics for text generation.
We show that (a) an embedding-based metric that has the highest correlation with human evaluations on a standard benchmark can have the lowest correlation if the amount of input noise or unknown tokens increases.
arXiv Detail & Related papers (2022-09-06T09:10:54Z)
- Text Classification and Clustering with Annealing Soft Nearest Neighbor Loss [0.0]
We use disentanglement to learn better natural language representation.
We employ it on text classification and text clustering tasks.
Our approach had a test classification accuracy of as high as 90.11% and test clustering accuracy of 88% on the AG News dataset.
arXiv Detail & Related papers (2021-07-23T09:05:39Z)
- Unsupervised Document Embedding via Contrastive Augmentation [48.71917352110245]
We present a contrastive learning approach with data augmentation techniques to learn document representations in an unsupervised manner.
Inspired by recent contrastive self-supervised learning algorithms used for image pretraining, we hypothesize that a high-quality document embedding should be invariant to diverse paraphrases.
Our method can decrease the classification error rate by up to 6.4% over the SOTA approaches on the document classification task, matching or even surpassing fully-supervised methods.
arXiv Detail & Related papers (2021-03-26T15:48:52Z)
- Exploiting Class Labels to Boost Performance on Embedding-based Text Classification [16.39344929765961]
Embeddings of different kinds have recently become the de facto standard features used for text classification.
We introduce a weighting scheme, Term Frequency-Category Ratio (TF-CR), which can weight high-frequency, category-exclusive words higher when computing word embeddings.
arXiv Detail & Related papers (2020-06-03T08:53:40Z)
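The TF-CR scheme is only described at a high level above. As one plausible reading (an illustrative sketch with invented data, not the cited paper's exact formulation), a word's score in a category could combine its term frequency within the category with the ratio of its occurrences that fall in that category:

```python
from collections import Counter

def tf_cr(docs_by_category):
    """Hypothetical TF-CR sketch: score(w, c) = TF(w, c) * CR(w, c),
    where TF(w, c) is w's relative frequency inside category c and
    CR(w, c) is the fraction of w's total occurrences that fall in c."""
    cat_counts = {c: Counter(w for d in docs for w in d.split())
                  for c, docs in docs_by_category.items()}
    total = Counter()
    for counts in cat_counts.values():
        total.update(counts)
    scores = {}
    for c, counts in cat_counts.items():
        n_c = sum(counts.values())
        scores[c] = {w: (cnt / n_c) * (cnt / total[w]) for w, cnt in counts.items()}
    return scores

scores = tf_cr({
    "sports": ["goal match goal", "match win"],
    "tech":   ["chip match", "chip code"],
})
# "goal" occurs only in sports (CR = 1), so it outscores the shared word "match"
```

The effect is the one the abstract describes: high-frequency words that are exclusive to a category receive the largest weights, while words spread evenly across categories are down-weighted.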
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.