SubICap: Towards Subword-informed Image Captioning
- URL: http://arxiv.org/abs/2012.13122v1
- Date: Thu, 24 Dec 2020 06:10:36 GMT
- Title: SubICap: Towards Subword-informed Image Captioning
- Authors: Naeha Sharif, Mohammed Bennamoun, Wei Liu, Syed Afaq Ali Shah
- Abstract summary: We decompose words into smaller constituent units 'subwords' and represent captions as a sequence of subwords instead of words.
Our captioning system improves various metric scores, with a training vocabulary size approximately 90% less than the baseline.
- Score: 37.42085521950802
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing Image Captioning (IC) systems model words as atomic units in
captions and are unable to exploit the structural information in the words.
This makes representation of rare words very difficult and out-of-vocabulary
words impossible. Moreover, to avoid computational complexity, existing IC
models operate over a modest sized vocabulary of frequent words, such that the
identity of rare words is lost. In this work we address this common limitation
of IC systems in dealing with rare words in the corpora. We decompose words
into smaller constituent units 'subwords' and represent captions as a sequence
of subwords instead of words. This helps represent all words in the corpora
using a significantly lower subword vocabulary, leading to better parameter
learning. Using subword language modeling, our captioning system improves
various metric scores, with a training vocabulary size approximately 90% less
than the baseline and various state-of-the-art word-level models. Our
quantitative and qualitative results and analysis signify the efficacy of our
proposed approach.
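The subword decomposition described in the abstract can be illustrated with a small sketch. The abstract does not name the segmentation algorithm, so the code below assumes byte-pair encoding (BPE) as a stand-in; the function names `learn_bpe`, `merge_pair`, and `segment`, the `</w>` end-of-word marker, and the toy captions are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(captions, num_merges):
    """Learn up to `num_merges` BPE merge rules from whitespace-tokenized captions."""
    # Each word starts as a tuple of characters plus an end-of-word marker.
    word_freqs = Counter(
        tuple(word) + ("</w>",)
        for caption in captions
        for word in caption.lower().split()
    )
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs, weighted by word frequency.
        pair_freqs = Counter()
        for symbols, freq in word_freqs.items():
            for pair in zip(symbols, symbols[1:]):
                pair_freqs[pair] += freq
        if not pair_freqs:
            break  # every word is already a single symbol
        best = max(pair_freqs, key=pair_freqs.get)
        merges.append(best)
        # Apply the chosen merge to every word in the corpus.
        new_freqs = Counter()
        for symbols, freq in word_freqs.items():
            new_freqs[merge_pair(symbols, best)] += freq
        word_freqs = new_freqs
    return merges


def segment(word, merges):
    """Split a word (seen or unseen) into subwords by replaying the learned merges in order."""
    symbols = tuple(word.lower()) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Toy captions (illustrative only, not from any dataset).
captions = [
    "a man riding a skateboard",
    "a man skateboarding down a ramp",
    "a woman riding a surfboard",
]
merges = learn_bpe(captions, num_merges=20)
# A word absent from the training captions is still representable as subwords.
print(segment("snowboarder", merges))
```

In this sketch the subword inventory is bounded by the character set plus the number of merges, which is how a subword vocabulary can stay far smaller than a word vocabulary while still covering rare and unseen words.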
Related papers
- Morphological evaluation of subwords vocabulary used by BETO language model [0.1638581561083717]
Subword tokenization algorithms are more efficient and can independently build the necessary vocabulary of words and subwords without human intervention.
In previous research, we proposed a method to assess the morphological quality of vocabularies, focusing on the overlap between these vocabularies and the morphemes of a given language.
By applying this method to vocabularies created by three subword tokenization algorithms, BPE, Wordpiece, and Unigram, we concluded that these vocabularies generally exhibit very low morphological quality.
This evaluation helps clarify the algorithm used by the tokenizer, that is, Wordpiece, given the inconsistencies between the authors' claims
arXiv Detail & Related papers (2024-10-03T08:07:14Z)
- An Analysis of BPE Vocabulary Trimming in Neural Machine Translation [56.383793805299234]
Vocabulary trimming is a postprocessing step that replaces rare subwords with their component subwords.
We show that vocabulary trimming fails to improve performance and is even prone to incurring heavy degradation.
arXiv Detail & Related papers (2024-03-30T15:29:49Z)
- From Characters to Words: Hierarchical Pre-trained Language Model for Open-vocabulary Language Understanding [22.390804161191635]
Current state-of-the-art models for natural language understanding require a preprocessing step to convert raw text into discrete tokens.
This process, known as tokenization, relies on a pre-built vocabulary of words or sub-word morphemes.
We introduce a novel open-vocabulary language model that adopts a hierarchical two-level approach.
arXiv Detail & Related papers (2023-05-23T23:22:20Z)
- Short-Term Word-Learning in a Dynamically Changing Environment [63.025297637716534]
We show how to supplement an end-to-end ASR system with a word/phrase memory and a mechanism to access this memory to recognize the words and phrases correctly.
We demonstrate significant improvements in the detection rate of new words with only a minor increase in false alarms.
arXiv Detail & Related papers (2022-03-29T10:05:39Z)
- Instant One-Shot Word-Learning for Context-Specific Neural Sequence-to-Sequence Speech Recognition [62.997667081978825]
We present an end-to-end ASR system with a word/phrase memory and a mechanism to access this memory to recognize the words and phrases correctly.
In this paper we demonstrate that through this mechanism our system is able to recognize more than 85% of newly added words that it previously failed to recognize.
arXiv Detail & Related papers (2021-07-05T21:08:34Z)
- Accurate Word Representations with Universal Visual Guidance [55.71425503859685]
This paper proposes a visual representation method to explicitly enhance conventional word embedding with multiple-aspect senses from visual guidance.
We build a small-scale word-image dictionary from a multimodal seed dataset where each word corresponds to diverse related images.
Experiments on 12 natural language understanding and machine translation tasks further verify the effectiveness and the generalization capability of the proposed approach.
arXiv Detail & Related papers (2020-12-30T09:11:50Z)
- Morphological Skip-Gram: Using morphological knowledge to improve word representation [2.0129974477913457]
We propose a new method for training word embeddings by replacing the FastText bag of character n-grams with a bag of word morphemes.
The results show a competitive performance compared to FastText.
arXiv Detail & Related papers (2020-07-20T12:47:36Z)
- Techniques for Vocabulary Expansion in Hybrid Speech Recognition Systems [54.49880724137688]
The problem of out-of-vocabulary (OOV) words is typical for any speech recognition system.
One popular approach to covering OOVs is to use subword units rather than words.
In this paper we explore different existing methods of this solution on both graph construction and search method levels.
arXiv Detail & Related papers (2020-03-19T21:24:45Z)