A Context-Contrastive Inference Approach To Partial Diacritization
- URL: http://arxiv.org/abs/2401.08919v3
- Date: Fri, 9 Aug 2024 13:49:20 GMT
- Title: A Context-Contrastive Inference Approach To Partial Diacritization
- Authors: Muhammad ElNokrashy, Badr AlKhamissi
- Abstract summary: Diacritization plays a pivotal role in improving readability and disambiguating the meaning of Arabic texts.
Partial Diacritization (PD) is the selection of a subset of characters to be marked to aid comprehension where needed.
We introduce Context-Contrastive Partial Diacritization (CCPD) -- a novel approach to PD which integrates seamlessly with existing Arabic diacritization systems.
- Score: 0.5575959989491791
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diacritization plays a pivotal role in improving readability and disambiguating the meaning of Arabic texts. Efforts have so far focused on marking every eligible character (Full Diacritization). Comparatively overlooked, Partial Diacritization (PD) is the selection of a subset of characters to be marked to aid comprehension where needed. Research has indicated that excessive diacritic marks can hinder skilled readers -- reducing reading speed and accuracy. We conduct a behavioral experiment and show that partially marked text is often easier to read than fully marked text, and sometimes easier than plain text. In this light, we introduce Context-Contrastive Partial Diacritization (CCPD) -- a novel approach to PD which integrates seamlessly with existing Arabic diacritization systems. CCPD processes each word twice, once with context and once without, and diacritizes only the characters with disparities between the two inferences. Further, we introduce novel indicators for measuring partial diacritization quality, essential for establishing this as a machine learning task. Lastly, we introduce TD2, a Transformer-variant of an established model which offers a markedly different performance profile on our proposed indicators compared to all other known systems.
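The two-pass selection described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Diacritizer` interface, the `ccpd_select` function, and the `toy_diacritize` stand-in model are all hypothetical names, the Latin placeholder labels stand in for real Arabic diacritics, and keeping the contextual label when the two passes disagree is an assumption.

```python
from typing import Callable, List, Sequence

# Hypothetical interface: maps a sequence of undiacritized words to one
# diacritic label per character of each word ("" means no mark).
Diacritizer = Callable[[Sequence[str]], List[List[str]]]


def ccpd_select(words: Sequence[str], diacritize: Diacritizer) -> List[List[str]]:
    """Keep a diacritic only where the in-context and isolated passes disagree."""
    with_context = diacritize(words)  # pass 1: whole sentence as context
    selected = []
    for word, ctx_marks in zip(words, with_context):
        iso_marks = diacritize([word])[0]  # pass 2: the word on its own
        # Assumption: when the passes differ, the contextual label is kept.
        kept = [c if c != i else "" for c, i in zip(ctx_marks, iso_marks)]
        selected.append(kept)
    return selected


def toy_diacritize(words: Sequence[str]) -> List[List[str]]:
    """Stand-in model: labels every character "a", except that the last
    character of any word followed by another word is labelled "u"."""
    marks = []
    for idx, word in enumerate(words):
        labels = ["a"] * len(word)
        if word and idx < len(words) - 1:
            labels[-1] = "u"
        marks.append(labels)
    return marks


partial = ccpd_select(["ktb", "alwld"], toy_diacritize)
print(partial)
```

With the toy model, only the final character of the first word receives a different label once context is available, so it is the only position that ends up marked. Any existing full diacritizer exposing a comparable per-character interface could be passed in place of `toy_diacritize`.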
Related papers
- Beyond Coarse-Grained Matching in Video-Text Retrieval [50.799697216533914]
We introduce a new approach for fine-grained evaluation.
Our approach can be applied to existing datasets by automatically generating hard negative test captions.
Experiments on our fine-grained evaluations demonstrate that this approach enhances a model's ability to understand fine-grained differences.
arXiv Detail & Related papers (2024-10-16T09:42:29Z) - Grammar Induction from Visual, Speech and Text [91.98797120799227]
This work introduces a novel visual-audio-text grammar induction task (VAT-GI).
Inspired by the fact that language grammar exists beyond text, we argue that text need not be the predominant modality in grammar induction.
We propose a visual-audio-text inside-outside autoencoder (VaTiora) framework, which leverages rich modal-specific and complementary features for effective grammar parsing.
arXiv Detail & Related papers (2024-10-01T02:24:18Z) - PLOT: Text-based Person Search with Part Slot Attention for Corresponding Part Discovery [29.301950609839796]
We propose a novel framework that leverages a part discovery module based on slot attention to autonomously identify and align distinctive parts across modalities.
Our method is evaluated on three public benchmarks, significantly outperforming existing methods.
arXiv Detail & Related papers (2024-09-20T13:05:55Z) - Towards Unified Multi-granularity Text Detection with Interactive Attention [56.79437272168507]
"Detect Any Text" is an advanced paradigm that unifies scene text detection, layout analysis, and document page detection into a cohesive, end-to-end model.
A pivotal innovation in DAT is the across-granularity interactive attention module, which significantly enhances the representation learning of text instances.
Tests demonstrate that DAT achieves state-of-the-art performances across a variety of text-related benchmarks.
arXiv Detail & Related papers (2024-05-30T07:25:23Z) - Spotting AI's Touch: Identifying LLM-Paraphrased Spans in Text [61.22649031769564]
We propose a novel framework, paraphrased text span detection (PTD).
PTD aims to identify paraphrased text spans within a text.
We construct a dedicated dataset, PASTED, for paraphrased text span detection.
arXiv Detail & Related papers (2024-05-21T11:22:27Z) - Take the Hint: Improving Arabic Diacritization with Partially-Diacritized Text [4.863310073296471]
We propose 2SDiac, a multi-source model that can effectively support optional diacritics in input to inform all predictions.
We also introduce Guided Learning, a training scheme to leverage given diacritics in input with different levels of random masking.
arXiv Detail & Related papers (2023-06-06T10:18:17Z) - TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision [61.186488081379]
We propose TextFormer, a query-based end-to-end text spotter with Transformer architecture.
TextFormer builds upon an image encoder and a text decoder to learn a joint semantic understanding for multi-task modeling.
It allows for mutual training and optimization of classification, segmentation, and recognition branches, resulting in deeper feature sharing.
arXiv Detail & Related papers (2023-06-06T03:37:41Z) - Improving Scene Text Recognition for Character-Level Long-Tailed Distribution [35.14058653707104]
We propose a novel Context-Aware and Free Experts Network (CAFE-Net) using two experts.
CAFE-Net improves STR performance on languages with a large number of characters.
arXiv Detail & Related papers (2023-03-31T06:11:33Z) - Towards Weakly-Supervised Text Spotting using a Multi-Task Transformer [21.479222207347238]
We introduce TextTranSpotter (TTS), a transformer-based approach for text spotting.
TTS is trained with both fully- and weakly-supervised settings.
When trained in a fully-supervised manner, TextTranSpotter shows state-of-the-art results on multiple benchmarks.
arXiv Detail & Related papers (2022-02-11T08:50:09Z) - Text is Text, No Matter What: Unifying Text Recognition using Knowledge Distillation [41.43280922432707]
We argue for their unification -- we aim for a single model that can compete favourably with two separate state-of-the-art STR and HTR models.
We first show that cross-utilisation of STR and HTR models triggers significant performance drops due to differences in their inherent challenges.
We then tackle their union by introducing a knowledge distillation (KD) based framework.
arXiv Detail & Related papers (2021-07-26T10:10:34Z) - A Novel Attention-based Aggregation Function to Combine Vision and Language [55.7633883960205]
We propose a novel fully-attentive reduction method for vision and language.
Specifically, our approach computes a set of scores for each element of each modality employing a novel variant of cross-attention.
We test our approach on image-text matching and visual question answering, building fair comparisons with other reduction choices.
arXiv Detail & Related papers (2020-04-27T18:09:46Z)