Automated Prediction of Medieval Arabic Diacritics
- URL: http://arxiv.org/abs/2010.05269v1
- Date: Sun, 11 Oct 2020 15:21:01 GMT
- Title: Automated Prediction of Medieval Arabic Diacritics
- Authors: Khalid Alnajjar, Mika Hämäläinen, Niko Partanen, Jack Rueter
- Abstract summary: This study uses a character-level neural machine translation approach trained on a long short-term memory-based bi-directional recurrent neural network architecture for diacritization of Medieval Arabic.
- Score: 1.290382979353427
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This study uses a character-level neural machine translation approach trained
on a long short-term memory-based bi-directional recurrent neural network
architecture for diacritization of Medieval Arabic. The results improve on
the online tool used as a baseline. A diacritization model has been published
openly through an easy-to-use Python package available on PyPI and Zenodo. We
have found that context size should be considered when optimizing a feasible
prediction model.
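The architecture named in the abstract (a character-level encoder-decoder with a bi-directional LSTM encoder) can be illustrated with a minimal sketch. The following PyTorch code is an illustration under assumed dimensions and toy inputs, not the authors' released package:

    # Minimal sketch of a character-level bi-LSTM encoder-decoder for
    # diacritization; all sizes and the toy batch below are assumptions.
    import torch
    import torch.nn as nn

    class CharDiacritizer(nn.Module):
        def __init__(self, src_vocab, tgt_vocab, emb=64, hidden=128):
            super().__init__()
            self.src_emb = nn.Embedding(src_vocab, emb)
            self.tgt_emb = nn.Embedding(tgt_vocab, emb)
            # Bi-directional LSTM over the undiacritized character sequence.
            self.encoder = nn.LSTM(emb, hidden, bidirectional=True,
                                   batch_first=True)
            # Decoder emits the diacritized character sequence.
            self.decoder = nn.LSTM(emb, 2 * hidden, batch_first=True)
            self.out = nn.Linear(2 * hidden, tgt_vocab)

        def forward(self, src, tgt_in):
            _, (h, c) = self.encoder(self.src_emb(src))
            # Concatenate the two encoder directions into the decoder state.
            h0 = h.transpose(0, 1).reshape(src.size(0), -1).unsqueeze(0)
            c0 = c.transpose(0, 1).reshape(src.size(0), -1).unsqueeze(0)
            dec, _ = self.decoder(self.tgt_emb(tgt_in), (h0, c0))
            return self.out(dec)  # (batch, time, tgt_vocab) logits

    model = CharDiacritizer(src_vocab=60, tgt_vocab=90)
    src = torch.randint(0, 60, (2, 12))      # two 12-character inputs
    tgt_in = torch.randint(0, 90, (2, 12))   # teacher-forced decoder input
    print(model(src, tgt_in).shape)          # torch.Size([2, 12, 90])

In this framing, the abstract's point about context size corresponds to how many characters around the target form are fed to the encoder.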
Related papers
- Simulation-based inference with the Python Package sbijax [0.7499722271664147]
sbijax is a Python package that implements a wide variety of state-of-the-art methods in neural simulation-based inference.
The package provides functionality for approximate Bayesian computation, for computing model diagnostics, and for automatically estimating summary statistics.
arXiv Detail & Related papers (2024-09-28T18:47:13Z)
- Open-Source Web Service with Morphological Dictionary-Supplemented Deep Learning for Morphosyntactic Analysis of Czech [1.7871207544302354]
We present an open-source web service for Czech morphosyntactic analysis.
The system combines a deep learning model with rescoring by a high-precision morphological dictionary at inference time.
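As a rough reading of that combination, dictionary rescoring can be as simple as restricting the neural model's scores to the analyses the dictionary licenses. A minimal sketch with hypothetical tag names, not the service's actual code:

    # Keep the neural score for tags the dictionary allows for this word;
    # push all other tags far down before taking the argmax.
    def rescore(tag_scores, allowed_tags, penalty=-1e9):
        rescored = {t: (s if t in allowed_tags else s + penalty)
                    for t, s in tag_scores.items()}
        return max(rescored, key=rescored.get)

    # Hypothetical example: the dictionary licenses only NOUN or VERB here.
    print(rescore({"NOUN": 2.1, "VERB": 1.9, "ADJ": 2.5}, {"NOUN", "VERB"}))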
arXiv Detail & Related papers (2024-06-18T09:14:58Z)
- The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute [66.84421705029624]
We introduce an experimental protocol that enables model comparisons based on equivalent compute, measured in accelerator hours.
We pre-process an existing large, diverse, and high-quality dataset of books that surpasses existing academic benchmarks in quality, diversity, and document length.
This work also provides two baseline models: a feed-forward model derived from the GPT-2 architecture and a recurrent model in the form of a novel LSTM with ten-fold throughput.
arXiv Detail & Related papers (2023-09-20T10:31:17Z)
- Aligning a medium-size GPT model in English to a small closed domain in Spanish [0.0]
We propose a methodology to align a medium-sized GPT model, originally trained in English for an open domain, to a small closed domain in Spanish.
This required training and implementing a second neural network that scores whether an answer is appropriate for a given question.
arXiv Detail & Related papers (2023-03-30T18:27:15Z)
- Dependency-based Mixture Language Models [53.152011258252315]
We introduce the Dependency-based Mixture Language Models.
In detail, we first train neural language models with a novel dependency modeling objective.
We then formulate the next-token probability by mixing the previous dependency modeling probability distributions with self-attention.
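Read at face value, that mixing step has the shape of a convex combination of two next-token distributions. A schematic sketch (my paraphrase of the summary, with an assumed fixed mixture weight, not the paper's exact formulation):

    import torch

    def mix_next_token(p_dependency, p_attention, lam=0.3):
        # Both inputs are (vocab,) probability vectors; lam weights the
        # dependency-based distribution against the self-attention one.
        return lam * p_dependency + (1.0 - lam) * p_attention

    vocab = 5
    p_dep = torch.softmax(torch.randn(vocab), dim=-1)
    p_att = torch.softmax(torch.randn(vocab), dim=-1)
    p_next = mix_next_token(p_dep, p_att)   # still sums to 1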
arXiv Detail & Related papers (2022-03-19T06:28:30Z)
- Empirical evaluation of shallow and deep learning classifiers for Arabic sentiment analysis [1.1172382217477126]
This work presents a detailed comparison of the performance of deep learning models for sentiment analysis of Arabic reviews.
The datasets used in this study are multi-dialect Arabic hotel and book review datasets, which are some of the largest publicly available datasets for Arabic reviews.
Results showed deep learning outperforming shallow learning for binary and multi-label classification, in contrast with the results of similar work reported in the literature.
arXiv Detail & Related papers (2021-12-01T14:45:43Z)
- Factorized Neural Transducer for Efficient Language Model Adaptation [51.81097243306204]
We propose a novel model, factorized neural Transducer, by factorizing the blank and vocabulary prediction.
It is expected that this factorization can transfer the improvement of the standalone language model to the Transducer for speech recognition.
We demonstrate that the proposed factorized neural Transducer yields 15% to 20% WER improvements when out-of-domain text data is used for language model adaptation.
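The factorization can be pictured as two output heads: one scoring the blank symbol from the joint acoustic-label state, and one conditioned on the label history alone, which behaves like a standalone language model and can therefore be adapted on text. A minimal sketch under my own assumptions, not the paper's implementation:

    import torch
    import torch.nn as nn

    class FactorizedJoint(nn.Module):
        def __init__(self, hidden, vocab):
            super().__init__()
            self.blank_head = nn.Linear(2 * hidden, 1)  # acoustic + label state
            self.label_lm = nn.Linear(hidden, vocab)    # label-only "internal LM"

        def forward(self, acoustic, label):
            blank = self.blank_head(torch.cat([acoustic, label], dim=-1))
            labels = self.label_lm(label)
            return torch.cat([blank, labels], dim=-1)   # logits over blank + vocab

    joint = FactorizedJoint(hidden=16, vocab=10)
    logits = joint(torch.randn(4, 16), torch.randn(4, 16))  # torch.Size([4, 11])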
arXiv Detail & Related papers (2021-09-27T15:04:00Z)
- Paraphrastic Representations at Scale [134.41025103489224]
We release trained models for English, Arabic, German, French, Spanish, Russian, Turkish, and Chinese languages.
We train these models on large amounts of data, achieving significantly improved performance from the original papers.
arXiv Detail & Related papers (2021-04-30T16:55:28Z)
- Adaptive Semiparametric Language Models [17.53604394786977]
We present a language model that combines a large parametric neural network (i.e., a transformer) with a non-parametric episodic memory component.
Experiments on word-based and character-based language modeling datasets demonstrate the efficacy of our proposed method.
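One way to picture the episodic component (my assumption about the general mechanism, in the spirit of nearest-neighbour language models, not the paper's exact model): retrieve stored contexts close to the current hidden state, read a next-token distribution off their recorded continuations, and mix that with the transformer's own distribution.

    import torch

    def episodic_memory_probs(query, mem_keys, mem_tokens, vocab, k=4):
        # Distances from the current hidden state to every stored context.
        dists = torch.cdist(query[None, :], mem_keys).squeeze(0)
        knn = torch.topk(-dists, k).indices            # k nearest entries
        weights = torch.softmax(-dists[knn], dim=-1)
        probs = torch.zeros(vocab)
        probs.index_add_(0, mem_tokens[knn], weights)  # aggregate by token
        return probs

    # Toy memory: 100 stored (hidden state, next token) pairs, 16-dim states.
    p_mem = episodic_memory_probs(torch.randn(16), torch.randn(100, 16),
                                  torch.randint(0, 50, (100,)), vocab=50)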
arXiv Detail & Related papers (2021-02-04T11:47:03Z)
- Comparison of Interactive Knowledge Base Spelling Correction Models for Low-Resource Languages [81.90356787324481]
Spelling normalization for low-resource languages is a challenging task because the patterns are hard to predict.
This work compares a neural model and character language models trained with varying amounts of target-language data.
Our usage scenario is interactive correction with nearly zero training examples, improving the models as more data is collected.
arXiv Detail & Related papers (2020-10-20T17:31:07Z)
- On the distance between two neural networks and the stability of learning [59.62047284234815]
This paper relates parameter distance to gradient breakdown for a broad class of nonlinear compositional functions.
The analysis leads to a new distance function called deep relative trust and a descent lemma for neural networks.
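As I recall the paper's definition (consult the paper for the precise statement and conditions), deep relative trust compounds per-layer relative perturbations multiplicatively:

    % Deep relative trust between weights W and W + \Delta W of an
    % L-layer network (my recollection of the paper's definition).
    d\bigl(W,\, W + \Delta W\bigr)
      = \prod_{l=1}^{L}\left(1 + \frac{\lVert \Delta W_l \rVert_F}
                                       {\lVert W_l \rVert_F}\right) - 1

The descent lemma then controls learning stability in terms of this compounded relative change.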
arXiv Detail & Related papers (2020-02-09T19:18:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.