Neural Networks for Projecting Named Entities from English to Ewondo
- URL: http://arxiv.org/abs/2004.13841v1
- Date: Sun, 29 Mar 2020 22:05:30 GMT
- Title: Neural Networks for Projecting Named Entities from English to Ewondo
- Authors: Michael Franklin Mbouopda, Paulin Melatagia Yonta and Guy Stephane B.
Fedim Lombo
- Abstract summary: We propose a new distributional representation of words to project named entities from a rich language to a low-resource one.
Although the proposed method reached appreciable results, the neural network used was too large.
In this paper, we show experimentally that the same results can be obtained using a smaller neural network.
- Score: 6.058868817939519
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Named entity recognition is an important task in natural language processing.
It is very well studied for rich languages, but still under-explored for
low-resource languages. The main reason is that existing techniques
require a lot of annotated data to reach good performance. Recently, a new
distributional representation of words has been proposed to project named
entities from a rich language to a low-resource one. This representation has
been coupled to a neural network in order to project named entities from
English to Ewondo, a Bantu language spoken in Cameroon. Although the proposed
method reached appreciable results, the size of the neural network used was too
large compared to the size of the dataset. Furthermore, the impact of the model
parameters has not been studied. In this paper, we show experimentally that the
same results can be obtained using a smaller neural network. We also highlight
the parameters that are most strongly correlated with network performance. This
work is a step toward building a reliable and robust network architecture for
named entity projection in low-resource languages.
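The core idea of projection can be sketched in a few lines. The following is a minimal, hypothetical illustration only: it assigns each target-language word the named-entity label of its nearest English neighbour in a shared distributional space. The toy vocabulary, vectors, and nearest-neighbour rule are assumptions for demonstration, not the paper's actual representation or network.

```python
import numpy as np

# Toy English lexicon: word -> (vector in a shared embedding space, NE label).
# These vectors are made up for illustration.
english_lexicon = {
    "london": (np.array([0.9, 0.1, 0.0]), "LOC"),
    "mary":   (np.array([0.1, 0.9, 0.0]), "PER"),
    "unesco": (np.array([0.0, 0.1, 0.9]), "ORG"),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def project_label(target_vector):
    """Label a target-language word with the NE tag of the most similar English word."""
    best_word = max(english_lexicon,
                    key=lambda w: cosine(english_lexicon[w][0], target_vector))
    return english_lexicon[best_word][1]

# A toy vector standing in for an Ewondo word's distributional representation.
print(project_label(np.array([0.85, 0.15, 0.05])))  # closest to "london" -> LOC
```

In the paper this lookup is replaced by a trained neural classifier, which is where the questions of network size and parameter sensitivity arise.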
Related papers
- NusaWrites: Constructing High-Quality Corpora for Underrepresented and
Extremely Low-Resource Languages [54.808217147579036]
We conduct a case study on Indonesian local languages.
We compare the effectiveness of online scraping, human translation, and paragraph writing by native speakers in constructing datasets.
Our findings demonstrate that datasets generated through paragraph writing by native speakers exhibit superior quality in terms of lexical diversity and cultural content.
arXiv Detail & Related papers (2023-09-19T14:42:33Z)
- Multilingual Name Entity Recognition and Intent Classification Employing Deep Learning Architectures [2.9115403886004807]
We explore the effectiveness of two separate families of Deep Learning networks for named entity recognition and intent classification.
The models were trained and tested on the ATIS benchmark dataset for both English and Greek languages.
arXiv Detail & Related papers (2022-11-04T12:42:29Z)
- Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition [54.92161571089808]
Cross-lingual NER transfers knowledge from rich-resource languages to low-resource languages.
Existing cross-lingual NER methods do not make good use of rich unlabeled data in target languages.
We develop a novel approach based on the ideas of semi-supervised learning and reinforcement learning.
arXiv Detail & Related papers (2021-06-01T05:46:22Z)
- Low-Resource Language Modelling of South African Languages [6.805575417034369]
We evaluate the performance of open-vocabulary language models on low-resource South African languages.
We evaluate different variants of n-gram models, feedforward neural networks, recurrent neural networks (RNNs) and Transformers on small-scale datasets.
Overall, well-regularized RNNs give the best performance across two isiZulu datasets and one Sepedi dataset.
arXiv Detail & Related papers (2021-04-01T21:27:27Z)
- A Novel Deep Learning Method for Textual Sentiment Analysis [3.0711362702464675]
This paper proposes a convolutional neural network integrated with a hierarchical attention layer to extract informative words.
The proposed model has higher classification accuracy and can extract informative words.
Applying incremental transfer learning can significantly enhance the classification performance.
arXiv Detail & Related papers (2021-02-23T12:11:36Z)
- Cross-lingual Approach to Abstractive Summarization [0.0]
Cross-lingual model transfers are successfully applied in low-resource languages.
We used a pretrained English summarization model based on deep neural networks and sequence-to-sequence architecture.
We developed several models with different proportions of target language data for fine-tuning.
arXiv Detail & Related papers (2020-12-08T09:30:38Z)
- Be More with Less: Hypergraph Attention Networks for Inductive Text Classification [56.98218530073927]
Graph neural networks (GNNs) have received increasing attention in the research community and demonstrated their promising results on this canonical task.
Despite the success, their performance could be largely jeopardized in practice since they are unable to capture high-order interaction between words.
We propose a principled model -- hypergraph attention networks (HyperGAT) which can obtain more expressive power with less computational consumption for text representation learning.
arXiv Detail & Related papers (2020-11-01T00:21:59Z)
- On the Effects of Using word2vec Representations in Neural Networks for Dialogue Act Recognition [0.6767885381740952]
We propose a new deep neural network that explores recurrent models to capture word sequences within sentences.
We validate this model on three languages: English, French and Czech.
arXiv Detail & Related papers (2020-10-22T07:21:17Z)
- Comparison of Interactive Knowledge Base Spelling Correction Models for Low-Resource Languages [81.90356787324481]
Spelling normalization for low resource languages is a challenging task because the patterns are hard to predict.
This work shows a comparison of a neural model and character language models with varying amounts of target language data.
Our usage scenario is interactive correction with nearly zero amounts of training examples, improving models as more data is collected.
arXiv Detail & Related papers (2020-10-20T17:31:07Z)
- Soft Gazetteers for Low-Resource Named Entity Recognition [78.00856159473393]
We propose a method of "soft gazetteers" that incorporates ubiquitously available information from English knowledge bases into neural named entity recognition models.
Our experiments on four low-resource languages show an average improvement of 4 points in F1 score.
arXiv Detail & Related papers (2020-05-04T21:58:02Z)
- Information-Theoretic Probing for Linguistic Structure [74.04862204427944]
We propose an information-theoretic operationalization of probing as estimating mutual information.
We evaluate on a set of ten typologically diverse languages often underrepresented in NLP research.
arXiv Detail & Related papers (2020-04-07T01:06:36Z)
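The probing-as-mutual-information idea from the last entry can be made concrete with a small plug-in estimator. The sketch below is an assumption-laden toy, not the paper's estimator: it discretizes a representation feature X, pairs it with a linguistic label Y, and computes the empirical I(X; Y) = H(X) + H(Y) - H(X, Y) in bits.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy (in bits) of an empirical distribution given as counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def mutual_information(pairs):
    """Plug-in estimate of I(X; Y) from a list of (x, y) samples."""
    x_counts = Counter(x for x, _ in pairs)
    y_counts = Counter(y for _, y in pairs)
    xy_counts = Counter(pairs)
    # I(X; Y) = H(X) + H(Y) - H(X, Y)
    return entropy(x_counts) + entropy(y_counts) - entropy(xy_counts)

# Toy data: a perfectly predictive feature carries one bit about the label...
perfect = [(0, "NOUN"), (0, "NOUN"), (1, "VERB"), (1, "VERB")]
# ...while an uninformative feature carries none.
useless = [(0, "NOUN"), (1, "NOUN"), (0, "VERB"), (1, "VERB")]

print(mutual_information(perfect))  # 1.0 bit
print(mutual_information(useless))  # 0.0 bits
```

Under this framing, a representation "knows" a linguistic property to the extent that such an estimate is high; real probes replace the count tables with trained classifiers over continuous vectors.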
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.