An exploratory experiment on Hindi, Bengali hate-speech detection and
transfer learning using neural networks
- URL: http://arxiv.org/abs/2201.01997v1
- Date: Thu, 6 Jan 2022 10:13:28 GMT
- Title: An exploratory experiment on Hindi, Bengali hate-speech detection and
transfer learning using neural networks
- Authors: Tung Minh Phung, Jan Cloos
- Abstract summary: This work presents our approach to training a neural network to detect hate-speech texts in Hindi and Bengali.
We also explore how transfer learning can be applied to learning these languages, given that they have the same origin and thus are similar to some extent.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work presents our approach to training a neural network to detect
hate-speech texts in Hindi and Bengali. We also explore how transfer learning
can be applied to learning these languages, given that they have the same
origin and thus are similar to some extent. Even though the whole experiment
was conducted with low computational power, the obtained result is comparable
to the results of other, more expensive, models. Furthermore, since the
training data in use is relatively small and the two languages are almost
entirely unknown to us, this work can be generalized as an effort to demystify
lost or alien languages that no human is capable of understanding.
Related papers
- On the Correspondence between Compositionality and Imitation in Emergent
Neural Communication [1.4610038284393165]
Our work explores the link between compositionality and imitation in a Lewis game played by deep neural agents.
Supervised learning tends to produce more average languages, while reinforcement learning introduces a selection pressure toward more compositional languages.
arXiv Detail & Related papers (2023-05-22T11:41:29Z)
- Communication Drives the Emergence of Language Universals in Neural
Agents: Evidence from the Word-order/Case-marking Trade-off [3.631024220680066]
We propose a new Neural-agent Language Learning and Communication framework (NeLLCom) where pairs of speaking and listening agents first learn a miniature language.
We succeed in replicating the trade-off with the new framework without hard-coding specific biases in the agents.
arXiv Detail & Related papers (2023-01-30T17:22:33Z)
- Learning an Artificial Language for Knowledge-Sharing in Multilingual
Translation [15.32063273544696]
We discretize the latent space of multilingual models by assigning encoder states to entries in a codebook.
We validate our approach on large-scale experiments with realistic data volumes and domains.
We also use the learned artificial language to analyze model behavior, and discover that using a similar bridge language increases knowledge-sharing among the remaining languages.
arXiv Detail & Related papers (2022-11-02T17:14:42Z)
- What Artificial Neural Networks Can Tell Us About Human Language
Acquisition [47.761188531404066]
Rapid progress in machine learning for natural language processing has the potential to transform debates about how humans learn language.
To increase the relevance of learnability results from computational models, we need to train model learners without significant advantages over humans.
arXiv Detail & Related papers (2022-08-17T00:12:37Z)
- Same Neurons, Different Languages: Probing Morphosyntax in Multilingual
Pre-trained Models [84.86942006830772]
We conjecture that multilingual pre-trained models can derive language-universal abstractions about grammar.
We conduct the first large-scale empirical study over 43 languages and 14 morphosyntactic categories with a state-of-the-art neuron-level probe.
arXiv Detail & Related papers (2022-05-04T12:22:31Z)
- Hindi/Bengali Sentiment Analysis Using Transfer Learning and Joint Dual
Input Learning with Self Attention [0.0]
Our work explores how we can effectively use deep neural networks in transfer learning and joint dual input learning settings to effectively classify sentiments and detect hate speech in Hindi and Bengali data.
We use a BiLSTM with self-attention in a joint dual input learning setting, where we train a single neural network on the Hindi and Bengali datasets simultaneously using their respective embeddings.
arXiv Detail & Related papers (2022-02-11T05:36:11Z)
- Utilizing Wordnets for Cognate Detection among Indian Languages [50.83320088758705]
We detect cognate word pairs among ten Indian languages with Hindi.
We use deep learning methodologies to predict whether a word pair is cognate or not.
We report improved performance of up to 26%.
arXiv Detail & Related papers (2021-12-30T16:46:28Z)
- Harnessing Cross-lingual Features to Improve Cognate Detection for
Low-resource Languages [50.82410844837726]
We demonstrate the use of cross-lingual word embeddings for detecting cognates among fourteen Indian languages.
We evaluate our methods to detect cognates on a challenging dataset of twelve Indian languages.
We observe an improvement of up to 18 percentage points in F-score for cognate detection.
arXiv Detail & Related papers (2021-12-16T11:17:58Z)
- Knowledge Distillation for Multilingual Unsupervised Neural Machine
Translation [61.88012735215636]
Unsupervised neural machine translation (UNMT) has recently achieved remarkable results for several language pairs.
UNMT can only translate between a single language pair and cannot produce translation results for multiple language pairs at the same time.
In this paper, we empirically introduce a simple method to translate between thirteen languages using a single encoder and a single decoder.
arXiv Detail & Related papers (2020-04-21T17:26:16Z)
- Information-Theoretic Probing for Linguistic Structure [74.04862204427944]
We propose an information-theoretic operationalization of probing as estimating mutual information.
We evaluate on a set of ten typologically diverse languages often underrepresented in NLP research.
arXiv Detail & Related papers (2020-04-07T01:06:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.