Media of Langue: The dictionary that visualizes Inter-Lingual Semantic
Network/Space
- URL: http://arxiv.org/abs/2309.08609v3
- Date: Sat, 27 Jan 2024 09:08:08 GMT
- Title: Media of Langue: The dictionary that visualizes Inter-Lingual Semantic
Network/Space
- Authors: Goki Muramoto, Atsuki Sato, Takayoshi Koyama
- Abstract summary: "Media of Langue" is a novel dictionary visualizing Inter-lingual semantic network/space.
By visualizing this network/space for humans, an Inter-lingual dictionary can be realized that points to the semantic place of many words at once.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces "Media of Langue," a novel dictionary visualizing
Inter-lingual semantic network/space. Our proposed Inter-lingual semantic
network/space is formed solely from the accumulation of translation practices
between two or more language systems, in contrast to existing semantic
networks/spaces that explicitly use "intra"-lingual relations. By visualizing
this network/space for humans, an Inter-lingual dictionary can be realized that
points to the semantic place of many words at once with a chain of mutual
translation, which also contains the functions of existing dictionaries such as
bilingual and synonym dictionaries. We implemented and published this interface
as a web application, focusing on seven language pairs. In this paper, we first
describe Inter-lingual semantic network/space with its basic features and the
way to develop it from bilingual corpora, then detail the design of "Media of
Langue," with a quick analysis and illustrative examples of use cases. Our
website is www.media-of-langue.org. A demonstration video is available at
https://youtu.be/98lXuX4yjsU.
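The construction the abstract describes can be sketched in miniature: translation pairs accumulated from a bilingual corpus form a bipartite word network, and a "chain of mutual translation" is read off by repeatedly following the most frequent translation link back and forth between the two languages. The word pairs and the chaining rule below are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter, defaultdict

# Hypothetical aligned English-French word pairs, as might be
# extracted from a bilingual corpus by word alignment.
aligned_pairs = [
    ("sea", "mer"), ("sea", "mer"), ("sea", "océan"),
    ("ocean", "océan"), ("ocean", "mer"),
    ("lake", "lac"),
]

# Accumulate translation counts into a bipartite network:
# edges connect words of one language to words of the other.
en_to_fr = defaultdict(Counter)
fr_to_en = defaultdict(Counter)
for en, fr in aligned_pairs:
    en_to_fr[en][fr] += 1
    fr_to_en[fr][en] += 1

def translation_chain(word, steps=4):
    """Follow the most frequent unvisited translation link back and
    forth, yielding a chain of mutual translation."""
    chain = [word]
    side = en_to_fr
    for _ in range(steps):
        counts = side.get(chain[-1], Counter())
        nxt = next((w for w, _ in counts.most_common() if w not in chain), None)
        if nxt is None:
            break
        chain.append(nxt)
        side = fr_to_en if side is en_to_fr else en_to_fr
    return chain

print(translation_chain("sea"))  # walks sea -> mer -> ocean -> océan
```

The chain surfaces near-synonyms in both languages at once ("sea" and "ocean" meet through their shared translations), which is the sense in which such a dictionary also subsumes synonym lookup.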
Related papers
- Presence or Absence: Are Unknown Word Usages in Dictionaries? [6.185216877366987]
We evaluate our system in the AXOLOTL-24 shared task for Finnish, Russian and German languages.
We use a graph-based clustering approach to predict mappings between unknown word usages and dictionary entries.
Our system ranks first in Finnish and German, and second in Russian, on the Subtask 2 test-phase leaderboard.
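The graph-based clustering idea above can be sketched as follows: usages of an unknown word are linked when their embeddings are similar enough, connected components become usage clusters, and each cluster is mapped to the dictionary sense nearest its centroid. The vectors, threshold, and names here are toy assumptions, not the system's actual components.

```python
def cosine(u, v):
    # Cosine similarity between two dense vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv)

# Hypothetical usage embeddings for one unknown word.
usages = {
    "u1": [1.0, 0.1], "u2": [0.9, 0.2],  # likely one sense
    "u3": [0.1, 1.0],                     # likely another
}
# Hypothetical dictionary-sense embeddings.
senses = {"sense_a": [1.0, 0.0], "sense_b": [0.0, 1.0]}

# Build a similarity graph; connected components (via union-find)
# are the usage clusters.
THRESHOLD = 0.8
nodes = list(usages)
parent = {n: n for n in nodes}

def find(x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

for i, a in enumerate(nodes):
    for b in nodes[i + 1:]:
        if cosine(usages[a], usages[b]) >= THRESHOLD:
            parent[find(a)] = find(b)

clusters = {}
for n in nodes:
    clusters.setdefault(find(n), []).append(n)

# Map each cluster to the dictionary sense closest to its centroid.
mapping = {}
for members in clusters.values():
    centroid = [sum(usages[m][d] for m in members) / len(members)
                for d in range(2)]
    best = max(senses, key=lambda s: cosine(centroid, senses[s]))
    for m in members:
        mapping[m] = best

print(mapping)
```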
arXiv Detail & Related papers (2024-06-02T07:57:45Z) - Exploring Alignment in Shared Cross-lingual Spaces [15.98134426166435]
We employ clustering to uncover latent concepts within multilingual models.
Our analysis focuses on quantifying the alignment and overlap of these concepts across various languages.
Our study encompasses three multilingual models (mT5, mBERT, and XLM-R) and three downstream tasks (Machine Translation, Named Entity Recognition, and Sentiment Analysis).
arXiv Detail & Related papers (2024-05-23T13:20:24Z) - Beyond Shared Vocabulary: Increasing Representational Word Similarities
across Languages for Multilingual Machine Translation [9.794506112999823]
In this paper, we define word-level information transfer pathways via word equivalence classes and rely on graph networks to fuse word embeddings across languages.
Our experiments demonstrate the advantages of our approach: 1) embeddings of words with similar meanings are better aligned across languages, 2) our method achieves consistent BLEU improvements of up to 2.3 points for high- and low-resource MNMT, and 3) less than 1.0% additional trainable parameters are required with a limited increase in computational costs.
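The equivalence-class idea can be illustrated in a few lines: words linked through a bilingual lexicon form a class, and each member's embedding is pulled toward the class mean. The lexicon, the toy vectors, and the mixing weight are illustrative assumptions; the paper's actual fusion uses graph networks rather than this simple averaging.

```python
# Hypothetical bilingual lexicon defining word equivalence classes.
lexicon = [("dog", "chien"), ("cat", "chat")]

# Toy word embeddings (values chosen to be exact binary fractions).
embeddings = {
    "dog": [1.0, 0.0], "chien": [0.5, 0.5],
    "cat": [0.0, 1.0], "chat": [0.25, 0.75],
}

# Group words connected by the lexicon into shared class lists.
classes = {}
for a, b in lexicon:
    cls = classes.get(a) or classes.get(b) or []
    for w in (a, b):
        if w not in cls:
            cls.append(w)
        classes[w] = cls

ALPHA = 0.5  # how strongly members are pulled toward the class mean
fused = {}
for w, vec in embeddings.items():
    members = classes.get(w, [w])
    mean = [sum(embeddings[m][d] for m in members) / len(members)
            for d in range(len(vec))]
    fused[w] = [(1 - ALPHA) * v + ALPHA * m for v, m in zip(vec, mean)]

print(fused["dog"], fused["chien"])
```

After fusion, "dog" and "chien" sit closer together than their originals, which is the cross-lingual alignment effect the paper reports.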
arXiv Detail & Related papers (2023-05-23T16:11:00Z) - Discovering Language-neutral Sub-networks in Multilingual Language
Models [15.94622051535847]
Language neutrality of multilingual models is a function of the overlap between language-encoding sub-networks of these models.
Using mBERT as a foundation, we employ the lottery ticket hypothesis to discover sub-networks that are individually optimized for various languages and tasks.
We conclude that mBERT comprises a language-neutral sub-network shared among many languages, along with multiple ancillary language-specific sub-networks.
arXiv Detail & Related papers (2022-05-25T11:35:41Z) - Transferring Knowledge Distillation for Multilingual Social Event
Detection [42.663309895263666]
Recently published graph neural networks (GNNs) show promising performance at social event detection tasks.
We present a GNN that incorporates cross-lingual word embeddings for detecting events in multilingual data streams.
Experiments on both synthetic and real-world datasets show the framework to be highly effective at detection both in multilingual data and in languages where training samples are scarce.
arXiv Detail & Related papers (2021-08-06T12:38:42Z) - VECO: Variable and Flexible Cross-lingual Pre-training for Language
Understanding and Generation [77.82373082024934]
We plug a cross-attention module into the Transformer encoder to explicitly build the interdependence between languages.
This effectively avoids the degeneration of predicting masked words conditioned only on the context in their own language.
The proposed cross-lingual model delivers new state-of-the-art results on various cross-lingual understanding tasks of the XTREME benchmark.
arXiv Detail & Related papers (2020-10-30T03:41:38Z) - Learning Contextualised Cross-lingual Word Embeddings and Alignments for
Extremely Low-Resource Languages Using Parallel Corpora [63.5286019659504]
We propose a new approach for learning contextualised cross-lingual word embeddings based on a small parallel corpus.
Our method obtains word embeddings via an LSTM encoder-decoder model that simultaneously translates and reconstructs an input sentence.
arXiv Detail & Related papers (2020-10-27T22:24:01Z) - Vokenization: Improving Language Understanding with Contextualized,
Visual-Grounded Supervision [110.66085917826648]
We develop a technique that extrapolates multimodal alignments to language-only data by contextually mapping language tokens to their related images.
The "vokenization" mapping is trained on relatively small image-captioning datasets, and we then apply it to generate vokens for large language corpora.
Trained with these contextually generated vokens, our visually-supervised language models show consistent improvements over self-supervised alternatives on multiple pure-language tasks.
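At its core, generating a voken is a retrieval step: a token's contextual embedding is matched against a bank of image embeddings and the nearest image becomes its visual supervision target. The vectors and file names below are toy assumptions standing in for the learned embedding spaces.

```python
def dot(u, v):
    # Similarity score between a token embedding and an image embedding.
    return sum(a * b for a, b in zip(u, v))

# Hypothetical token and image embeddings in a shared space.
token_vecs = {"cat": [1.0, 0.0], "sat": [0.0, 1.0]}
image_vecs = {"img_cat.jpg": [0.9, 0.1], "img_chair.jpg": [0.1, 0.9]}

def voken(token):
    """Retrieve the image whose embedding best matches the token."""
    vec = token_vecs[token]
    return max(image_vecs, key=lambda img: dot(vec, image_vecs[img]))

print([voken(t) for t in ["cat", "sat"]])
```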
arXiv Detail & Related papers (2020-10-14T02:11:51Z) - Bridging Linguistic Typology and Multilingual Machine Translation with
Multi-View Language Representations [83.27475281544868]
We use singular vector canonical correlation analysis to study what kind of information is induced from each source.
We observe that our representations embed typology and strengthen correlations with language relationships.
We then take advantage of our multi-view language vector space for multilingual machine translation, where we achieve competitive overall translation accuracy.
arXiv Detail & Related papers (2020-04-30T16:25:39Z) - On the Language Neutrality of Pre-trained Multilingual Representations [70.93503607755055]
We investigate the language-neutrality of multilingual contextual embeddings directly and with respect to lexical semantics.
Our results show that contextual embeddings are more language-neutral and, in general, more informative than aligned static word-type embeddings.
We show how to reach state-of-the-art accuracy on language identification and match the performance of statistical methods for word alignment of parallel sentences.
arXiv Detail & Related papers (2020-04-09T19:50:32Z) - Visual Grounding in Video for Unsupervised Word Translation [91.47607488740647]
We use visual grounding to improve unsupervised word mapping between languages.
We learn embeddings from unpaired instructional videos narrated in the native language.
We apply these methods to translate words from English to French, Korean, and Japanese.
arXiv Detail & Related papers (2020-03-11T02:03:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.