BiSync: A Bilingual Editor for Synchronized Monolingual Texts
- URL: http://arxiv.org/abs/2306.00400v1
- Date: Thu, 1 Jun 2023 07:03:47 GMT
- Title: BiSync: A Bilingual Editor for Synchronized Monolingual Texts
- Authors: Josep Crego, Jitao Xu, François Yvon
- Abstract summary: We present BiSync, a bilingual writing assistant that allows users to freely compose text in two languages.
We detail the model architecture used for synchronization and evaluate the resulting tool, showing that high accuracy can be attained with limited computational resources.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In our globalized world, a growing number of situations arise where people
are required to communicate in one or several foreign languages. In the case of
written communication, users with a good command of a foreign language may find
assistance from computer-aided translation (CAT) technologies. These
technologies often allow users to access external resources, such as
dictionaries, terminologies or bilingual concordancers, thereby interrupting
and considerably hindering the writing process. In addition, CAT systems assume
that the source sentence is fixed and also restrict the possible changes on the
target side. In order to make the writing process smoother, we present BiSync,
a bilingual writing assistant that allows users to freely compose text in two
languages, while maintaining the two monolingual texts synchronized. We also
include additional functionalities, such as the display of alternative prefix
translations and paraphrases, which are intended to facilitate the authoring of
texts. We detail the model architecture used for synchronization and evaluate
the resulting tool, showing that high accuracy can be attained with limited
computational resources. The interface and models are publicly available at
https://github.com/jmcrego/BiSync and a demonstration video can be watched on
YouTube at https://youtu.be/_l-ugDHfNgU .
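As a rough illustration of the synchronization behavior described above (a minimal sketch, not the authors' implementation: the helper names are hypothetical and off-the-shelf MarianMT models stand in for BiSync's own compact models), the core loop re-translates whichever side the user just edited:

```python
# Minimal sketch of a bilingual synchronization loop. Assumes the
# Hugging Face transformers library; MarianMT models are stand-ins.
from transformers import MarianMTModel, MarianTokenizer

_MODELS = {}

def _get_model(src, tgt):
    # Lazily load one translation model per direction.
    name = f"Helsinki-NLP/opus-mt-{src}-{tgt}"
    if (src, tgt) not in _MODELS:
        _MODELS[(src, tgt)] = (MarianTokenizer.from_pretrained(name),
                               MarianMTModel.from_pretrained(name))
    return _MODELS[(src, tgt)]

def synchronize(edited_text, edited_lang, other_lang, n_alternatives=1):
    """Re-translate the side the user just edited so both texts stay parallel."""
    tok, model = _get_model(edited_lang, other_lang)
    batch = tok([edited_text], return_tensors="pt")
    out = model.generate(**batch, num_beams=4, max_new_tokens=128,
                         num_return_sequences=n_alternatives)
    return [tok.decode(o, skip_special_tokens=True) for o in out]

# The user edits the English side; the French side is refreshed, with two
# extra n-best candidates as crude stand-ins for alternative translations.
fr_main, *fr_alts = synchronize("The tool keeps both texts in sync.",
                                "en", "fr", n_alternatives=3)
```

BiSync's alternative prefix translations would additionally constrain decoding to a target prefix the user has already validated; plain n-best output is only a crude approximation of that behavior.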
Related papers
- The Effect of Alignment Objectives on Code-Switching Translation
We propose a way of training a single machine translation model that can translate monolingual sentences from one language to another.
This model can be considered a bilingual model in the human sense.
arXiv Detail & Related papers (2023-09-10T14:46:31Z)
- Enhancing Cross-lingual Transfer via Phonemic Transcription Integration
PhoneXL is a framework incorporating phonemic transcriptions as an additional linguistic modality for cross-lingual transfer.
Our pilot study reveals that phonemic transcriptions provide essential information beyond the orthography to enhance cross-lingual transfer.
arXiv Detail & Related papers (2023-07-10T06:17:33Z)
- Soft Language Clustering for Multilingual Model Pre-training
We propose XLM-P, which contextually retrieves prompts as flexible guidance for encoding instances conditionally.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
arXiv Detail & Related papers (2023-06-13T08:08:08Z)
- Syntax-augmented Multilingual BERT for Cross-lingual Transfer
This work shows that explicitly providing language syntax during mBERT training helps cross-lingual transfer.
Experimental results show that syntax-augmented mBERT improves cross-lingual transfer on popular benchmarks.
arXiv Detail & Related papers (2021-06-03T21:12:50Z)
- VECO: Variable and Flexible Cross-lingual Pre-training for Language Understanding and Generation
We plug a cross-attention module into the Transformer encoder to explicitly build the interdependence between languages.
This effectively avoids degenerating into predicting masked words conditioned only on context from the same language.
The proposed cross-lingual model delivers new state-of-the-art results on various cross-lingual understanding tasks of the XTREME benchmark (a sketch of the cross-attention idea follows this entry).
arXiv Detail & Related papers (2020-10-30T03:41:38Z)
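As referenced above, here is a minimal sketch of plugging a cross-attention module into a Transformer encoder layer (hypothetical PyTorch code, not VECO's actual implementation):

```python
import torch
import torch.nn as nn

class CrossLingualEncoderLayer(nn.Module):
    """Hypothetical sketch: an encoder layer with a plug-in cross-attention
    module over the parallel sentence in the other language."""

    def __init__(self, d_model=512, nhead=8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.ReLU(),
                                 nn.Linear(4 * d_model, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.norm3 = nn.LayerNorm(d_model)

    def forward(self, x, parallel):
        # Usual self-attention over the sentence's own language.
        h, _ = self.self_attn(x, x, x)
        x = self.norm1(x + h)
        # Cross-attention over the parallel sentence, so masked words are not
        # predicted from same-language context alone.
        h, _ = self.cross_attn(x, parallel, parallel)
        x = self.norm2(x + h)
        return self.norm3(x + self.ffn(x))

# x: (batch, src_len, d_model); parallel: (batch, tgt_len, d_model)
layer = CrossLingualEncoderLayer()
out = layer(torch.randn(2, 7, 512), torch.randn(2, 9, 512))
```

During pre-training, masked positions can thus condition on the parallel sentence, which is what counteracts the degeneration the summary mentions.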
- Learning Contextualised Cross-lingual Word Embeddings and Alignments for Extremely Low-Resource Languages Using Parallel Corpora
We propose a new approach for learning contextualised cross-lingual word embeddings based on a small parallel corpus.
Our method obtains word embeddings via an LSTM encoder-decoder model that simultaneously translates and reconstructs an input sentence (a sketch of this joint objective follows this entry).
arXiv Detail & Related papers (2020-10-27T22:24:01Z)
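A minimal sketch of the translate-and-reconstruct objective referenced above (hypothetical PyTorch code, not the paper's implementation):

```python
import torch
import torch.nn as nn

class TranslateAndReconstruct(nn.Module):
    """Hypothetical sketch: one LSTM encoder shared by two decoders that
    translate and reconstruct the same input sentence."""

    def __init__(self, src_vocab, tgt_vocab, d=256):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, d)
        self.tgt_emb = nn.Embedding(tgt_vocab, d)
        self.encoder = nn.LSTM(d, d, batch_first=True)
        self.dec_translate = nn.LSTM(d, d, batch_first=True)
        self.dec_reconstruct = nn.LSTM(d, d, batch_first=True)
        self.to_tgt = nn.Linear(d, tgt_vocab)
        self.to_src = nn.Linear(d, src_vocab)

    def forward(self, src_ids, tgt_in, src_in):
        # Encode once; the final state initialises both decoders, which is
        # what pushes the encoder's embeddings to be cross-lingual.
        _, state = self.encoder(self.src_emb(src_ids))
        trans, _ = self.dec_translate(self.tgt_emb(tgt_in), state)    # teacher-forced
        recon, _ = self.dec_reconstruct(self.src_emb(src_in), state)  # teacher-forced
        return self.to_tgt(trans), self.to_src(recon)

# Training would sum two cross-entropy losses, one per decoder output.
model = TranslateAndReconstruct(src_vocab=1000, tgt_vocab=1200)
```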
- FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding
We propose an enhanced fusion method that takes cross-lingual data as input for XLM finetuning.
During inference, the model makes predictions based on the text input in the target language and its translation in the source language.
We further propose an additional KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for translated text in the target language (a sketch of this loss follows this entry).
arXiv Detail & Related papers (2020-09-10T22:42:15Z)
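A minimal sketch of the KL-divergence self-teaching loss referenced above (hypothetical PyTorch code; the pipeline that generates the soft pseudo-labels is omitted):

```python
import torch
import torch.nn.functional as F

def self_teaching_kl(student_logits, teacher_logits, temperature=1.0):
    """Hypothetical sketch: predictions on translated target-language text
    are pulled toward auto-generated soft pseudo-labels; the teacher pass
    is detached so it acts as a fixed target."""
    t = temperature
    teacher_probs = F.softmax(teacher_logits.detach() / t, dim=-1)
    student_logp = F.log_softmax(student_logits / t, dim=-1)
    # KL(teacher || student), averaged over the batch; the t*t factor keeps
    # gradient magnitudes comparable across temperatures.
    return F.kl_div(student_logp, teacher_probs, reduction="batchmean") * t * t

# Example: logits over 3 classes for a batch of 4 translated sentences.
loss = self_teaching_kl(torch.randn(4, 3), torch.randn(4, 3))
```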
- XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning
Cross-lingual Choice of Plausible Alternatives (XCOPA) is a typologically diverse multilingual dataset for causal commonsense reasoning in 11 languages.
We evaluate a range of state-of-the-art models on this novel dataset, revealing that the performance of current methods falls short compared to translation-based transfer.
arXiv Detail & Related papers (2020-05-01T12:22:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.