The boundaries of meaning: a case study in neural machine translation
- URL: http://arxiv.org/abs/2210.00613v1
- Date: Sun, 2 Oct 2022 20:26:20 GMT
- Title: The boundaries of meaning: a case study in neural machine translation
- Authors: Yuri Balashov
- Abstract summary: Subword segmentation algorithms have been widely employed in language modeling, machine translation, and other tasks since 2016.
These algorithms often cut words into semantically opaque pieces, such as 'period', 'on', 't', and 'ist' in 'period|on|t|ist'.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The success of deep learning in natural language processing raises intriguing
questions about the nature of linguistic meaning and ways in which it can be
processed by natural and artificial systems. One such question has to do with
subword segmentation algorithms widely employed in language modeling, machine
translation, and other tasks since 2016. These algorithms often cut words into
semantically opaque pieces, such as 'period', 'on', 't', and 'ist' in
'period|on|t|ist'. The system then represents the resulting segments in a dense
vector space, which is expected to model grammatical relations among them. This
representation may in turn be used to map 'period|on|t|ist' (English) to
'par|od|ont|iste' (French). Thus, instead of being modeled at the lexical
level, translation is reformulated more generally as the task of learning the
best bilingual mapping between the sequences of subword segments of two
languages; and sometimes even between pure character sequences:
'p|e|r|i|o|d|o|n|t|i|s|t' $\rightarrow$ 'p|a|r|o|d|o|n|t|i|s|t|e'. Such subword
segmentations and alignments are at work in highly efficient end-to-end machine
translation systems, despite their allegedly opaque nature. The computational
value of such processes is unquestionable. But do they have any linguistic or
philosophical plausibility? I attempt to cast light on this question by
reviewing the relevant details of the subword segmentation algorithms and by
relating them to important philosophical and linguistic debates, in the spirit
of making artificial intelligence more transparent and explainable.
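The segmentation the abstract describes can be sketched with a toy greedy longest-match tokenizer. This is a minimal sketch, not the algorithm the paper analyzes: real systems (BPE, unigram LM) learn their vocabularies from data, whereas the hand-picked vocabularies below are hypothetical stand-ins chosen to reproduce the paper's example.

```python
# Toy sketch of subword segmentation: greedy longest-match against a
# fixed vocabulary. Real systems (BPE, unigram LM) learn the vocabulary
# from data; these hand-picked pieces are hypothetical stand-ins.
EN_VOCAB = {"period", "on", "t", "ist"}
FR_VOCAB = {"par", "od", "ont", "iste"}

def segment(word, vocab):
    """Split `word` into the longest vocabulary pieces, left to right."""
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):  # try the longest piece first
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            pieces.append(word[i])  # unknown character: fall back to itself
            i += 1
    return pieces

print("|".join(segment("periodontist", EN_VOCAB)))  # period|on|t|ist
print("|".join(segment("parodontiste", FR_VOCAB)))  # par|od|ont|iste
```

Translation in such a system then amounts to learning a mapping between the two segment sequences rather than between whole lexical items.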
Related papers
- Pixel Sentence Representation Learning [67.4775296225521]
In this work, we conceptualize the learning of sentence-level textual semantics as a visual representation learning process.
We employ visually-grounded text perturbation methods like typos and word order shuffling, resonating with human cognitive patterns, and enabling perturbation to be perceived as continuous.
Our approach is further bolstered by large-scale unsupervised topical alignment training and natural language inference supervision.
arXiv Detail & Related papers (2024-02-13T02:46:45Z)
- Word class representations spontaneously emerge in a deep neural network trained on next word prediction [7.240611820374677]
How do humans learn language, and can the first language be learned at all?
These fundamental questions are still hotly debated.
To probe this, we train an artificial deep neural network to predict the next word.
We find that the internal representations of nine-word input sequences cluster according to the word class of the tenth word to be predicted as output.
arXiv Detail & Related papers (2023-02-15T11:02:50Z)
- On the Role of Morphological Information for Contextual Lemmatization [7.106986689736827]
We investigate the role of morphological information in developing contextual lemmatizers for six languages: Basque, Turkish, Russian, Czech, Spanish, and English.
Experiments suggest that the best lemmatizers out-of-domain are those using simple UPOS tags or those trained without morphology.
arXiv Detail & Related papers (2023-02-01T12:47:09Z)
- Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not represent natural language semantics well.
arXiv Detail & Related papers (2022-10-14T02:35:19Z)
- Context based lemmatizer for Polish language [0.0]
Lemmatization is the process of grouping together the inflected forms of a word so they can be analysed as a single item.
In computational linguistics, lemmatisation is the algorithmic process of determining the lemma of a word based on its intended meaning.
The model achieves the best results for the Polish lemmatisation task.
arXiv Detail & Related papers (2022-07-23T18:02:16Z)
- A Paradigm Change for Formal Syntax: Computational Algorithms in the Grammar of English [0.0]
We turn to programming languages as models for a process-based syntax of English.
We chose the combination of a function word and a content word as the target of modeling.
The fit of the model was tested by deriving three functional characteristics crucial for the algorithm and checking their presence in English grammar.
arXiv Detail & Related papers (2022-05-24T07:28:47Z)
- Generalized Optimal Linear Orders [9.010643838773477]
The sequential structure of language, and the order of words in a sentence specifically, plays a central role in human language processing.
In designing computational models of language, the de facto approach is to present sentences to machines with the words ordered in the same order as in the original human-authored sentence.
The very essence of this work is to question the implicit assumption that this is desirable and inject theoretical soundness into the consideration of word order in natural language processing.
arXiv Detail & Related papers (2021-08-13T13:10:15Z)
- Provable Limitations of Acquiring Meaning from Ungrounded Form: What will Future Language Models Understand? [87.20342701232869]
We investigate the abilities of ungrounded systems to acquire meaning.
We study whether assertions enable a system to emulate representations preserving semantic relations like equivalence.
We find that assertions enable semantic emulation if all expressions in the language are referentially transparent.
However, if the language uses non-transparent patterns like variable binding, we show that emulation can become an uncomputable problem.
arXiv Detail & Related papers (2021-04-22T01:00:17Z)
- Infusing Finetuning with Semantic Dependencies [62.37697048781823]
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning.
arXiv Detail & Related papers (2020-12-10T01:27:24Z)
- Intrinsic Probing through Dimension Selection [69.52439198455438]
Most modern NLP systems make use of pre-trained contextual representations that attain astonishingly high performance on a variety of tasks.
Such high performance should not be possible unless some form of linguistic structure inheres in these representations, and a wealth of research has sprung up on probing for it.
In this paper, we draw a distinction between intrinsic probing, which examines how linguistic information is structured within a representation, and the extrinsic probing popular in prior work, which only argues for the presence of such information by showing that it can be successfully extracted.
arXiv Detail & Related papers (2020-10-06T15:21:08Z)
- Information-Theoretic Probing for Linguistic Structure [74.04862204427944]
We propose an information-theoretic operationalization of probing as estimating mutual information.
We evaluate on a set of ten typologically diverse languages often underrepresented in NLP research.
arXiv Detail & Related papers (2020-04-07T01:06:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.