Semantically Informed Slang Interpretation
- URL: http://arxiv.org/abs/2205.00616v1
- Date: Mon, 2 May 2022 01:51:49 GMT
- Title: Semantically Informed Slang Interpretation
- Authors: Zhewei Sun, Richard Zemel, Yang Xu
- Abstract summary: We propose a semantically informed slang interpretation (SSI) framework that considers jointly the contextual and semantic appropriateness of a candidate interpretation for a query slang.
We show how the same framework can be applied to enhancing machine translation of slang from English to other languages.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Slang is a predominant form of informal language making flexible and extended
use of words that is notoriously hard for natural language processing systems
to interpret. Existing approaches to slang interpretation tend to rely on
context but ignore semantic extensions common in slang word usage. We propose a
semantically informed slang interpretation (SSI) framework that considers
jointly the contextual and semantic appropriateness of a candidate
interpretation for a query slang. We perform rigorous evaluation on two
large-scale online slang dictionaries and show that our approach not only
achieves state-of-the-art accuracy for slang interpretation in English, but
also does so in zero-shot and few-shot scenarios where training data is sparse.
Furthermore, we show how the same framework can be applied to enhancing machine
translation of slang from English to other languages. Our work creates
opportunities for the automated interpretation and translation of informal
language.
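The joint scoring idea described in the abstract can be sketched as a log-linear combination of a contextual score and a semantic-appropriateness score for each candidate interpretation. This is a minimal, hypothetical sketch: the function names, candidate list, scores, and weighting are illustrative assumptions, not the authors' actual SSI model.

```python
import math

def interpret_slang(candidates, contextual_score, semantic_score, alpha=0.5):
    """Rank candidate interpretations of a slang usage by a joint score
    that mixes contextual fit and semantic appropriateness (log-linear)."""
    return sorted(
        candidates,
        key=lambda c: alpha * math.log(contextual_score[c])
                    + (1 - alpha) * math.log(semantic_score[c]),
        reverse=True,
    )

# Toy example: interpreting "sick" in "that trick was sick"
candidates = ["excellent", "ill", "disgusting"]
contextual = {"excellent": 0.5, "ill": 0.3, "disgusting": 0.2}  # e.g. from a context model
semantic = {"excellent": 0.6, "ill": 0.1, "disgusting": 0.3}    # sense-extension plausibility

print(interpret_slang(candidates, contextual, semantic))  # "excellent" ranks highest
```

The point of the combination is that a candidate must be plausible on both axes: an interpretation that fits the sentence context but is an implausible semantic extension of the word's conventional sense is ranked down, and vice versa.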
Related papers
- That was the last straw, we need more: Are Translation Systems Sensitive to Disambiguating Context? [64.38544995251642]
We study semantic ambiguities that exist in the source (English in this work) itself.
We focus on idioms that are open to both literal and figurative interpretations.
We find that current MT models consistently translate English idioms literally, even when the context suggests a figurative interpretation.
arXiv Detail & Related papers (2023-10-23T06:38:49Z)
- Crossing the Threshold: Idiomatic Machine Translation through Retrieval Augmentation and Loss Weighting [66.02718577386426]
We provide a simple characterization of idiomatic translation and related issues.
We conduct a synthetic experiment revealing a tipping point at which transformer-based machine translation models correctly default to idiomatic translations.
To improve translation of natural idioms, we introduce two straightforward yet effective techniques.
arXiv Detail & Related papers (2023-10-10T23:47:25Z) - A Study of Slang Representation Methods [3.511369967593153]
We study different combinations of representation learning models and knowledge resources for a variety of downstream tasks that rely on slang understanding.
Our error analysis identifies core challenges for slang representation learning, including out-of-vocabulary words, polysemy, variance, and annotation disagreements.
arXiv Detail & Related papers (2022-12-11T21:56:44Z) - Tracing Semantic Variation in Slang [3.437479039185694]
Slang semantic variation is not well understood and remains under-explored in natural language processing.
One existing view argues that slang semantic variation is driven by culture-dependent communicative needs.
An alternative view focuses on slang's social functions, suggesting that the desire to foster semantic distinction may have driven the historical emergence of community-specific slang senses.
arXiv Detail & Related papers (2022-10-16T20:51:14Z)
- Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not represent natural language semantics well.
arXiv Detail & Related papers (2022-10-14T02:35:19Z)
- UAlberta at SemEval 2022 Task 2: Leveraging Glosses and Translations for Multilingual Idiomaticity Detection [4.66831886752751]
We describe the University of Alberta systems for the SemEval-2022 Task 2 on multilingual idiomaticity detection.
Under the assumption that idiomatic expressions are noncompositional, our first method integrates information on the meanings of the individual words of an expression into a binary classifier.
Our second method translates an expression in context, and uses a lexical knowledge base to determine if the translation is literal.
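The second method's translation-literalness check could be sketched roughly as follows. This is a hypothetical illustration: the tiny bilingual lexicon and helper name are invented here, whereas the actual system relies on a machine translation model and a lexical knowledge base.

```python
# Minimal sketch of a translation-based literalness check: if every word of
# the expression's translation is a per-word dictionary translation of some
# word in the source expression, the expression was likely rendered literally.
LEXICON = {  # tiny hypothetical English -> Spanish word-level lexicon
    "kick": {"patear"},
    "the": {"el", "la"},
    "bucket": {"cubo", "balde"},
}

def looks_literal(expression, translation):
    """Return True if each word of the translation appears among the
    word-level translations of the source expression's words."""
    source_words = expression.lower().split()
    allowed = set().union(*(LEXICON.get(w, set()) for w in source_words))
    return all(w in allowed for w in translation.lower().split())

print(looks_literal("kick the bucket", "patear el cubo"))  # literal rendering -> True
print(looks_literal("kick the bucket", "morir"))           # idiomatic rendering -> False
```

An idiomatic translation ("morir", i.e. "to die") shares no word-level translations with the source, so the check flags it as non-literal.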
arXiv Detail & Related papers (2022-05-27T16:35:00Z)
- Testing the Ability of Language Models to Interpret Figurative Language [69.59943454934799]
Figurative and metaphorical language are commonplace in discourse.
It remains an open question to what extent modern language models can interpret nonliteral phrases.
We introduce Fig-QA, a Winograd-style nonliteral language understanding task.
arXiv Detail & Related papers (2022-04-26T23:42:22Z)
- Provable Limitations of Acquiring Meaning from Ungrounded Form: What will Future Language Models Understand? [87.20342701232869]
We investigate the abilities of ungrounded systems to acquire meaning.
We study whether assertions enable a system to emulate representations preserving semantic relations like equivalence.
We find that assertions enable semantic emulation if all expressions in the language are referentially transparent.
However, if the language uses non-transparent patterns like variable binding, we show that emulation can become an uncomputable problem.
arXiv Detail & Related papers (2021-04-22T01:00:17Z)
- A Computational Framework for Slang Generation [2.1813490315521773]
We take an initial step toward machine generation of slang by developing a framework that models the speaker's word choice in slang context.
Our framework encodes novel slang meaning by relating the conventional and slang senses of a word.
We perform rigorous evaluations on three slang dictionaries and show that our approach outperforms state-of-the-art language models.
arXiv Detail & Related papers (2021-02-03T01:19:07Z)
- Speakers Fill Lexical Semantic Gaps with Context [65.08205006886591]
We operationalise the lexical ambiguity of a word as the entropy of meanings it can take.
We find significant correlations between our estimate of ambiguity and the number of synonyms a word has in WordNet.
This suggests that, in the presence of ambiguity, speakers compensate by making contexts more informative.
arXiv Detail & Related papers (2020-10-05T17:19:10Z)
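The entropy operationalization of lexical ambiguity used in the last paper above can be sketched in a few lines. The sense distributions here are made-up illustrative numbers, not estimates from the paper.

```python
import math

def sense_entropy(sense_probs):
    """Shannon entropy (in bits) of a word's meaning distribution:
    the higher the entropy, the more ambiguous the word."""
    return -sum(p * math.log2(p) for p in sense_probs if p > 0)

# Hypothetical sense distributions
bank = [0.6, 0.3, 0.1]   # financial institution, riverbank, verb "to bank"
table = [0.95, 0.05]     # furniture vs. data table

print(sense_entropy(bank))   # ~1.30 bits: fairly ambiguous
print(sense_entropy(table))  # ~0.29 bits: dominated by one sense
```

Under this measure, a word with a single sense has zero entropy, and the finding above amounts to a correlation between this entropy and the informativeness of the contexts in which the word appears.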
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences arising from its use.