A Computational Framework for Slang Generation
- URL: http://arxiv.org/abs/2102.01826v1
- Date: Wed, 3 Feb 2021 01:19:07 GMT
- Title: A Computational Framework for Slang Generation
- Authors: Zhewei Sun, Richard Zemel, Yang Xu
- Abstract summary: We take an initial step toward machine generation of slang by developing a framework that models the speaker's word choice in slang context.
Our framework encodes novel slang meaning by relating the conventional and slang senses of a word.
We perform rigorous evaluations on three slang dictionaries and show that our approach outperforms state-of-the-art language models.
- Score: 2.1813490315521773
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Slang is a common type of informal language, but its flexible nature and
paucity of data resources present challenges for existing natural language
systems. We take an initial step toward machine generation of slang by
developing a framework that models the speaker's word choice in slang contexts.
Our framework encodes novel slang meaning by relating the conventional and
slang senses of a word while incorporating syntactic and contextual knowledge
in slang usage. We construct the framework using a combination of probabilistic
inference and neural contrastive learning. We perform rigorous evaluations on
three slang dictionaries and show that our approach not only outperforms
state-of-the-art language models, but also better predicts the historical
emergence of slang word usages from the 1960s to the 2000s. We interpret the proposed
models and find that the contrastively learned semantic space is sensitive to
the similarities between slang and conventional senses of words. Our work
creates opportunities for the automated generation and interpretation of
informal language.
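As a reading aid, the two ingredients named in the abstract, a contrastively learned semantic space relating slang and conventional senses and a probabilistic model of word choice, can be caricatured in a few lines. The vocabulary, dimensions, InfoNCE-style loss, and softmax choice rule below are illustrative assumptions, not the paper's implementation:
```python
# Toy sketch: (1) contrastively align slang senses with conventional senses,
# (2) pick a word for a novel slang meaning via softmax over similarities.
# Vocabulary, dimensions, and hyperparameters are illustrative assumptions.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
dim = 32
vocab = ["ice", "fire", "bread"]
conv = torch.randn(len(vocab), dim)                       # conventional-sense vectors (fixed)
slang = torch.randn(len(vocab), dim, requires_grad=True)  # observed slang-sense vectors (tuned)

proj = torch.nn.Linear(dim, dim)  # projection into the shared semantic space
opt = torch.optim.Adam(list(proj.parameters()) + [slang], lr=1e-2)

for step in range(200):
    z_conv = F.normalize(proj(conv), dim=-1)
    z_slang = F.normalize(proj(slang), dim=-1)
    # InfoNCE-style loss: each word's slang sense should be closer to its own
    # conventional sense than to the other words' conventional senses.
    logits = z_slang @ z_conv.T / 0.1
    loss = F.cross_entropy(logits, torch.arange(len(vocab)))
    opt.zero_grad()
    loss.backward()
    opt.step()

# Probabilistic word choice: given a novel slang meaning, softmax over
# similarities to each word's conventional sense in the learned space.
novel_meaning = torch.randn(1, dim)
with torch.no_grad():
    q = F.normalize(proj(novel_meaning), dim=-1)
    p_choice = F.softmax(q @ F.normalize(proj(conv), dim=-1).T / 0.1, dim=-1)
print({w: round(p.item(), 3) for w, p in zip(vocab, p_choice[0])})
```
The design point the sketch tries to capture is that training shapes a shared space in which slang senses sit near the conventional senses of the words that acquired them, so proximity in that space can drive a probabilistic choice among candidate words.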
Related papers
- A Study of Slang Representation Methods [3.511369967593153]
We study different combinations of representation learning models and knowledge resources for a variety of downstream tasks that rely on slang understanding.
Our error analysis identifies core challenges for slang representation learning, including out-of-vocabulary words, polysemy, variance, and annotation disagreements.
arXiv Detail & Related papers (2022-12-11T21:56:44Z)
- Tracing Semantic Variation in Slang [3.437479039185694]
Slang semantic variation is not well understood and remains under-explored in the natural language processing of slang.
One existing view argues that slang semantic variation is driven by culture-dependent communicative needs.
An alternative view focuses on slang's social functions, suggesting that the desire to foster semantic distinction may have led to the historical emergence of community-specific slang senses.
arXiv Detail & Related papers (2022-10-16T20:51:14Z)
- Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not represent natural language semantics well.
arXiv Detail & Related papers (2022-10-14T02:35:19Z)
- The Whole Truth and Nothing But the Truth: Faithful and Controllable Dialogue Response Generation with Dataflow Transduction and Constrained Decoding [65.34601470417967]
We describe a hybrid architecture for dialogue response generation that combines the strengths of neural language modeling and rule-based generation.
Our experiments show that this system outperforms both rule-based and learned approaches in human evaluations of fluency, relevance, and truthfulness.
arXiv Detail & Related papers (2022-09-16T09:00:49Z)
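The constrained-decoding half of the system above can be sketched generically (the dataflow-transduction half is not shown): at each step, next-token probability mass is restricted to tokens the rule-based component permits. The toy language model, vocabulary, and allowed-response set below are invented for illustration:
```python
# Toy sketch of constrained decoding: a rule-based constraint (here a set of
# allowed responses) masks the language model's next-token choices each step.
# The tiny "LM" and vocabulary are stand-ins, not the paper's system.
import math
import random

VOCAB = ["yes", "no", "maybe", "<eos>"]
ALLOWED = {("yes", "<eos>"), ("no", "<eos>")}  # responses the rules permit

def toy_lm_logits(prefix):
    # Stand-in LM: it actually prefers "maybe", which the constraint forbids.
    return {"yes": 1.0, "no": 0.8, "maybe": 2.0, "<eos>": 0.5}

def allowed_next(prefix):
    # Tokens that can extend `prefix` toward some fully allowed response.
    return {resp[len(prefix)] for resp in ALLOWED
            if resp[:len(prefix)] == prefix and len(resp) > len(prefix)}

def constrained_decode(seed=0):
    random.seed(seed)
    prefix = ()
    while not prefix or prefix[-1] != "<eos>":
        logits = toy_lm_logits(prefix)
        legal = allowed_next(prefix)
        # Constrained step: renormalize probability mass over legal tokens only.
        weights = [math.exp(logits[t]) if t in legal else 0.0 for t in VOCAB]
        total = sum(weights)
        next_tok = random.choices(VOCAB, [w / total for w in weights])[0]
        prefix += (next_tok,)
    return " ".join(prefix)

print(constrained_decode())  # always "yes <eos>" or "no <eos>", never "maybe"
```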
- Noun2Verb: Probabilistic frame semantics for word class conversion [8.939269057094661]
We present a formal framework that simulates the production and comprehension of novel denominal verb usages.
We show that a model in which the speaker and listener cooperatively learn the joint distribution over semantic frame elements better explains the empirically observed denominal verb usages.
arXiv Detail & Related papers (2022-05-12T19:16:12Z)
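The cooperative speaker-listener formulation above is in the spirit of probabilistic pragmatics. Below is a minimal RSA-flavored sketch, my illustration rather than the paper's model, of a listener inferring the semantic frame behind a novel denominal verb ("to porch the paper") and a speaker reasoning about that listener:
```python
# RSA-flavored toy (my illustration, not the paper's model): a literal
# listener L0, a speaker S1 that reasons about L0, and a pragmatic listener
# L1 that reasons about S1, over a tiny denominal-verb example.
import numpy as np

utterances = ["to porch the paper", "to throw the paper"]
frames = ["LOCATION(porch)", "MANNER(throw)"]
# Literal compatibility between utterances (rows) and frames (columns).
lexicon = np.array([[1.0, 0.2],   # "to porch ..." mostly evokes a LOCATION frame
                    [0.1, 1.0]])  # "to throw ..." mostly evokes a MANNER frame

def normalize(m, axis):
    return m / m.sum(axis=axis, keepdims=True)

L0 = normalize(lexicon, axis=1)        # literal listener: P(frame | utterance)
S1 = normalize(L0.T ** 4.0, axis=1)    # speaker: P(utterance | frame), rationality 4
L1 = normalize(S1.T, axis=1)           # pragmatic listener with a uniform frame prior

for u, row in zip(utterances, L1):
    print(u, "->", {f: round(p, 3) for f, p in zip(frames, row)})
```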
- Semantically Informed Slang Interpretation [2.9097456604613745]
We propose a semantically informed slang interpretation (SSI) framework that jointly considers the contextual and semantic appropriateness of a candidate interpretation for a slang query.
We show how the same framework can be applied to enhancing machine translation of slang from English to other languages.
arXiv Detail & Related papers (2022-05-02T01:51:49Z)
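A minimal sketch of the joint-appropriateness idea above: each candidate interpretation gets a contextual-fit term and a semantic-fit term, combined log-linearly. The candidate glosses for the slang "ice", the scores, and the weight are invented; SSI's actual scoring functions differ:
```python
# Toy joint scoring of candidate interpretations for the slang query "ice":
# a contextual-fit term and a semantic-fit term combined log-linearly.
# Candidates, scores, and the weight lam are invented for illustration.
import math

candidates = ["excellent", "cold", "diamond jewelry"]
context_fit = {"excellent": -1.2, "cold": -3.5, "diamond jewelry": -2.0}   # log P(candidate | sentence context)
semantic_fit = {"excellent": -2.8, "cold": -0.9, "diamond jewelry": -0.7}  # log-similarity to known senses of "ice"

lam = 0.5  # interpolation between the two evidence sources
scores = {c: lam * context_fit[c] + (1 - lam) * semantic_fit[c] for c in candidates}
logZ = math.log(sum(math.exp(s) for s in scores.values()))
posterior = {c: round(math.exp(s - logZ), 3) for c, s in scores.items()}
print(max(posterior, key=posterior.get), posterior)
```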
- Lexical semantic change for Ancient Greek and Latin [61.69697586178796]
Associating a word's correct meaning in its historical context is a central challenge in diachronic research.
We build on a recent computational approach to semantic change based on a dynamic Bayesian mixture model.
We provide a systematic comparison of dynamic Bayesian mixture models for semantic change with state-of-the-art embedding-based models.
arXiv Detail & Related papers (2021-01-22T12:04:08Z)
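The mixture-model idea above can be gestured at with a toy version: usages from each period are soft-assigned to fixed sense vectors, and per-period sense proportions are smoothed across time, the "dynamic" part. A real dynamic Bayesian mixture model infers senses and their trajectories jointly; everything below is a simplification:
```python
# Toy diachronic mixture: usages per period are soft-assigned to K=2 fixed
# sense vectors; per-period sense proportions are then smoothed across time.
# A real dynamic Bayesian mixture infers all of this jointly.
import numpy as np

rng = np.random.default_rng(0)
senses = np.array([[1.0, 0.0], [0.0, 1.0]])  # two fixed sense vectors

# Synthetic usage embeddings that drift from sense 0 toward sense 1 over time.
periods = [rng.normal(senses[0], 0.3, (20, 2)),
           rng.normal(senses.mean(0), 0.3, (20, 2)),
           rng.normal(senses[1], 0.3, (20, 2))]

def responsibilities(X):
    # Soft assignment of each usage to each sense (isotropic Gaussian likelihoods).
    d = ((X[:, None, :] - senses[None]) ** 2).sum(-1)
    w = np.exp(-d / (2 * 0.3 ** 2))
    return w / w.sum(1, keepdims=True)

raw = np.array([responsibilities(X).mean(0) for X in periods])  # per-period sense shares
smooth = raw.copy()
for t in range(1, len(raw) - 1):
    # Temporal coupling: shrink each period's shares toward its neighbours.
    smooth[t] = 0.5 * raw[t] + 0.25 * (raw[t - 1] + raw[t + 1])
print(np.round(smooth, 2))  # sense 1's share rises across the three periods
```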
- Infusing Finetuning with Semantic Dependencies [62.37697048781823]
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning.
arXiv Detail & Related papers (2020-12-10T01:27:24Z)
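One concrete reading of "convolutional graph encoders over semantic parses" is a graph-convolution layer applied to token states along parse edges, with the output fused into the representations used for finetuning. The toy graph, shapes, and fusion-by-concatenation below are assumptions:
```python
# Toy graph-convolution pass over a semantic dependency graph, fused with
# stand-in pretrained token states by concatenation. Graph, shapes, and the
# fusion scheme are illustrative assumptions.
import torch
import torch.nn.functional as F

n, d = 4, 16
edges = [(1, 0), (1, 3), (3, 2)]  # semantic dependencies (head, dependent)

A = torch.eye(n)  # adjacency with self-loops
for h, m in edges:
    A[h, m] = A[m, h] = 1.0  # treat the parse graph as undirected
deg = A.sum(1)
A_hat = A / torch.sqrt(deg[:, None] * deg[None, :])  # symmetric normalization

H = torch.randn(n, d)                    # stand-in for pretrained token states
W = torch.nn.Linear(d, d, bias=False)    # graph-convolution weights

H_sem = F.relu(A_hat @ W(H))             # H' = ReLU(A_hat @ H @ W)
fused = torch.cat([H, H_sem], dim=-1)    # passed on to the task head in finetuning
print(fused.shape)                       # torch.Size([4, 32])
```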
- Knowledge Injection into Dialogue Generation via Language Models [85.65843021510521]
InjK is a two-stage approach for injecting knowledge into a dialogue generation model.
First, we train a large-scale language model and query it for textual knowledge.
Second, we frame a dialogue generation model to sequentially generate textual knowledge and a corresponding response.
arXiv Detail & Related papers (2020-04-30T07:31:24Z)
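The two-stage recipe above reduces to a simple pipeline shape: a knowledge model emits text, and a response model conditions on it. The stand-in functions below mirror only the interface, not InjK's trained components:
```python
# Pipeline-shaped stand-ins for the two stages; neither function is InjK's
# trained model, they only mirror the generate-knowledge-then-respond flow.
def knowledge_model(dialogue_context: str) -> str:
    # Stage 1: in the paper, a large trained LM is queried for textual knowledge.
    return "The Eiffel Tower is about 330 metres tall."

def response_model(dialogue_context: str, knowledge: str) -> str:
    # Stage 2: generate the reply conditioned on both context and knowledge.
    return f"From what I know ({knowledge}), roughly 330 metres."

context = "User: How tall is the Eiffel Tower?"
k = knowledge_model(context)        # sequentially generate textual knowledge...
print(response_model(context, k))   # ...then the knowledge-grounded response
```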
- Word Sense Disambiguation for 158 Languages using Word Embeddings Only [80.79437083582643]
Disambiguation of word senses in context is easy for humans, but a major challenge for automatic approaches.
We present a method that takes as input a standard pre-trained word embedding model and induces a fully-fledged word sense inventory.
We use this method to induce a collection of sense inventories for 158 languages on the basis of the original pre-trained fastText word embeddings.
arXiv Detail & Related papers (2020-03-14T14:50:04Z)
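A sketch of embedding-only sense induction in that spirit: cluster a target word's nearest neighbours into sense groups, then disambiguate a context by matching it against the group centroids. The toy vectors below stand in for pretrained fastText embeddings, and plain k-means stands in for the paper's own induction procedure:
```python
# Toy embedding-only sense induction for "bank": cluster its nearest
# neighbours into sense groups, then match a context to the closest group.
# Random 2-D vectors stand in for pretrained fastText embeddings, and plain
# k-means stands in for the paper's own induction procedure.
import numpy as np

rng = np.random.default_rng(1)
words = ["money", "loan", "deposit", "river", "shore", "water"]  # neighbours of "bank"
emb = {w: rng.normal([1, 0] if i < 3 else [0, 1], 0.1, 2)
       for i, w in enumerate(words)}

# Induce a two-sense inventory with 2-means over the neighbour vectors.
X = np.stack([emb[w] for w in words])
centroids = X[[0, 3]]
for _ in range(10):
    assign = np.argmin(((X[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
    centroids = np.stack([X[assign == k].mean(0) for k in range(2)])
inventory = {k: [w for w, a in zip(words, assign) if a == k] for k in range(2)}
print("induced senses:", inventory)

# Disambiguate: average the context words' vectors, pick the nearest sense.
ctx_vec = np.mean([emb[w] for w in ["water", "shore"]], axis=0)
sense = int(np.argmin(((centroids - ctx_vec) ** 2).sum(-1)))
print("context sense:", inventory[sense])
```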