Decrypting Cryptic Crosswords: Semantically Complex Wordplay Puzzles as
a Target for NLP
- URL: http://arxiv.org/abs/2104.08620v1
- Date: Sat, 17 Apr 2021 18:54:00 GMT
- Title: Decrypting Cryptic Crosswords: Semantically Complex Wordplay Puzzles as
a Target for NLP
- Authors: Josh Rozner, Christopher Potts, Kyle Mahowald
- Abstract summary: Cryptic crosswords are the dominant English-language crossword variety in the United Kingdom.
We present a dataset of cryptic crossword clues that can be used as a benchmark and train a sequence-to-sequence model to solve them.
We show that performance can be substantially improved using a novel curriculum learning approach.
- Score: 5.447716844779342
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cryptic crosswords, the dominant English-language crossword variety in the
United Kingdom, can be solved by expert humans using flexible, creative
intelligence and knowledge of language. Cryptic clues read like fluent natural
language, but they are adversarially composed of two parts: a definition and a
wordplay cipher requiring sub-word or character-level manipulations. As such,
they are a promising target for evaluating and advancing NLP systems that seek
to process language in more creative, human-like ways. We present a dataset of
cryptic crossword clues from a major newspaper that can be used as a benchmark
and train a sequence-to-sequence model to solve them. We also develop related
benchmarks that can guide development of approaches to this challenging task.
We show that performance can be substantially improved using a novel curriculum
learning approach in which the model is pre-trained on related tasks involving,
e.g., unscrambling words, before it is trained to solve cryptics. However, even
this curricular approach does not generalize to novel clue types in the way
that humans can, and so cryptic crosswords remain a challenge for NLP systems
and a potential source of future innovation.
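The curriculum idea in the abstract (pre-train on a related task such as unscrambling words, then train on cryptic clues) can be illustrated with a small data-preparation sketch. This is not the authors' code: the function names, the `unscramble:` prompt prefix, the stage labels, and the toy clue are all assumptions made for illustration, and a real setup would feed the resulting pairs to a sequence-to-sequence model.

```python
import random


def make_unscramble_examples(words, seed=0):
    """Build (input, target) pairs for a hypothetical unscrambling warm-up task."""
    rng = random.Random(seed)
    examples = []
    for word in words:
        letters = list(word)
        rng.shuffle(letters)  # scramble the letters of the target word
        examples.append((f"unscramble: {''.join(letters)}", word))
    return examples


def build_curriculum(warmup_examples, cryptic_examples):
    """Order the data so every warm-up example precedes every cryptic clue."""
    staged = [("warmup", ex) for ex in warmup_examples]
    staged += [("cryptic", ex) for ex in cryptic_examples]
    return staged


# Toy inputs; the clue below is invented for illustration, not taken from the dataset.
words = ["listen", "cipher", "puzzle"]
warmup = make_unscramble_examples(words)
cryptic = [("Couple heard fruit (4)", "pear")]
schedule = build_curriculum(warmup, cryptic)

for stage, (source, target) in schedule:
    print(stage, "|", source, "->", target)
```

Keeping the stage label next to each pair makes it easy to swap in other warm-up tasks or to change the ordering without touching the training loop.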
Related papers
- Harnessing the Intrinsic Knowledge of Pretrained Language Models for Challenging Text Classification Settings [5.257719744958367]
This thesis explores three challenging settings in text classification by leveraging the intrinsic knowledge of pretrained language models (PLMs).
We develop models that utilize features based on contextualized word representations from PLMs, achieving performance that rivals or surpasses human accuracy.
Lastly, we tackle the sensitivity of large language models to in-context learning prompts by selecting effective demonstrations.
arXiv Detail & Related papers (2024-08-28T09:07:30Z)
- A Novel Cartography-Based Curriculum Learning Method Applied on RoNLI: The First Romanian Natural Language Inference Corpus [71.77214818319054]
Natural language inference is a proxy for natural language understanding.
There is no publicly available NLI corpus for the Romanian language.
We introduce the first Romanian NLI corpus (RoNLI) comprising 58K training sentence pairs.
arXiv Detail & Related papers (2024-05-20T08:41:15Z)
- In-Context Language Learning: Architectures and Algorithms [73.93205821154605]
We study ICL through the lens of a new family of model problems we term in-context language learning (ICLL).
We evaluate a diverse set of neural sequence models on regular ICLL tasks.
arXiv Detail & Related papers (2024-01-23T18:59:21Z)
- Italian Crossword Generator: Enhancing Education through Interactive Word Puzzles [9.84767617576152]
We develop a comprehensive system for generating and verifying crossword clues.
A dataset of clue-answer pairs was compiled to fine-tune the models.
For generating crossword clues from a given text, Zero/Few-shot learning techniques were used.
arXiv Detail & Related papers (2023-11-27T11:17:29Z)
- NERetrieve: Dataset for Next Generation Named Entity Recognition and Retrieval [49.827932299460514]
We argue that capabilities provided by large language models are not the end of NER research, but rather an exciting beginning.
We present three variants of the NER task, together with a dataset to support them.
We provide a large, silver-annotated corpus of 4 million paragraphs covering 500 entity types.
arXiv Detail & Related papers (2023-10-22T12:23:00Z)
- Can Linguistic Knowledge Improve Multimodal Alignment in Vision-Language Pretraining? [34.609984453754656]
We aim to elucidate the impact of comprehensive linguistic knowledge, including semantic expression and syntactic structure, on multimodal alignment.
Specifically, we design and release the SNARE, the first large-scale multimodal alignment probing benchmark.
arXiv Detail & Related papers (2023-08-24T16:17:40Z)
- Large Language Models are Fixated by Red Herrings: Exploring Creative Problem Solving and Einstellung Effect using the Only Connect Wall Dataset [4.789429120223149]
The quest for human imitative AI has been an enduring topic in AI research since its inception.
Creative problem solving in humans is a well-studied topic in cognitive neuroscience.
The Only Connect Wall segment essentially mimics Mednick's Remote Associates Test (RAT) formulation with built-in, deliberate red herrings.
arXiv Detail & Related papers (2023-06-19T21:14:57Z)
- Pushing the Limits of ChatGPT on NLP Tasks [79.17291002710517]
Despite the success of ChatGPT, its performances on most NLP tasks are still well below the supervised baselines.
In this work, we looked into the causes, and discovered that its subpar performance was caused by the following factors.
We propose a collection of general modules to address these issues, in an attempt to push the limits of ChatGPT on NLP tasks.
arXiv Detail & Related papers (2023-06-16T09:40:05Z)
- Language-Driven Representation Learning for Robotics [115.93273609767145]
Recent work in visual representation learning for robotics demonstrates the viability of learning from large video datasets of humans performing everyday tasks.
We introduce a framework for language-driven representation learning from human videos and captions.
We find that Voltron's language-driven learning outperforms the prior state of the art, especially on targeted problems requiring higher-level control.
arXiv Detail & Related papers (2023-02-24T17:29:31Z)
- Word Sense Induction with Hierarchical Clustering and Mutual Information Maximization [14.997937028599255]
Word sense induction is a difficult problem in natural language processing.
We propose a novel unsupervised method based on hierarchical clustering and invariant information clustering.
We empirically demonstrate that, in certain cases, our approach outperforms prior WSI state-of-the-art methods.
arXiv Detail & Related papers (2022-10-11T13:04:06Z)
- Modeling Target-Side Morphology in Neural Machine Translation: A Comparison of Strategies [72.56158036639707]
Morphologically rich languages pose difficulties to machine translation.
A large number of differently inflected word surface forms entails a larger vocabulary.
Some inflected forms of infrequent terms typically do not appear in the training corpus.
Linguistic agreement requires the system to correctly match the grammatical categories between inflected word forms in the output sentence.
arXiv Detail & Related papers (2022-03-25T10:13:20Z)