Proving that Cryptic Crossword Clue Answers are Correct
- URL: http://arxiv.org/abs/2407.08824v1
- Date: Thu, 11 Jul 2024 19:13:16 GMT
- Title: Proving that Cryptic Crossword Clue Answers are Correct
- Authors: Martin Andrews, Sam Witteveen
- Abstract summary: We show that it is possible to distinguish between correct answers and almost-correct ones based upon whether the wordplay 'works'.
- Score: 0.18416014644193066
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cryptic crossword clues are challenging cognitive tasks, for which new test sets are released on a daily basis by multiple international newspapers. Each cryptic clue contains both the definition of the answer to be placed in the crossword grid (in common with regular crosswords), and 'wordplay' that proves that the answer is correct (i.e. a human solver can be confident that an answer is correct without needing crossing words to confirm it). Using an existing cryptic wordplay proving framework (operating on Python proofs created by an LLM), we show that it is possible to distinguish between correct answers and almost-correct ones based upon whether the wordplay 'works'.
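To make the idea of a Python "proof" concrete, here is a minimal sketch of how assertion-based checking could separate a correct answer from an almost-correct one. It is not the framework used in the paper: the clue is invented for illustration, and the helper names (`is_synonym`, `is_anagram`, `is_anagram_indicator`, `proof_works`) and the tiny lookup tables are assumptions standing in for real dictionary and thesaurus resources.

```python
from collections import Counter

# Hypothetical toy lexicon; a real system would query a thesaurus or dictionary.
SYNONYMS = {("quiet", "SILENT"), ("quiet", "HUSHED")}
ANAGRAM_INDICATORS = {"broken", "mixed", "confused"}


def is_synonym(phrase: str, word: str) -> bool:
    """True if the toy lexicon lists `word` as a synonym of `phrase`."""
    return (phrase.lower(), word.upper()) in SYNONYMS


def is_anagram_indicator(phrase: str) -> bool:
    """True if `phrase` is a known anagram indicator in the toy list."""
    return phrase.lower() in ANAGRAM_INDICATORS


def is_anagram(letters: str, word: str) -> bool:
    """True if `letters` and `word` use exactly the same letters."""
    def count(s: str) -> Counter:
        return Counter(c for c in s.upper() if c.isalpha())
    return count(letters) == count(word)


def clue_proof(answer: str) -> None:
    """Invented clue: 'Quiet, broken listen (6)'.

    Definition: 'Quiet'. Wordplay: anagram ('broken') of 'listen'.
    Every step is an assertion, so a flawed answer raises AssertionError.
    """
    assert len(answer) == 6
    assert is_synonym("quiet", answer)      # definition matches
    assert is_anagram_indicator("broken")   # indicator is valid
    assert is_anagram("listen", answer)     # wordplay yields the answer


def proof_works(proof, answer: str) -> bool:
    """Run a proof and report whether all of its assertions hold."""
    try:
        proof(answer)
    except AssertionError:
        return False
    return True


for candidate in ("SILENT", "HUSHED", "ENLIST"):
    print(candidate, proof_works(clue_proof, candidate))
```

In this sketch, HUSHED fits the definition and the letter count but fails the anagram step, while ENLIST satisfies the anagram but not the definition; only SILENT passes every assertion. Note that assertions are skipped when Python runs with the `-O` flag, so a sketch like this must be run without optimisation.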
Related papers
- Language Models are Crossword Solvers [1.53744306569115]
We tackle the challenge of solving crosswords with Large Language Models (LLMs).
We demonstrate that the current generation of state-of-the-art (SoTA) language models show significant competence at deciphering cryptic crossword clues.
We also develop a search algorithm that builds off this performance to tackle the problem of solving full crossword grids with LLMs.
arXiv Detail & Related papers (2024-06-13T12:29:27Z)
- Provably Secure Disambiguating Neural Linguistic Steganography [66.30965740387047]
The segmentation ambiguity problem, which arises when using language models based on subwords, leads to occasional decoding failures.
We propose a novel secure disambiguation method named SyncPool, which effectively addresses the segmentation ambiguity problem.
SyncPool does not change the size of the candidate pool or the distribution of tokens and thus is applicable to provably secure language steganography methods.
arXiv Detail & Related papers (2024-03-26T09:25:57Z)
- Deceptive Semantic Shortcuts on Reasoning Chains: How Far Can Models Go without Hallucination? [73.454943870226]
This work studies a specific type of hallucination induced by semantic associations.
To quantify this phenomenon, we propose a novel probing method and benchmark called EureQA.
arXiv Detail & Related papers (2023-11-16T09:27:36Z)
- Multi-grained Evidence Inference for Multi-choice Reading Comprehension [62.0773160298008]
Multi-choice Machine Reading Comprehension (MRC) is a major and challenging task for machines to answer questions according to provided options.
We propose a novel general-purpose model enhancement which integrates multi-grained evidence comprehensively, named Multi-grained evidence inferencer (Mugen).
Mugen extracts three different granularities of evidence, and integrates evidence with the original passages, achieving significant and consistent performance improvement on four multi-choice MRC benchmarks.
arXiv Detail & Related papers (2023-10-27T11:36:18Z)
- Down and Across: Introducing Crossword-Solving as a New NLP Benchmark [11.194615436370507]
We release the specification of a corpus of crossword puzzles collected from the New York Times daily crossword spanning 25 years.
These puzzles include a diverse set of clues: historic, factual, word meaning, synonyms/antonyms, fill-in-the-blank, abbreviations, prefixes/suffixes, wordplay, and cross-lingual.
arXiv Detail & Related papers (2022-05-20T21:16:44Z)
- Automated Crossword Solving [38.36920665368784]
Our system improves exact puzzle accuracy from 57% to 82% on crosswords from The New York Times.
Our system also won first place at the top human crossword tournament.
arXiv Detail & Related papers (2022-05-19T16:28:44Z)
- DialFact: A Benchmark for Fact-Checking in Dialogue [56.63709206232572]
We construct DialFact, a benchmark dataset of 22,245 annotated conversational claims, paired with pieces of evidence from Wikipedia.
We find that existing fact-checking models trained on non-dialogue data like FEVER fail to perform well on our task.
We propose a simple yet data-efficient solution to effectively improve fact-checking performance in dialogue.
arXiv Detail & Related papers (2021-10-15T17:34:35Z)
- Spell my name: keyword boosted speech recognition [25.931897154065663]
Uncommon words such as names and technical terminology are important to understanding conversations in context.
We propose a simple but powerful ASR decoding method that can better recognise these uncommon keywords.
The method boosts the probabilities of given keywords in a beam search based on acoustic model predictions.
We demonstrate the effectiveness of our method on the LibriSpeech test sets and also internal data of real-world conversations.
arXiv Detail & Related papers (2021-10-06T14:16:57Z)
- Decrypting Cryptic Crosswords: Semantically Complex Wordplay Puzzles as a Target for NLP [5.447716844779342]
Cryptic crosswords are the dominant English-language crossword variety in the United Kingdom.
We present a dataset of cryptic crossword clues that can be used as a benchmark and train a sequence-to-sequence model to solve them.
We show that performance can be substantially improved using a novel curriculum learning approach.
arXiv Detail & Related papers (2021-04-17T18:54:00Z)
- Crossing Variational Autoencoders for Answer Retrieval [50.17311961755684]
Question-answer alignment and question/answer semantics are two important signals for learning the representations.
We propose to cross variational auto-encoders by generating questions with aligned answers and generating answers with aligned questions.
arXiv Detail & Related papers (2020-05-06T01:59:13Z)
- Techniques for Vocabulary Expansion in Hybrid Speech Recognition Systems [54.49880724137688]
The problem of out-of-vocabulary (OOV) words is typical for any speech recognition system.
One popular approach to covering OOVs is to use subword units rather than words.
In this paper, we explore different existing methods for this solution at both the graph construction and search method levels.
arXiv Detail & Related papers (2020-03-19T21:24:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.