A Reasoning-Based Approach to Cryptic Crossword Clue Solving
- URL: http://arxiv.org/abs/2506.04824v1
- Date: Thu, 05 Jun 2025 09:43:28 GMT
- Title: A Reasoning-Based Approach to Cryptic Crossword Clue Solving
- Authors: Martin Andrews, Sam Witteveen
- Abstract summary: This work describes an LLM-based reasoning system built from open-licensed components. It solves cryptic clues by (i) hypothesising answers; (ii) proposing wordplay explanations; and (iii) using a verifier system that operates on codified reasoning steps. Overall, this system establishes a new state-of-the-art performance on the challenging Cryptonite dataset of clues from The Times and The Telegraph newspapers in the UK.
- Score: 0.18416014644193066
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Cryptic crossword clues are challenging language tasks for which new test sets are released daily by major newspapers on a global basis. Each cryptic clue contains both the definition of the answer to be placed in the crossword grid (in common with regular crosswords), and 'wordplay' that proves that the answer is correct (i.e. a human solver can be confident that an answer is correct without needing crossing words as confirmation). This work describes an LLM-based reasoning system built from open-licensed components that solves cryptic clues by (i) hypothesising answers; (ii) proposing wordplay explanations; and (iii) using a verifier system that operates on codified reasoning steps. Overall, this system establishes a new state-of-the-art performance on the challenging Cryptonite dataset of clues from The Times and The Telegraph newspapers in the UK. Because each proved solution is expressed in Python, interpretable wordplay reasoning for proven answers is available for inspection.
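The idea of a verifier that operates on codified reasoning steps can be illustrated with a minimal sketch. This is not the authors' code: the clue, the function names (`is_anagram`, `verify_anagram_wordplay`) and the restriction to anagram wordplay are all assumptions made for illustration. It shows only how a hypothesised answer and a proposed wordplay explanation might be checked mechanically in Python.

```python
# Hypothetical sketch: checking a proposed anagram wordplay for a cryptic clue.
# Made-up clue: "Quiet, listen badly (6)" with hypothesised answer SILENT.
# Assumed reading: definition = "Quiet", anagram indicator = "badly", fodder = "listen".

def letters(text: str) -> list[str]:
    """Return the sorted upper-case letters of text, ignoring spaces and punctuation."""
    return sorted(c for c in text.upper() if c.isalpha())

def is_anagram(fodder: str, answer: str) -> bool:
    """True if answer uses exactly the letters of fodder."""
    return letters(fodder) == letters(answer)

def verify_anagram_wordplay(answer: str, fodder: str, length: int) -> bool:
    """Check the codified steps: answer length matches the clue, and the wordplay works.

    The definition check (does "Quiet" mean the answer?) needs a dictionary or a
    model, so it is deliberately left out of this sketch.
    """
    return len(letters(answer)) == length and is_anagram(fodder, answer)

if __name__ == "__main__":
    print(verify_anagram_wordplay("SILENT", "listen", 6))
    print(verify_anagram_wordplay("SILENCE", "listen", 6))
```

A wordplay check of this kind accepts every anagram of the fodder (TINSEL would also pass), so a real system still has to confirm that the answer matches the clue's definition.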
Related papers
- Logic-of-Thought: Empowering Large Language Models with Logic Programs for Solving Puzzles in Natural Language [67.51318974970985]
Solving puzzles in natural language poses a long-standing challenge in AI. We propose Logic-of-Thought, a framework that bridges large language models with logic programming. We evaluate our method on various grid puzzles and dynamic puzzles involving actions, demonstrating near-perfect accuracy across all tasks.
arXiv Detail & Related papers (2025-05-22T01:37:40Z) - Proving that Cryptic Crossword Clue Answers are Correct [0.18416014644193066]
We show that it is possible to distinguish between correct answers and almost-correct ones based upon whether the wordplay 'works'.
arXiv Detail & Related papers (2024-07-11T19:13:16Z) - Language Models are Crossword Solvers [1.53744306569115]
We tackle the challenge of solving crosswords with large language models (LLMs). We demonstrate that the current generation of language models shows significant competence at deciphering cryptic crossword clues. We also develop a search algorithm that builds off this performance to tackle the problem of solving full crossword grids with out-of-the-box LLMs.
arXiv Detail & Related papers (2024-06-13T12:29:27Z) - Language Models can be Logical Solvers [99.40649402395725]
We introduce LoGiPT, a novel language model that directly emulates the reasoning processes of logical solvers.
LoGiPT is fine-tuned on a newly constructed instruction-tuning dataset derived from revealing and refining the invisible reasoning process of deductive solvers.
arXiv Detail & Related papers (2023-11-10T16:23:50Z) - Multi-grained Evidence Inference for Multi-choice Reading Comprehension [62.0773160298008]
Multi-choice Machine Reading Comprehension (MRC) is a major and challenging task in which machines answer questions according to provided options.
We propose a novel general-purpose model enhancement which integrates multi-grained evidence comprehensively, named Multi-grained evidence inferencer (Mugen).
Mugen extracts three different granularities of evidence, and integrates evidence with the original passages, achieving significant and consistent performance improvement on four multi-choice MRC benchmarks.
arXiv Detail & Related papers (2023-10-27T11:36:18Z) - Re-Reading Improves Reasoning in Large Language Models [87.46256176508376]
We introduce a simple, yet general and effective prompting method, Re2, to enhance the reasoning capabilities of off-the-shelf Large Language Models (LLMs).
Unlike most thought-eliciting prompting methods, such as Chain-of-Thought (CoT), Re2 shifts the focus to the input by processing questions twice, thereby enhancing the understanding process.
We evaluate Re2 on extensive reasoning benchmarks across 14 datasets, spanning 112 experiments, to validate its effectiveness and generality.
arXiv Detail & Related papers (2023-09-12T14:36:23Z) - Down and Across: Introducing Crossword-Solving as a New NLP Benchmark [11.194615436370507]
We release the specification of a corpus of crossword puzzles collected from the New York Times daily crossword spanning 25 years.
These puzzles include a diverse set of clues: historic, factual, word meaning, synonyms/antonyms, fill-in-the-blank, abbreviations, prefixes/suffixes, wordplay, and cross-lingual.
arXiv Detail & Related papers (2022-05-20T21:16:44Z) - Automated Crossword Solving [38.36920665368784]
Our system improves exact puzzle accuracy from 57% to 82% on crosswords from The New York Times.
Our system also won first place at the top human crossword tournament.
arXiv Detail & Related papers (2022-05-19T16:28:44Z) - Exploiting Reasoning Chains for Multi-hop Science Question Answering [51.86289192292466]
Our framework is capable of performing explainable reasoning without the need of any corpus-specific annotations.
A Chain-aware loss, concerning both local and global chain information, is also designed to enable the generated chains to serve as distant supervision signals.
arXiv Detail & Related papers (2021-09-07T07:22:07Z) - Decrypting Cryptic Crosswords: Semantically Complex Wordplay Puzzles as a Target for NLP [28.479149974110463]
Cryptic crosswords, the dominant crossword variety in the UK, are a promising target for advancing NLP systems. We present a dataset of cryptic clues as a challenging new benchmark for NLP systems. We also introduce a challenging data split, examine the meta-linguistic capabilities of subword-tokenized models, and investigate model systematicity by perturbing the wordplay part of clues.
arXiv Detail & Related papers (2021-04-17T18:54:00Z) - Techniques for Vocabulary Expansion in Hybrid Speech Recognition Systems [54.49880724137688]
The problem of out of vocabulary words (OOV) is typical for any speech recognition system.
One popular approach to covering OOVs is to use subword units rather than words.
In this paper we explore different existing methods of this solution on both graph construction and search method levels.
arXiv Detail & Related papers (2020-03-19T21:24:45Z) - Retrospective Reader for Machine Reading Comprehension [90.6069071495214]
Machine reading comprehension (MRC) is an AI challenge that requires machines to determine the correct answers to questions based on a given passage.
When unanswerable questions are involved in the MRC task, an essential verification module called verifier is especially required in addition to the encoder.
This paper devotes itself to exploring better verifier design for the MRC task with unanswerable questions.
arXiv Detail & Related papers (2020-01-27T11:14:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.