Down and Across: Introducing Crossword-Solving as a New NLP Benchmark
- URL: http://arxiv.org/abs/2205.10442v1
- Date: Fri, 20 May 2022 21:16:44 GMT
- Title: Down and Across: Introducing Crossword-Solving as a New NLP Benchmark
- Authors: Saurabh Kulshreshtha, Olga Kovaleva, Namrata Shivagunde, Anna Rumshisky
- Abstract summary: We release the specification of a corpus of crossword puzzles collected from the New York Times daily crossword spanning 25 years.
These puzzles include a diverse set of clues: historic, factual, word meaning, synonyms/antonyms, fill-in-the-blank, abbreviations, prefixes/suffixes, wordplay, and cross-lingual.
- Score: 11.194615436370507
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Solving crossword puzzles requires diverse reasoning capabilities, access to
a vast amount of knowledge about language and the world, and the ability to
satisfy the constraints imposed by the structure of the puzzle. In this work,
we introduce solving crossword puzzles as a new natural language understanding
task. We release the specification of a corpus of crossword puzzles collected
from the New York Times daily crossword spanning 25 years and comprised of a
total of around nine thousand puzzles. These puzzles include a diverse set of
clues: historic, factual, word meaning, synonyms/antonyms, fill-in-the-blank,
abbreviations, prefixes/suffixes, wordplay, and cross-lingual, as well as clues
that depend on the answers to other clues. We separately release the
clue-answer pairs from these puzzles as an open-domain question answering
dataset containing over half a million unique clue-answer pairs. For the
question answering task, our baselines include several sequence-to-sequence and
retrieval-based generative models. We also introduce a non-parametric
constraint satisfaction baseline for solving the entire crossword puzzle.
Finally, we propose an evaluation framework which consists of several
complementary performance metrics.
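To make the idea of whole-puzzle solving under grid constraints concrete, here is a minimal sketch of constraint satisfaction by backtracking search. It is not the authors' baseline, whose details the abstract does not give. The slot names, the `Crossing` tuple, the `solve` function, and the ranked candidate lists (imagined as the output of some clue-answering model) are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: character i of slot a must equal
# character j of slot b wherever two answers cross in the grid.
Crossing = Tuple[str, int, str, int]


def consistent(assignment: Dict[str, str], crossings: List[Crossing]) -> bool:
    """Check every crossing whose two slots are both filled."""
    for a, i, b, j in crossings:
        if a in assignment and b in assignment:
            if assignment[a][i] != assignment[b][j]:
                return False
    return True


def solve(
    candidates: Dict[str, List[str]],
    crossings: List[Crossing],
    assignment: Optional[Dict[str, str]] = None,
) -> Optional[Dict[str, str]]:
    """Backtracking search over candidate answers, tried in ranked order."""
    assignment = {} if assignment is None else assignment
    unfilled = [s for s in candidates if s not in assignment]
    if not unfilled:
        return dict(assignment)
    # Fill the slot with the fewest candidates first (a common heuristic).
    slot = min(unfilled, key=lambda s: len(candidates[s]))
    for word in candidates[slot]:
        assignment[slot] = word
        if consistent(assignment, crossings):
            result = solve(candidates, crossings, assignment)
            if result is not None:
                return result
        del assignment[slot]  # undo and try the next candidate
    return None


# Toy 2x2 grid: two across slots (1A, 3A) and two down slots (1D, 2D).
toy_candidates = {
    "1A": ["IT", "AT"],
    "3A": ["SO", "NO"],
    "1D": ["AN"],
    "2D": ["TO"],
}
toy_crossings = [
    ("1A", 0, "1D", 0),
    ("1A", 1, "2D", 0),
    ("3A", 0, "1D", 1),
    ("3A", 1, "2D", 1),
]
solution = solve(toy_candidates, toy_crossings)
print(solution)
```

In this toy grid the top-ranked candidates for 1A and 3A conflict with the down answers, so the search backtracks to the second-ranked ones. A real solver would also need to handle slots for which the correct answer is missing from the candidate list.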
Related papers
- Language Models are Crossword Solvers [1.53744306569115]
We tackle the challenge of solving crosswords with Large Language Models (LLMs).
We demonstrate that the current generation of state-of-the-art (SoTA) language models shows significant competence at deciphering cryptic crossword clues.
We also develop a search algorithm that builds off this performance to tackle the problem of solving full crossword grids with LLMs.
arXiv Detail & Related papers (2024-06-13T12:29:27Z)
- Puzzle Pieces Picker: Deciphering Ancient Chinese Characters with Radical Reconstruction [73.26364649572237]
Oracle Bone Inscriptions is one of the oldest existing forms of writing in the world.
A large number of Oracle Bone Inscriptions (OBI) remain undeciphered, making it one of the global challenges in paleography today.
This paper introduces a novel approach, namely Puzzle Pieces Picker (P$^3$), to decipher these enigmatic characters through radical reconstruction.
arXiv Detail & Related papers (2024-06-05T07:34:39Z)
- Clue-Instruct: Text-Based Clue Generation for Educational Crossword Puzzles [10.375451846093327]
We propose a methodology to build educational clue generation datasets that can be used to instruct Large Language Models.
By gathering from Wikipedia pages informative content associated with relevant keywords, we use Large Language Models to automatically generate pedagogical clues.
We used clue-instruct to instruct different LLMs to generate educational clues from a given input content and keyword.
arXiv Detail & Related papers (2024-04-09T10:12:34Z)
- Are Language Models Puzzle Prodigies? Algorithmic Puzzles Unveil Serious Challenges in Multimodal Reasoning [24.386388107656334]
This paper introduces the novel task of multimodal puzzle solving, framed within the context of visual question-answering.
We present a new dataset, AlgoVQA, designed to challenge and evaluate the capabilities of multimodal language models in solving algorithmic puzzles.
arXiv Detail & Related papers (2024-03-06T17:15:04Z)
- Tree of Thoughts: Deliberate Problem Solving with Large Language Models [52.31950122881687]
We introduce a new framework for language model inference, Tree of Thoughts (ToT).
ToT generalizes over the popular Chain of Thought approach to prompting language models.
Our experiments show that ToT significantly enhances language models' problem-solving abilities.
arXiv Detail & Related papers (2023-05-17T23:16:17Z)
- Multi-Phase Relaxation Labeling for Square Jigsaw Puzzle Solving [73.58829980121767]
We present a novel method for solving square jigsaw puzzles based on global optimization.
The method is fully automatic, assumes no prior information, and can handle puzzles with known or unknown piece orientation.
arXiv Detail & Related papers (2023-03-26T18:53:51Z)
- Automated Graph Genetic Algorithm based Puzzle Validation for Faster Game Design [69.02688684221265]
This paper presents an evolutionary algorithm, informed by expert knowledge, for solving logical puzzles in video games efficiently.
We discuss multiple variations of hybrid genetic approaches for constraint satisfaction problems that allow us to find a diverse set of near-optimal solutions for puzzles.
arXiv Detail & Related papers (2023-02-17T18:15:33Z)
- Are Deep Neural Networks SMARTer than Second Graders? [85.60342335636341]
We evaluate the abstraction, deduction, and generalization abilities of neural networks in solving visuo-linguistic puzzles designed for children in the 6--8 age group.
Our dataset consists of 101 unique puzzles; each puzzle comprises a picture question, and its solution needs a mix of several elementary skills, including arithmetic, algebra, and spatial reasoning.
Experiments reveal that while powerful deep models offer reasonable performances on puzzles in a supervised setting, they are not better than random accuracy when analyzed for generalization.
arXiv Detail & Related papers (2022-12-20T04:33:32Z)
- Automated Crossword Solving [38.36920665368784]
Our system improves exact puzzle accuracy from 57% to 82% on crosswords from The New York Times.
Our system also won first place at the top human crossword tournament.
arXiv Detail & Related papers (2022-05-19T16:28:44Z)
- Pictorial and apictorial polygonal jigsaw puzzles: The lazy caterer model, properties, and solvers [14.08706290287121]
We formalize a new type of jigsaw puzzle where the pieces are general convex polygons generated by cutting through a global polygonal shape/image with an arbitrary number of straight cuts.
We analyze the theoretical properties of such puzzles, including the inherent challenges in solving them once pieces are contaminated with geometrical noise.
arXiv Detail & Related papers (2020-08-17T22:07:40Z)
- PuzzLing Machines: A Challenge on Learning From Small Data [64.513459448362]
We introduce a challenge on learning from small data, PuzzLing Machines, which consists of Rosetta Stone puzzles from Linguistic Olympiads for high school students.
Our challenge contains around 100 puzzles covering a wide range of linguistic phenomena from 81 languages.
We show that both simple statistical algorithms and state-of-the-art deep neural models perform inadequately on this challenge, as expected.
arXiv Detail & Related papers (2020-04-27T20:34:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.