CubeTR: Learning to Solve The Rubik's Cube Using Transformers
- URL: http://arxiv.org/abs/2111.06036v2
- Date: Sun, 29 Oct 2023 06:11:29 GMT
- Title: CubeTR: Learning to Solve The Rubik's Cube Using Transformers
- Authors: Mustafa Ebrahim Chasmai
- Abstract summary: The Rubik's Cube has a single solved state among quintillions of possible configurations, which leads to extremely sparse rewards.
The proposed model CubeTR attends to longer sequences of actions and addresses the problem of sparse rewards.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Since their first appearance, transformers have been successfully used in
wide-ranging domains, from computer vision to natural language processing.
The application of transformers to reinforcement learning, by reformulating it
as a sequence modelling problem, was proposed only recently. Compared to other
commonly explored reinforcement learning problems, the Rubik's Cube poses a
unique set of challenges: it has a single solved state among quintillions of
possible configurations, which leads to extremely sparse rewards. The proposed
model, CubeTR, attends to longer sequences of actions and addresses the problem
of sparse rewards. CubeTR learns to solve the Rubik's Cube from arbitrary
starting states without any human prior, and after move regularisation, the
lengths of the solutions it generates are expected to be very close to those
given by the algorithms used by expert human solvers. CubeTR provides insights
into the generalisability of learning algorithms to higher dimensional cubes
and the applicability of transformers in other sparse reward scenarios.
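The paper itself ships no code here, but the RL-as-sequence-modelling idea can be made concrete with a short sketch. The architecture below is an illustrative assumption in the spirit of decision-transformer-style models, not CubeTR's actual design: states are encoded as 54 sticker colours and a causal transformer over the state history emits logits for the next of 12 face turns.

```python
# Illustrative sketch only (not the paper's architecture): a causal
# transformer that reads a history of cube states and predicts the next move.
# Encoding assumptions: 54 stickers with 6 colours, 12 face-turn moves.
import torch
import torch.nn as nn

NUM_STICKERS, NUM_COLOURS, NUM_MOVES = 54, 6, 12

class CubeSequencePolicy(nn.Module):
    def __init__(self, d_model=128, n_heads=4, n_layers=4, max_len=64):
        super().__init__()
        self.state_embed = nn.Linear(NUM_STICKERS * NUM_COLOURS, d_model)
        self.pos_embed = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.action_head = nn.Linear(d_model, NUM_MOVES)

    def forward(self, states):  # states: (B, T, 54) long tensor of colour ids
        x = nn.functional.one_hot(states, NUM_COLOURS).float().flatten(2)
        x = self.state_embed(x)                   # (B, T, d_model)
        x = x + self.pos_embed(torch.arange(states.size(1), device=states.device))
        T = states.size(1)                        # causal mask: attend to past only
        mask = torch.triu(torch.full((T, T), float('-inf'), device=states.device), 1)
        return self.action_head(self.encoder(x, mask=mask))  # (B, T, 12) logits
```

Training such a policy with cross-entropy against solver-generated moves, plus a penalty on solution length, would play the role the abstract assigns to move regularisation; that correspondence is an assumption of this sketch.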
Related papers
- Node Classification and Search on the Rubik's Cube Graph with GNNs [55.2480439325792]
This study focuses on the application of deep geometric models to solve the 3x3x3 Rubik's Cube.
We begin by discussing the cube's graph representation and defining distance as the model's optimization objective.
The distance approximation task is reformulated as a node classification problem, effectively addressed using Graph Neural Networks (GNNs).
arXiv Detail & Related papers (2025-01-30T18:52:43Z)
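A minimal sketch of the distance-as-node-classification idea follows. The one-hot sticker features, the single round of neighbour averaging, and the class count are assumptions for illustration, not the paper's model.

```python
# Sketch: a cube state is a node, its 12 one-move neighbours supply messages,
# and the head classifies the node's distance-to-solved. Feature encoding
# (54 stickers x 6 colours) and layer sizes are illustrative assumptions.
import torch
import torch.nn as nn

N_CLASSES = 27  # distance classes 0..26; 26 is God's number in the quarter-turn metric

class NeighbourMeanGNN(nn.Module):
    def __init__(self, in_dim=54 * 6, hidden=256):
        super().__init__()
        self.msg = nn.Linear(in_dim, hidden)
        self.update = nn.Linear(in_dim + hidden, hidden)
        self.head = nn.Linear(hidden, N_CLASSES)

    def forward(self, node, neighbours):
        # node: (B, 324); neighbours: (B, 12, 324) one-hot sticker features
        m = self.msg(neighbours).relu().mean(dim=1)            # aggregate messages
        h = self.update(torch.cat([node, m], dim=-1)).relu()   # combine with self state
        return self.head(h)                                    # distance-class logits
```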
- Solving Rubik's Cube Without Tricky Sampling [0.6445605125467574]
The Rubik's Cube, with its vast state space and sparse reward structure, presents a significant challenge for reinforcement learning.
Previous research addressed this by propagating cost-to-go estimates from the solved state and incorporating search techniques.
We introduce a novel RL algorithm using policy gradient methods to solve the Rubik's Cube without relying on near-solved-state sampling.
arXiv Detail & Related papers (2024-11-29T09:56:40Z)
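A rough sketch of the policy-gradient recipe, with no near-solved-state sampling: episodes start from full scrambles and the only reward is reaching the solved state. The `env` interface (`reset(depth)`, `step(move)`) is a hypothetical stand-in, not the paper's code.

```python
# REINFORCE-style update on a single episode; sparse terminal reward only.
import torch

def reinforce_episode(policy, env, optimiser, scramble_depth=20, max_moves=40):
    state, log_probs, solved = env.reset(scramble_depth), [], False
    for _ in range(max_moves):
        dist = torch.distributions.Categorical(logits=policy(state))
        move = dist.sample()
        log_probs.append(dist.log_prob(move))
        state, solved = env.step(move.item())   # hypothetical env interface
        if solved:
            break
    reward = 1.0 if solved else 0.0             # 1 only when the cube is solved
    loss = -reward * torch.stack(log_probs).sum()
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
    return solved
```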
- Solving a Rubik's Cube Using its Local Graph Structure [13.219469732742354]
A Rubik's Cube has six faces and twelve possible actions, leading to a small and unconstrained action space.
A Rubik's Cube can be represented as a graph, where the states of the cube are nodes and actions are edges.
Drawing on graph convolutional networks, we design a new search algorithm to find the solution to a scrambled Rubik's Cube.
arXiv Detail & Related papers (2024-08-15T05:39:52Z)
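The graph view suggests search sketches like the one below: a best-first search over the implicit state graph, expanding the 12 one-move neighbours of each node and ordering the frontier by a learned heuristic (e.g. a graph-network distance estimate). `apply_move`, `heuristic`, and `is_solved` are assumed callables, not the paper's implementation.

```python
# Best-first search on the cube's implicit state graph (nodes = states,
# edges = the 12 moves), guided by a plug-in heuristic.
import heapq
from itertools import count

def best_first_search(start, apply_move, heuristic, is_solved, max_nodes=100_000):
    tie = count()                       # tiebreaker so states are never compared
    frontier = [(heuristic(start), next(tie), start, [])]
    seen = {start}
    while frontier and len(seen) < max_nodes:
        _, _, state, path = heapq.heappop(frontier)
        if is_solved(state):
            return path                 # the move sequence that solves the cube
        for move in range(12):          # expand all one-move neighbours
            nxt = apply_move(state, move)
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (heuristic(nxt), next(tie), nxt, path + [move]))
    return None                         # no solution within the node budget
```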
- Ring Attention with Blockwise Transformers for Near-Infinite Context [88.61687950039662]
We present a novel approach, Ring Attention with Blockwise Transformers (Ring Attention), which leverages blockwise computation of self-attention and feedforward to distribute long sequences across multiple devices.
Our approach enables training and inference of sequences that are up to device-count times longer than those achievable by prior memory-efficient Transformers.
arXiv Detail & Related papers (2023-10-03T08:44:50Z)
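The blockwise half of the idea can be sketched on one host: keys and values are consumed one block at a time with a running (online) softmax, so the full attention matrix is never materialised. The ring-passing of key/value blocks between devices is omitted; this is an illustrative reconstruction, not the paper's implementation.

```python
# Streaming softmax over key/value blocks: memory is O(block) in sequence
# length rather than O(T x S) for the full score matrix.
import torch

def blockwise_attention(q, k, v, block=1024):
    scale = q.shape[-1] ** -0.5
    m = torch.full((q.shape[0], 1), float('-inf'))   # running row max
    denom = torch.zeros(q.shape[0], 1)               # running softmax denominator
    out = torch.zeros_like(q)                        # running weighted values
    for s in range(0, k.shape[0], block):
        scores = (q @ k[s:s+block].T) * scale        # (T, block) partial scores
        m_new = torch.maximum(m, scores.max(-1, keepdim=True).values)
        p = torch.exp(scores - m_new)
        correction = torch.exp(m - m_new)            # rescale old accumulators
        denom = denom * correction + p.sum(-1, keepdim=True)
        out = out * correction + p @ v[s:s+block]
        m = m_new
    return out / denom
```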
- The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks [59.26515696183751]
We show that algorithm discovery in neural networks is sometimes more complex than previously thought.
We show that even simple learning problems can admit a surprising diversity of solutions.
arXiv Detail & Related papers (2023-06-30T17:59:13Z)
- Towards Learning Rubik's Cube with N-tuple-based Reinforcement Learning [0.0]
This work describes in detail how to learn and solve the Rubik's cube game (or puzzle) in the General Board Game (GBG) learning and playing framework.
We describe the cube's state representation, how to transform it with twists, whole-cube rotations, and color transformations, and explain the use of symmetries in the Rubik's Cube.
arXiv Detail & Related papers (2023-01-28T11:38:10Z)
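The representation details lend themselves to a tiny sketch: a state is a tuple of 54 sticker colours, and twists, whole-cube rotations, and colour transformations are all plain permutations or relabellings of that tuple. The helpers below are illustrative; no verified move tables are included.

```python
# Sticker-permutation view of the cube. A real move table (e.g. for U) would
# be a fixed 54-entry permutation; none is hard-coded here to avoid errors.
SOLVED = tuple(i // 9 for i in range(54))   # 6 faces x 9 stickers, colours 0..5

def apply_permutation(state, perm):
    """The sticker at position perm[i] moves to position i."""
    return tuple(state[perm[i]] for i in range(54))

def recolour(state, colour_map):
    """A colour transformation: relabel colours without moving stickers."""
    return tuple(colour_map[c] for c in state)
```

Because twists, rotations, and recolourings compose as functions on the same tuple, symmetry reduction amounts to canonicalising a state over these transformations.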
- Transformers Learn Shortcuts to Automata [52.015990420075944]
We find that a low-depth Transformer can represent the computations of any finite-state automaton.
We show that a Transformer with $O(\log T)$ layers can exactly replicate the computation of an automaton on an input sequence of length $T$.
We further investigate the brittleness of these solutions and propose potential mitigations.
arXiv Detail & Related papers (2022-10-19T17:45:48Z)
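The intuition behind the $O(\log T)$ depth bound is that running an automaton is a prefix composition of per-symbol transition functions, and composition is associative, so it parallelises like a prefix scan. A sketch with transition matrices (a reconstruction of the idea, not the paper's construction):

```python
# Hillis-Steele prefix scan over transition matrices: O(log T) sequential
# rounds recover the automaton state after every prefix of the input.
import numpy as np

def automaton_prefix_states(transitions, start):
    """transitions[t]: (n, n) 0/1 matrix for the t-th input symbol, acting on
    state indicator vectors; start: (n,) indicator of the initial state."""
    prefixes = list(transitions)
    stride = 1
    while stride < len(prefixes):          # log2(T) doubling rounds
        nxt = list(prefixes)
        for i in range(stride, len(prefixes)):
            nxt[i] = prefixes[i] @ prefixes[i - stride]   # associative compose
        prefixes, stride = nxt, stride * 2
    return [p @ start for p in prefixes]   # state after each prefix
```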
- Video Anomaly Detection by Solving Decoupled Spatio-Temporal Jigsaw Puzzles [67.39567701983357]
Video Anomaly Detection (VAD) is an important topic in computer vision.
Motivated by the recent advances in self-supervised learning, this paper addresses VAD by solving an intuitive yet challenging pretext task.
Our method outperforms state-of-the-art counterparts on three public benchmarks.
arXiv Detail & Related papers (2022-07-20T19:49:32Z)
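The pretext task itself is easy to sketch: shuffle either the temporal order of frames or the spatial patch grid (decoupled, never both at once) and ask the network to classify the applied permutation. The sampling routine below is an illustrative assumption, not the paper's pipeline.

```python
# Decoupled spatio-temporal jigsaw: one sample perturbs time OR space.
import numpy as np

def jigsaw_pretext(clip, rng, grid=3):
    """clip: (T, H, W, C) video array; returns (shuffled, perm, kind)."""
    if rng.random() < 0.5:                        # temporal jigsaw: shuffle frames
        perm = rng.permutation(clip.shape[0])
        return clip[perm], perm, 'time'
    t, h, w, c = clip.shape                       # spatial jigsaw: shuffle patches
    hp, wp = h // grid, w // grid
    patches = (clip.reshape(t, grid, hp, grid, wp, c)
                   .transpose(0, 1, 3, 2, 4, 5)
                   .reshape(t, grid * grid, hp, wp, c))
    perm = rng.permutation(grid * grid)
    patches = patches[:, perm]
    out = (patches.reshape(t, grid, grid, hp, wp, c)
                  .transpose(0, 1, 3, 2, 4, 5)
                  .reshape(t, h, w, c))
    return out, perm, 'space'
```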
- Benchmarking Robot Manipulation with the Rubik's Cube [15.922643222904172]
We propose Rubik's cube manipulation as a benchmark to measure simultaneous performance of precise manipulation and sequential manipulation.
We present a protocol for quantitatively measuring both the accuracy and speed of Rubik's cube manipulation.
We demonstrate this protocol for two distinct baseline approaches on a PR2 robot.
arXiv Detail & Related papers (2022-02-14T22:34:18Z)
- Self-Supervision is All You Need for Solving Rubik's Cube [0.0]
This work introduces a simple and efficient deep learning method for solving problems with a predefined goal, represented by Rubik's Cube.
We demonstrate that, for such problems, training a deep neural network on random scrambles branching from the goal state is sufficient to achieve near-optimal solutions.
arXiv Detail & Related papers (2021-06-06T15:38:50Z)
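The data recipe is simple enough to sketch: scramble outward from the solved state and label each visited state with the move that undoes the last twist, so the network learns to step back toward the goal. `apply_move` and the move-inverse pairing are assumed helpers, not the paper's code.

```python
# Self-supervised pairs from random scrambles branching off the goal state.
import random

INVERSE = {m: m ^ 1 for m in range(12)}   # assumption: moves paired as (U, U'), ...

def make_training_pairs(solved, apply_move, n_scrambles, depth=20, n_moves=12):
    pairs = []                            # (state, target-move) supervision
    for _ in range(n_scrambles):
        state = solved
        for _ in range(depth):
            move = random.randrange(n_moves)
            state = apply_move(state, move)
            pairs.append((state, INVERSE[move]))   # undoing the last twist
    return pairs
```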
- A Practical Method for Constructing Equivariant Multilayer Perceptrons for Arbitrary Matrix Groups [115.58550697886987]
We provide a completely general algorithm for solving for the equivariant layers of matrix groups.
In addition to recovering solutions from other works as special cases, we construct multilayer perceptrons equivariant to multiple groups that have never been tackled before.
Our approach outperforms non-equivariant baselines, with applications to particle physics and dynamical systems.
arXiv Detail & Related papers (2021-04-19T17:21:54Z)
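The core constraint-solving step admits a compact sketch: a linear map $W$ is equivariant when $\rho_{out}(g) W = W \rho_{in}(g)$ for the group's generators, which is a linear system on $\mathrm{vec}(W)$ whose null space spans all equivariant layers. The toy $\mathbb{Z}_2$ example and numerical tolerance are illustrative assumptions, not the paper's implementation.

```python
# Solve rho_out(g) W = W rho_in(g) for all generators g via a null space.
import numpy as np

def equivariant_basis(gens_in, gens_out):
    """gens_in: list of (n, n) input reps; gens_out: list of (m, m) output
    reps. Returns a basis of equivariant (m, n) weight matrices."""
    m, n = gens_out[0].shape[0], gens_in[0].shape[0]
    # Column-major vec: vec(A X B) = kron(B.T, A) vec(X), so the constraint
    # stacks into one matrix whose null space is the equivariant subspace.
    C = np.vstack([np.kron(np.eye(n), g_out) - np.kron(g_in.T, np.eye(m))
                   for g_in, g_out in zip(gens_in, gens_out)])
    _, s, vh = np.linalg.svd(C)
    rank = int(np.sum(s > 1e-10))
    return [row.reshape(m, n, order='F') for row in vh[rank:]]

# Toy usage: Z_2 acting by sign flip on both sides leaves every W equivariant.
basis = equivariant_basis([-np.eye(2)], [-np.eye(2)])   # 4 basis matrices
```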