Towards Learning Rubik's Cube with N-tuple-based Reinforcement Learning
- URL: http://arxiv.org/abs/2301.12167v1
- Date: Sat, 28 Jan 2023 11:38:10 GMT
- Title: Towards Learning Rubik's Cube with N-tuple-based Reinforcement Learning
- Authors: Wolfgang Konen
- Abstract summary: This work describes in detail how to learn and solve the Rubik's cube game (or puzzle) in the General Board Game (GBG) learning and playing framework.
We describe the cube's state representation, how to transform it with twists, whole-cube rotations and color transformations, and explain the use of symmetries in Rubik's cube.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: This work describes in detail how to learn and solve the Rubik's cube game
(or puzzle) in the General Board Game (GBG) learning and playing framework. We
cover the cube sizes 2x2x2 and 3x3x3. We describe in detail the cube's state
representation, how to transform it with twists, whole-cube rotations and color
transformations and explain the use of symmetries in Rubik's cube. Next, we
discuss different n-tuple representations for the cube, how we train the agents
by reinforcement learning and how we improve the trained agents during
evaluation by MCTS wrapping. We present results for agents that learn Rubik's
cube from scratch, with and without MCTS wrapping, with and without symmetries
and show that both MCTS wrapping and symmetries increase computational costs,
but lead at the same time to much better results. We can solve the 2x2x2 cube
completely, and the 3x3x3 cube in the majority of cases for cubes scrambled with
up to p = 15 twists (quarter-turn metric, QTM). We cannot yet reliably solve 3x3x3 cubes with more than 15
scrambling twists. Although our computational costs are higher with MCTS
wrapping and with symmetries than without, they are still considerably lower
than in the approaches of McAleer et al. (2018, 2019) and Agostinelli et al.
(2019) who provide the best Rubik's cube learning agents so far.
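As a rough illustration of the n-tuple approach described in the abstract (not the actual GBG implementation): an n-tuple is a fixed set of cube-cell indices, the observed color combination at those cells indexes a lookup table (LUT), the state value is the sum of the selected entries over all n-tuples, and reinforcement learning nudges the active entries toward a TD target. The class name, toy state encoding, and parameters below are hypothetical.

```python
# Illustrative n-tuple value function with a TD(0)-style update.
# A state is a flat sequence of cell values (e.g. sticker colors as ints).
class NTupleValue:
    def __init__(self, n_tuples, num_values):
        self.n_tuples = n_tuples        # e.g. [(0, 3, 7), (1, 2, 5)]
        self.num_values = num_values    # number of distinct cell values (colors)
        # one sparse LUT per n-tuple (dict instead of a dense array)
        self.luts = [dict() for _ in n_tuples]

    def _index(self, state, cells):
        # encode the color combination at the tuple's cells as a base-k number
        idx = 0
        for c in cells:
            idx = idx * self.num_values + state[c]
        return idx

    def value(self, state):
        # state value = sum of the active LUT entries over all n-tuples
        return sum(lut.get(self._index(state, cells), 0.0)
                   for lut, cells in zip(self.luts, self.n_tuples))

    def td_update(self, state, target, alpha=0.01):
        # distribute the TD error equally over the active LUT entries
        delta = target - self.value(state)
        for lut, cells in zip(self.luts, self.n_tuples):
            i = self._index(state, cells)
            lut[i] = lut.get(i, 0.0) + alpha * delta
```

Because each n-tuple only looks at a few cells, the LUTs stay small even for large state spaces, which is what makes the representation attractive for the cube.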
Related papers
- Node Classification and Search on the Rubik's Cube Graph with GNNs [55.2480439325792]
This study focuses on the application of deep geometric models to solve the 3x3x3 Rubik's Cube.
We begin by discussing the cube's graph representation and defining distance as the model's optimization objective.
The distance approximation task is reformulated as a node classification problem, effectively addressed using Graph Neural Networks (GNNs).
arXiv Detail & Related papers (2025-01-30T18:52:43Z) - CubeFormer: A Simple yet Effective Baseline for Lightweight Image Super-Resolution [55.94314421887744]
Lightweight image super-resolution (SR) methods aim at increasing the resolution and restoring the details of an image using a lightweight neural network.
Our analysis reveals that these methods are hindered by constrained feature diversity, which adversely impacts feature representation and detail recovery.
We propose a simple yet effective baseline called CubeFormer, designed to enhance feature richness by completing holistic information aggregation.
arXiv Detail & Related papers (2024-12-03T08:02:26Z) - Solving Rubik's Cube Without Tricky Sampling [0.6445605125467574]
The Rubik's Cube, with its vast state space and sparse reward structure, presents a significant challenge for reinforcement learning.
Previous research addressed this by propagating cost-to-go estimates from the solved state and incorporating search techniques.
We introduce a novel RL algorithm using policy gradient methods to solve the Rubik's Cube without relying on near solved-state sampling.
arXiv Detail & Related papers (2024-11-29T09:56:40Z) - Solving a Rubik's Cube Using its Local Graph Structure [13.219469732742354]
A Rubik's Cube has six faces and twelve possible actions, leading to a small and unconstrained action space.
A Rubik's Cube can be represented as a graph, where states of the cube are nodes and actions are edges.
Drawing on graph convolutional networks, we design a new search algorithm to find the solution to a scrambled Rubik's Cube.
arXiv Detail & Related papers (2024-08-15T05:39:52Z) - Language-Image Models with 3D Understanding [59.499585515469974]
We develop a large-scale pre-training dataset for 2D and 3D called LV3D.
Next, we introduce a new MLLM named Cube-LLM and pre-train it on LV3D.
We show that pure data scaling makes a strong 3D perception capability without 3D specific architectural design or training objective.
arXiv Detail & Related papers (2024-05-06T17:57:27Z) - A Unified Approach to Reinforcement Learning, Quantal Response
Equilibria, and Two-Player Zero-Sum Games [104.3339905200105]
This work studies an algorithm, which we call magnetic mirror descent, that is inspired by mirror descent and the non-Euclidean proximal gradient algorithm.
Our contribution is demonstrating the virtues of magnetic mirror descent as both an equilibrium solver and as an approach to reinforcement learning in two-player zero-sum games.
arXiv Detail & Related papers (2022-06-12T19:49:14Z) - CubeTR: Learning to Solve The Rubiks Cube Using Transformers [0.0]
The Rubik's Cube has a single solved state out of quintillions of possible configurations, which leads to extremely sparse rewards.
The proposed model CubeTR attends to longer sequences of actions and addresses the problem of sparse rewards.
arXiv Detail & Related papers (2021-11-11T03:17:28Z) - Solving Rubik's Cube via Quantum Mechanics and Deep Reinforcement
Learning [0.0]
Rubik's Cube is one of the most famous puzzles involving nearly $4.3 \times 10^{19}$ possible configurations.
We develop a unitary representation of Rubik's group and a quantum formalism to describe the Cube from its geometrical constraints.
The Cube is solved in four phases, all based on a respective Hamiltonian reward based on its spectrum, inspired by the Ising model.
arXiv Detail & Related papers (2021-09-15T10:30:27Z) - Self-Supervision is All You Need for Solving Rubik's Cube [0.0]
This work introduces a simple and efficient deep learning method for solving problems with a predefined goal, represented by Rubik's Cube.
We demonstrate that, for such problems, training a deep neural network on random scrambles branching from the goal state is sufficient to achieve near-optimal solutions.
arXiv Detail & Related papers (2021-06-06T15:38:50Z) - A Practical Method for Constructing Equivariant Multilayer Perceptrons
for Arbitrary Matrix Groups [115.58550697886987]
We provide a completely general algorithm for solving for the equivariant layers of matrix groups.
In addition to recovering solutions from other works as special cases, we construct multilayer perceptrons equivariant to multiple groups that have never been tackled before.
Our approach outperforms non-equivariant baselines, with applications to particle physics and dynamical systems.
arXiv Detail & Related papers (2021-04-19T17:21:54Z) - Recurrent Multi-view Alignment Network for Unsupervised Surface
Registration [79.72086524370819]
Learning non-rigid registration in an end-to-end manner is challenging due to the inherent high degrees of freedom and the lack of labeled training data.
We propose to represent the non-rigid transformation with a point-wise combination of several rigid transformations.
We also introduce a differentiable loss function that measures the 3D shape similarity on the projected multi-view 2D depth images.
arXiv Detail & Related papers (2020-11-24T14:22:42Z)
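Several of the papers above treat the cube as a graph whose nodes are states and whose edges are moves, so that a solution is a path to the solved node. The sketch below applies that view to a hypothetical four-token toy puzzle, since the real 3x3x3 state graph is far too large to enumerate; the move set and state encoding are invented purely for illustration.

```python
# Breadth-first search over a puzzle's state graph:
# nodes = states, edges = moves, solution = shortest path to the goal state.
from collections import deque

def solve_bfs(start, goal, moves):
    """moves: dict mapping a move name to a state-transition function."""
    frontier = deque([start])
    parent = {start: None}  # state -> (previous state, move that reached it)
    while frontier:
        state = frontier.popleft()
        if state == goal:
            # walk the parent links back to the start to recover the move list
            path = []
            while parent[state] is not None:
                state, move = parent[state]
                path.append(move)
            return path[::-1]
        for name, fn in moves.items():
            nxt = fn(state)
            if nxt not in parent:   # visit each state at most once
                parent[nxt] = (state, name)
                frontier.append(nxt)
    return None  # goal unreachable

# Toy puzzle: a tuple of 4 tokens, with two swap moves standing in for twists.
moves = {
    "L": lambda s: (s[1], s[0], s[2], s[3]),  # swap the first pair
    "R": lambda s: (s[0], s[1], s[3], s[2]),  # swap the second pair
}
```

BFS guarantees a shortest solution but must store every visited state, which is exactly why the papers above replace exhaustive search with learned distance estimates or policies on the cube's much larger graph.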
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.