Solving Rubik's Cube Without Tricky Sampling
- URL: http://arxiv.org/abs/2411.19583v1
- Date: Fri, 29 Nov 2024 09:56:40 GMT
- Title: Solving Rubik's Cube Without Tricky Sampling
- Authors: Yicheng Lin, Siyu Liang
- Abstract summary: The Rubik's Cube, with its vast state space and sparse reward structure, presents a significant challenge for reinforcement learning.
Previous research addressed this by propagating cost-to-go estimates from the solved state and incorporating search techniques.
We introduce a novel RL algorithm using policy gradient methods to solve the Rubik's Cube without relying on near solved-state sampling.
- Score: 0.6445605125467574
- Abstract: The Rubik's Cube, with its vast state space and sparse reward structure, presents a significant challenge for reinforcement learning (RL) due to the difficulty of reaching rewarded states. Previous research addressed this by propagating cost-to-go estimates from the solved state and incorporating search techniques. These approaches differ from human strategies, which start from fully scrambled cubes, and such solved-state sampling tricks are hard to transfer to general sparse-reward problems. In this paper, we introduce a novel RL algorithm using policy gradient methods to solve the Rubik's Cube without relying on near solved-state sampling. Our approach employs a neural network to predict cost patterns between states, allowing the agent to learn directly from scrambled states. Our method was tested on the 2x2x2 Rubik's Cube, where the cube was scrambled 50,000 times, and the model successfully solved it in over 99.4% of cases. Notably, this result was achieved using only the policy network without relying on tree search as in previous methods, demonstrating its effectiveness and potential for broader applications in sparse-reward problems.
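A minimal sketch of the kind of training loop the abstract describes: a policy network updated with REINFORCE-style policy gradients, rolling out from fully scrambled states. The `env` interface, state encoding, and network sizes below are illustrative assumptions, not the authors' exact design, and the paper's cost-pattern network between states is not shown.

```python
# Hypothetical REINFORCE-style loop; `env` stands in for a 2x2x2 cube
# simulator with a sparse reward (1 when solved, else 0).
import torch
import torch.nn as nn

N_STICKERS, N_COLORS, N_ACTIONS = 24, 6, 12      # assumed 2x2x2 encoding

policy = nn.Sequential(                          # one-hot state -> action logits
    nn.Linear(N_STICKERS * N_COLORS, 512), nn.ReLU(),
    nn.Linear(512, N_ACTIONS),
)
opt = torch.optim.Adam(policy.parameters(), lr=1e-4)

def train_episode(env, max_steps=20, gamma=0.95):
    """Roll out from a fully scrambled start (no near-solved sampling)."""
    state = env.reset_scrambled()                # assumed: flat float tensor
    log_probs, rewards = [], []
    for _ in range(max_steps):
        dist = torch.distributions.Categorical(logits=policy(state))
        action = dist.sample()
        log_probs.append(dist.log_prob(action))
        state, reward, done = env.step(action.item())
        rewards.append(reward)
        if done:
            break
    returns, g = [], 0.0                         # discounted returns per step
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    loss = -sum(lp * g for lp, g in zip(log_probs, returns))
    opt.zero_grad()
    loss.backward()
    opt.step()
```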
Related papers
- A Machine Learning Approach That Beats Large Rubik's Cubes [32.8176720435354]
The paper proposes a novel machine learning-based approach to the pathfinding problem on extremely large graphs.
We demonstrate its efficiency by finding solutions for 4x4x4 and 5x5x5 Rubik's cubes with unprecedentedly short solution lengths.
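The summary does not state the mechanism; a common pattern for learned pathfinding on graphs this large is beam search guided by a learned cost-to-go estimate, sketched below under that assumption. `neighbors` and `heuristic` are hypothetical interfaces, not necessarily the paper's.

```python
# Generic beam search over an implicit graph, guided by a learned
# cost-to-go estimate. The paper's actual method may differ.
def beam_search(start, is_goal, neighbors, heuristic, width=1024, max_depth=60):
    beam = [(heuristic(start), start, [])]        # (score, state, path)
    for _ in range(max_depth):
        candidates = []
        for _, state, path in beam:
            for action, nxt in neighbors(state):
                if is_goal(nxt):
                    return path + [action]
                candidates.append((heuristic(nxt), nxt, path + [action]))
        # Keep only the `width` most promising states for the next layer.
        candidates.sort(key=lambda t: t[0])
        beam = candidates[:width]
    return None                                   # no solution within depth bound
```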
arXiv Detail & Related papers (2025-02-18T20:22:38Z)
- Node Classification and Search on the Rubik's Cube Graph with GNNs [55.2480439325792]
This study focuses on the application of deep geometric models to solve the 3x3x3 Rubik's Cube.
We begin by discussing the cube's graph representation and defining distance as the model's optimization objective.
The distance approximation task is reformulated as a node classification problem, effectively addressed using Graph Neural Networks (GNNs).
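A rough sketch of that formulation: each node (cube state) gets a class equal to its bucketed distance to the solved state, predicted by a small graph convolution. The dense adjacency, the 324-dimensional encoding (54 stickers x 6 colors), and the 0-20 distance buckets are illustrative assumptions.

```python
# Minimal sketch: distance-to-solved as node classification with a
# two-layer graph convolution. Sizes are illustrative assumptions.
import torch
import torch.nn as nn

class SimpleGCN(nn.Module):
    def __init__(self, in_dim, hidden, n_classes):
        super().__init__()
        self.w1 = nn.Linear(in_dim, hidden)
        self.w2 = nn.Linear(hidden, n_classes)

    def forward(self, x, adj_norm):
        # adj_norm: symmetrically normalized adjacency (D^-1/2 A D^-1/2).
        h = torch.relu(adj_norm @ self.w1(x))
        return adj_norm @ self.w2(h)              # per-node class logits

# Assumed: 54 stickers x 6 colors; distances bucketed 0..20 (half-turn metric).
model = SimpleGCN(in_dim=324, hidden=256, n_classes=21)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_step(x, adj_norm, dist_labels):
    """One step of cross-entropy training on a sampled subgraph."""
    logits = model(x, adj_norm)
    loss = nn.functional.cross_entropy(logits, dist_labels)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```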
arXiv Detail & Related papers (2025-01-30T18:52:43Z)
- Smart Cubing for Graph Search: A Comparative Study [15.989407883187962]
Parallel solving via cube-and-conquer is a key method for scaling SAT solvers to hard instances.
While cube-and-conquer has proven successful for pure SAT problems, its application to SAT solvers extended with propagators presents unique challenges.
We study this problem using SAT Modulo Symmetries (SMS) as our primary test case, where a symmetry-breaking propagator reduces the search space by learning constraints that eliminate isomorphic graphs.
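For concreteness, a minimal cube-and-conquer sketch using the python-sat package: the "cube" phase enumerates assignments over a few branching variables, and the "conquer" phase solves each subproblem under assumptions in parallel. The naive variable choice stands in for real lookahead cubers, and SMS's symmetry-breaking propagator is not modeled here.

```python
# Minimal cube-and-conquer sketch with python-sat (pip install python-sat).
# Each "cube" is a partial assignment over a few branching variables; the
# remaining search is delegated to a CDCL solver via assumptions.
from itertools import product
from multiprocessing import Pool
from pysat.formula import CNF
from pysat.solvers import Glucose3

def solve_cube(args):
    clauses, cube = args
    with Glucose3(bootstrap_with=clauses) as solver:
        if solver.solve(assumptions=list(cube)):
            return solver.get_model()             # satisfying assignment
    return None

def cube_and_conquer(cnf: CNF, branch_vars, workers=4):
    # Enumerate all 2^k sign patterns over the chosen branching variables.
    cubes = [tuple(v if s else -v for v, s in zip(branch_vars, signs))
             for signs in product([True, False], repeat=len(branch_vars))]
    with Pool(workers) as pool:
        for model in pool.imap_unordered(
                solve_cube, [(cnf.clauses, c) for c in cubes]):
            if model is not None:
                return model                      # SAT: first model found
    return None                                   # UNSAT across all cubes
```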
arXiv Detail & Related papers (2025-01-27T22:15:54Z)
- Solving a Rubik's Cube Using its Local Graph Structure [13.219469732742354]
A Rubik's Cube has six faces and twelve possible actions, leading to a small and unconstrained action space.
A Rubik's Cube can be represented as a graph, where states of the cube are nodes and actions are edges.
Drawing on graph convolutional networks, we design a new search algorithm to find the solution to a scrambled Rubik's Cube.
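A sketch of best-first search over such a cube graph, with node scores assumed to come from a learned model such as a graph convolutional network; `expand` and `score` are hypothetical interfaces, not the paper's algorithm.

```python
# Best-first search over an implicit graph, ordered by a learned score.
# States are assumed hashable so visited nodes can be deduplicated.
import heapq
from itertools import count

def best_first_search(start, is_goal, expand, score, max_nodes=100_000):
    tie = count()               # tie-breaker so heapq never compares states
    frontier = [(score(start), next(tie), start, [])]
    seen = {start}
    while frontier and max_nodes > 0:
        _, _, state, path = heapq.heappop(frontier)
        if is_goal(state):
            return path
        for action, nxt in expand(state):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier,
                               (score(nxt), next(tie), nxt, path + [action]))
        max_nodes -= 1
    return None                 # search budget exhausted
```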
arXiv Detail & Related papers (2024-08-15T05:39:52Z)
- A Novel Normalized-Cut Solver with Nearest Neighbor Hierarchical Initialization [107.07093621337084]
Normalized-Cut (N-Cut) is a famous model of spectral clustering.
Traditional N-Cut solvers are two-stage: 1) calculating the continuous spectral embedding of normalized Laplacian matrix; 2) discretization via $K$-means or spectral rotation.
We propose a novel N-Cut solver based on the famous coordinate descent method.
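The traditional two-stage pipeline mentioned above, sketched with NumPy/SciPy/scikit-learn; the coordinate-descent solver the paper proposes replaces exactly this recipe.

```python
# Classical two-stage N-Cut: spectral embedding, then K-means discretization.
import numpy as np
from scipy.sparse.csgraph import laplacian
from sklearn.cluster import KMeans

def two_stage_ncut(W: np.ndarray, k: int) -> np.ndarray:
    """W: symmetric affinity matrix; returns a cluster label per node."""
    L = laplacian(W, normed=True)                 # I - D^-1/2 W D^-1/2
    # Stage 1: continuous embedding from the k smallest eigenvectors.
    eigvals, eigvecs = np.linalg.eigh(L)
    embedding = eigvecs[:, :k]
    # Normalize rows so discretization sees directions, not magnitudes.
    embedding /= np.linalg.norm(embedding, axis=1, keepdims=True) + 1e-12
    # Stage 2: discretize with K-means (spectral rotation is the alternative).
    return KMeans(n_clusters=k, n_init=10).fit_predict(embedding)
```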
arXiv Detail & Related papers (2023-11-26T07:11:58Z)
- Polynomial-Time Solutions for ReLU Network Training: A Complexity Classification via Max-Cut and Zonotopes [70.52097560486683]
We prove that the hardness of approximation of ReLU networks not only mirrors the complexity of the Max-Cut problem but also, in certain special cases, exactly corresponds to it.
In particular, when $\epsilon \leq \sqrt{84/83} - 1 \approx 0.006$, we show that it is NP-hard to find an approximate global optimum of the ReLU network objective with relative error $\epsilon$ with respect to the objective value.
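The quoted threshold checks out numerically:

```python
import math
# Hardness threshold from the abstract: sqrt(84/83) - 1.
print(f"{math.sqrt(84 / 83) - 1:.6f}")   # 0.006006, i.e. about 0.006
```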
arXiv Detail & Related papers (2023-11-18T04:41:07Z)
- Learning To Dive In Branch And Bound [95.13209326119153]
We propose L2Dive to learn specific diving heuristics with graph neural networks.
We train generative models to predict variable assignments and leverage the duality of linear programs to make diving decisions.
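A hedged sketch of what a learned diving heuristic can look like inside branch and bound: repeatedly fix the fractional variable the model is most confident about and re-solve the LP relaxation. `solve_lp` and `model_scores` are assumed interfaces, and the LP-duality component of L2Dive is not shown.

```python
# Learned diving sketch: round variables guided by model predictions until
# the LP solution is integral (incumbent found) or infeasible (abandon dive).
def learned_dive(problem, solve_lp, model_scores, tol=1e-6, max_rounds=100):
    fixings = {}                                    # var index -> 0/1 fixing
    for _ in range(max_rounds):
        lp = solve_lp(problem, fixings)             # LP relaxation under fixings
        if lp is None:                              # infeasible: abandon the dive
            return None
        frac = {j: x for j, x in enumerate(lp.solution)
                if tol < x % 1.0 < 1.0 - tol}
        if not frac:                                # integral: feasible incumbent
            return lp
        # Ask the model for rounding scores; fix the most confident variable.
        scores = model_scores(problem, lp, frac)    # {var: P(round up)}
        j, p_up = max(scores.items(), key=lambda kv: abs(kv[1] - 0.5))
        fixings[j] = 1 if p_up >= 0.5 else 0
    return None
```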
arXiv Detail & Related papers (2023-01-24T12:01:45Z)
- TCT: Convexifying Federated Learning using Bootstrapped Neural Tangent Kernels [141.29156234353133]
State-of-the-art federated learning methods can perform far worse than their centralized counterparts when clients have dissimilar data distributions.
We show this disparity can largely be attributed to challenges presented by nonconvexity.
We propose a Train-Convexify-Train (TCT) procedure to sidestep this issue.
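A rough sketch of the convexification step under stated assumptions: after ordinary training, the network is linearized via empirical neural-tangent-kernel features (per-example gradients of the output with respect to the weights), and a convex linear model is then fit on those features. The network size and regression setup here are illustrative, not the paper's exact procedure.

```python
# Empirical NTK feature map for a trained network, via torch.func.
import torch
from torch import nn
from torch.func import functional_call, jacrev

net = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 1))
params = dict(net.named_parameters())

def f(p, x):                                      # scalar output per example
    return functional_call(net, p, (x.unsqueeze(0),)).squeeze()

def entk_features(x):
    """Gradient of the output w.r.t. all weights = eNTK feature map."""
    grads = jacrev(f)(params, x)                  # dict of per-parameter grads
    return torch.cat([g.reshape(-1) for g in grads.values()])

# Fitting a *convex* linear model on these features is then straightforward
# and avoids the nonconvexity that hurts federated averaging.
X = torch.randn(8, 20)
Phi = torch.stack([entk_features(x) for x in X])  # (n_examples, n_params)
```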
arXiv Detail & Related papers (2022-07-13T16:58:22Z)
- CubeTR: Learning to Solve The Rubiks Cube Using Transformers [0.0]
The Rubik's Cube has a single solved state among quintillions of possible configurations, which leads to extremely sparse rewards.
The proposed model CubeTR attends to longer sequences of actions and addresses the problem of sparse rewards.
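An illustrative transformer over move sequences in that spirit; the vocabulary, layer sizes, and next-move head are assumptions rather than CubeTR's published architecture.

```python
# Toy transformer that attends over a history of moves and predicts the next.
import torch
from torch import nn

N_MOVES, D_MODEL, MAX_LEN = 18, 128, 64   # 18 = 3x3x3 face turns (assumed)

class SeqPolicy(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(N_MOVES + 1, D_MODEL)  # +1 for a BOS token
        self.pos = nn.Embedding(MAX_LEN, D_MODEL)
        layer = nn.TransformerEncoderLayer(D_MODEL, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(D_MODEL, N_MOVES)          # next-move logits

    def forward(self, moves):                    # moves: (batch, seq) int64
        pos = torch.arange(moves.size(1), device=moves.device)
        h = self.embed(moves) + self.pos(pos)
        h = self.encoder(h)                      # attends over the full history
        return self.head(h[:, -1])               # predict the next move

logits = SeqPolicy()(torch.randint(0, 19, (2, 10)))
```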
arXiv Detail & Related papers (2021-11-11T03:17:28Z)
- Self-Supervision is All You Need for Solving Rubik's Cube [0.0]
This work introduces a simple and efficient deep learning method for solving problems with a predefined goal, exemplified by the Rubik's Cube.
We demonstrate that, for such problems, training a deep neural network on random scrambles branching from the goal state is sufficient to achieve near-optimal solutions.
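The recipe above is easy to sketch: generate random scrambles branching from the solved state and use the scramble depth as a (noisy, upper-bound) supervision signal. `solved_state`, `apply_move`, and `moves` are assumed stand-ins for a cube simulator.

```python
# Self-supervised data generation: scrambles branching from the goal state.
import random

def make_batch(solved_state, apply_move, moves, max_depth=20, batch=256):
    """Returns (state, depth) pairs; depth upper-bounds the true distance."""
    data = []
    for _ in range(batch):
        state = solved_state
        depth = random.randint(1, max_depth)
        for _ in range(depth):
            state = apply_move(state, random.choice(moves))
        data.append((state, depth))              # supervised target: depth
    return data
```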
arXiv Detail & Related papers (2021-06-06T15:38:50Z)
- Sanity-Checking Pruning Methods: Random Tickets can Win the Jackpot [55.37967301483917]
Conventional wisdom about pruning algorithms suggests that pruning methods exploit information from training data to find good subnetworks.
In this paper, we conduct sanity checks for the above beliefs on several recent unstructured pruning methods.
We propose a series of simple data-independent prune ratios for each layer, and randomly prune each layer accordingly to get a subnetwork.
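A minimal sketch of such a sanity check with PyTorch's pruning utilities: zero out a fixed fraction of each layer's weights uniformly at random, with ratios chosen independently of any training data. The model and per-layer ratios are illustrative.

```python
# Random, data-independent pruning per layer as a sanity-check baseline.
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))
layer_ratios = {0: 0.8, 2: 0.5}          # illustrative per-layer prune ratios

for idx, ratio in layer_ratios.items():
    # Randomly zero out `ratio` of the weights, ignoring training data entirely.
    prune.random_unstructured(model[idx], name="weight", amount=ratio)

# The random subnetwork is then trained from scratch and compared against
# subnetworks found by data-dependent pruning criteria.
```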
arXiv Detail & Related papers (2020-09-22T17:36:17Z)