Related papers: Benchmarking ChatGPT on Algorithmic Reasoning

Benchmarking ChatGPT on Algorithmic Reasoning

URL: http://arxiv.org/abs/2404.03441v2
Date: Tue, 16 Apr 2024 21:32:47 GMT
Title: Benchmarking ChatGPT on Algorithmic Reasoning
Authors: Sean McLeish, Avi Schwarzschild, Tom Goldstein,
Abstract summary: We evaluate ChatGPT's ability to solve algorithm problems from the CLRS benchmark suite that is designed for GNNs. We find that ChatGPT outperforms specialist GNN models, using Python to successfully solve these problems.
Score: 58.50071292008407
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We evaluate ChatGPT's ability to solve algorithm problems from the CLRS benchmark suite that is designed for GNNs. The benchmark requires the use of a specified classical algorithm to solve a given problem. We find that ChatGPT outperforms specialist GNN models, using Python to successfully solve these problems. This raises new points in the discussion about learning algorithms with neural networks and how we think about what out of distribution testing looks like with web scale training data.

Related papers

Deep Equilibrium Algorithmic Reasoning [18.651333116786084]
We study neurally solving algorithms from a different perspective. Since the algorithm's solution is often an equilibrium, it is possible to find the solution directly by solving an equilibrium equation. Our approach requires no information on the ground-truth number of steps of the algorithm, both during train and test time.
arXiv Detail & Related papers (2024-10-19T10:40:55Z)
The Deep Equilibrium Algorithmic Reasoner [20.375241527453447]
We show that graph neural networks (GNNs) can learn to execute classical algorithms. We conjecture and empirically validate that one can train a network to solve algorithmic problems by directly finding the equilibrium.
arXiv Detail & Related papers (2024-02-09T14:46:50Z)
Latent Space Representations of Neural Algorithmic Reasoners [15.920449080528536]
We perform a detailed analysis of the structure of the latent space induced by the GNN when executing algorithms. We identify two possible failure modes: (i) loss of resolution, making it hard to distinguish similar values; (ii) inability to deal with values outside the range observed during training. We show that these changes lead to improvements on the majority of algorithms in the standard CLRS-30 benchmark when using the state-of-the-art Triplet-GMPNN processor.
arXiv Detail & Related papers (2023-07-17T22:09:12Z)
Unmasking the giant: A comprehensive evaluation of ChatGPT's proficiency in coding algorithms and data structures [0.6990493129893112]
We evaluate ChatGPT's ability to generate correct solutions to the problems fed to it, its code quality, and nature of run-time errors thrown by its code. We look into patterns in the test cases passed in order to gain some insights into how wrong ChatGPT code is in these kinds of situations.
arXiv Detail & Related papers (2023-07-10T08:20:34Z)
Towards Better Out-of-Distribution Generalization of Neural Algorithmic Reasoning Tasks [51.8723187709964]
We study the OOD generalization of neural algorithmic reasoning tasks. The goal is to learn an algorithm from input-output pairs using deep neural networks.
arXiv Detail & Related papers (2022-11-01T18:33:20Z)
The CLRS Algorithmic Reasoning Benchmark [28.789225199559834]
Learning representations of algorithms is an emerging area of machine learning, seeking to bridge concepts from neural networks with classical algorithms. We propose the CLRS Algorithmic Reasoning Benchmark, covering classical algorithms from the Introduction to Algorithms textbook. Our benchmark spans a variety of algorithmic reasoning procedures, including sorting, searching, dynamic programming, graph algorithms, string algorithms and geometric algorithms.
arXiv Detail & Related papers (2022-05-31T09:56:44Z)
GNNRank: Learning Global Rankings from Pairwise Comparisons via Directed Graph Neural Networks [68.61934077627085]
We introduce GNNRank, a modeling framework compatible with any GNN capable of learning digraph embeddings. We show that our methods attain competitive and often superior performance compared with existing approaches.
arXiv Detail & Related papers (2022-02-01T04:19:50Z)
Evolving Reinforcement Learning Algorithms [186.62294652057062]
We propose a method for meta-learning reinforcement learning algorithms. The learned algorithms are domain-agnostic and can generalize to new environments not seen during training. We highlight two learned algorithms which obtain good generalization performance over other classical control tasks, gridworld type tasks, and Atari games.
arXiv Detail & Related papers (2021-01-08T18:55:07Z)
Graph Neural Networks for Motion Planning [108.51253840181677]
We present two techniques, GNNs over dense fixed graphs for low-dimensional problems and sampling-based GNNs for high-dimensional problems. We examine the ability of a GNN to tackle planning problems such as identifying critical nodes or learning the sampling distribution in Rapidly-exploring Random Trees (RRT) Experiments with critical sampling, a pendulum and a six DoF robot arm show GNNs improve on traditional analytic methods as well as learning approaches using fully-connected or convolutional neural networks.
arXiv Detail & Related papers (2020-06-11T08:19:06Z)
Model Selection in Contextual Stochastic Bandit Problems [51.94632035240787]
We develop a meta-algorithm that selects between base algorithms. We show through a lower bound that even when one of the base algorithms has $O(sqrtT)$ regret, in general it is impossible to get better than $Omega(sqrtT)$ regret.
arXiv Detail & Related papers (2020-03-03T18:46:34Z)

This list is automatically generated from the titles and abstracts of the papers in this site.