Amplifying human performance in combinatorial competitive programming
- URL: http://arxiv.org/abs/2411.19744v1
- Date: Fri, 29 Nov 2024 14:40:36 GMT
- Title: Amplifying human performance in combinatorial competitive programming
- Authors: Petar Veličković, Alex Vitvitskyi, Larisa Markeeva, Borja Ibarz, Lars Buesing, Matej Balog, Alexander Novikov
- Abstract summary: We focus on combinatorial competitive programming, where the target is to find as-good-as-possible solutions to intractable problems. We deploy our approach on previous iterations of Hash Code, a global team programming competition inspired by NP-hard software engineering problems at Google. Our solutions significantly improve the attained scores from their baseline, successfully breaking into the top percentile on all previous Hash Code online qualification rounds.
- Score: 41.59043428241635
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent years have seen a significant surge in complex AI systems for competitive programming, capable of performing at admirable levels against human competitors. While steady progress has been made, the highest percentiles still remain out of reach for these methods on standard competition platforms such as Codeforces. Here we instead focus on combinatorial competitive programming, where the target is to find as-good-as-possible solutions to otherwise computationally intractable problems, over specific given inputs. We hypothesise that this scenario offers a unique testbed for human-AI synergy, as human programmers can write a backbone of a heuristic solution, after which AI can be used to optimise the scoring function used by the heuristic. We deploy our approach on previous iterations of Hash Code, a global team programming competition inspired by NP-hard software engineering problems at Google, and we leverage FunSearch to evolve our scoring functions. Our evolved solutions significantly improve the attained scores from their baseline, successfully breaking into the top percentile on all previous Hash Code online qualification rounds, and outperforming the top human teams on several. Our method is also performant on an optimisation problem that featured in a recent held-out AtCoder contest.
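As a concrete (and heavily simplified) illustration of this division of labour, the sketch below evolves the scoring function of a fixed greedy backbone on a toy 0/1 knapsack task. The weight-vector mutation loop stands in for FunSearch, which in the paper evolves scoring functions as LLM-generated code; the task, names, and parameters here are illustrative, not the authors' setup.

```python
import random

# Toy stand-in for a Hash Code-style task: 0/1 knapsack instances.
# The human-written "backbone" below is a fixed greedy loop; evolution
# only ever touches the scoring function that ranks the items. This
# illustrates the division of labour, not the FunSearch system itself,
# which evolves scoring functions as LLM-written programs.

def greedy_backbone(items, capacity, score):
    """Human-written heuristic: greedily pack items ranked by `score`."""
    total_value, used = 0, 0
    for value, weight in sorted(items, key=lambda it: -score(*it)):
        if used + weight <= capacity:
            used += weight
            total_value += value
    return total_value

def make_scorer(w):
    """The evolved part: a scoring function parameterised by weights."""
    return lambda value, weight: w[0] * value + w[1] * value / weight + w[2] * weight

def evolve(instances, generations=100, pop_size=20, seed=0):
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(pop_size)]

    def fitness(w):
        return sum(greedy_backbone(items, cap, make_scorer(w))
                   for items, cap in instances)

    for _ in range(generations):
        # Keep the top half, refill with mutated copies of survivors.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = [[g + rng.gauss(0, 0.1) for g in rng.choice(parents)]
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)

rng = random.Random(1)
instances = [([(rng.randint(1, 100), rng.randint(1, 50)) for _ in range(50)], 300)
             for _ in range(5)]
best = evolve(instances)
print("evolved weights:", [round(g, 2) for g in best])
```

The backbone never changes; only the scorer does, which is what keeps the human-written solution structure intact while the AI searches over the part that is cheap to evaluate.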
Related papers
- ALE-Bench: A Benchmark for Long-Horizon Objective-Driven Algorithm Engineering [1.6932802756478724]
We introduce ALE-Bench, a new benchmark for evaluating AI systems on score-based algorithmic programming contests.
ALE-Bench presents optimization problems that are computationally hard and admit no known exact solution.
Our software framework supports interactive agent architectures that leverage test-run feedback and visualizations.
arXiv Detail & Related papers (2025-06-10T17:59:56Z) - CPRet: A Dataset, Benchmark, and Model for Retrieval in Competitive Programming [56.17331530444765]
CPRet is a retrieval-oriented benchmark suite for competitive programming.
It covers four retrieval tasks: two code-centric (i.e., Text-to-Code and Code-to-Code) and two newly proposed problem-centric tasks (i.e., Problem-to-Duplicate and Simplified-to-Full).
Our contribution includes both high-quality training data and temporally separated test sets for reliable evaluation.
arXiv Detail & Related papers (2025-05-19T10:07:51Z) - CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings [70.95565672516979]
Existing benchmarks, like LiveCodeBench and USACO, fall short due to the unavailability of private test cases, lack of support for special judges, and misaligned execution environments.
CodeElo is a standardized competition-level code generation benchmark that effectively addresses all these challenges for the first time.
arXiv Detail & Related papers (2025-01-02T13:49:00Z) - CompilerDream: Learning a Compiler World Model for General Code Optimization [58.87557583347996]
We introduce CompilerDream, a model-based reinforcement learning approach to general code optimization.
It comprises a compiler world model that accurately simulates the intrinsic properties of optimization passes and an agent trained on this model to produce effective optimization strategies.
It excels across diverse datasets, surpassing LLVM's built-in optimizations and other state-of-the-art methods in both settings of value prediction and end-to-end code optimization.
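For intuition, here is a minimal sketch of the world-model pattern: transitions from a made-up pass environment are logged into a tabular model, and a random-shooting planner searches for pass sequences using the model alone. CompilerDream itself learns a neural world model of real compiler passes; the pass names and dynamics below are invented.

```python
import random

# Toy rendering of the world-model idea: the planner never queries the
# real environment. It fits a tabular model of how passes transform a
# program state, then searches for a good pass sequence inside it.

PASSES = ["inline", "dce", "licm", "gvn"]

def real_env(state, p):
    """Hidden ground truth, used only to generate training transitions."""
    gain = {"inline": 2, "dce": 3, "licm": 1, "gvn": 2}[p] if p not in state else 0
    return state + (p,), gain

def learn_world_model(episodes=500, horizon=4, seed=0):
    rng, model = random.Random(seed), {}
    for _ in range(episodes):
        state = ()
        for _ in range(horizon):
            p = rng.choice(PASSES)
            nxt, gain = real_env(state, p)
            model[(state, p)] = (nxt, gain)   # learned transition table
            state = nxt
    return model

def plan_in_model(model, horizon=4, rollouts=200, seed=1):
    """Random-shooting planner that only ever consults the learned model."""
    rng, best = random.Random(seed), (-1, ())
    for _ in range(rollouts):
        seq, state, ret = [], (), 0
        for _ in range(horizon):
            p = rng.choice(PASSES)
            if (state, p) not in model:
                break                          # model has no prediction here
            state, gain = model[(state, p)]
            seq.append(p)
            ret += gain
        best = max(best, (ret, tuple(seq)))
    return best

print(plan_in_model(learn_world_model()))      # e.g. (8, ('dce', 'inline', ...))
```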
arXiv Detail & Related papers (2024-04-24T09:20:33Z) - SEGO: Sequential Subgoal Optimization for Mathematical Problem-Solving [64.38649623473626]
Large Language Models (LLMs) have driven substantial progress in artificial intelligence.
We propose a novel framework called SEquential subGoal Optimization (SEGO) to enhance LLMs' ability to solve mathematical problems.
arXiv Detail & Related papers (2023-10-19T17:56:40Z) - Design and Implementation of an Heuristic-Enhanced Branch-and-Bound Solver for MILP [1.03905835096574]
We present a solver for Mixed Integer Programs (MIP) developed for the MIP competition.
Given the 10-minute bound on computation time established by the competition rules, our method focuses on finding a feasible solution.
Our combined methods, given extensive computational power, can solve 11 of the 19 problems in the training data set.
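A minimal sketch of that feasibility-first, time-bounded control flow, using depth-first branch-and-bound on a toy 0/1 knapsack: the search dives to obtain an incumbent early and returns whatever incumbent it holds when the wall-clock budget expires. The real solver targets general MILP; this illustrates only the budget handling, and the bound used here is deliberately crude.

```python
import time

# Depth-first B&B with a wall-clock budget: diving on x_i = 1 first
# tends to produce a feasible incumbent quickly, so the solver always
# has something to return when time runs out.

def solve(values, weights, capacity, budget_s=1.0):
    start, n = time.monotonic(), len(values)
    best = {"value": 0, "x": [0] * n}

    def bound(i, val):
        # Optimistic bound: assume every remaining item fits.
        return val + sum(values[i:])

    def dfs(i, used, val, x):
        if time.monotonic() - start > budget_s:
            return  # budget exhausted: keep the current incumbent
        if used > capacity or bound(i, val) <= best["value"]:
            return  # infeasible or dominated subtree
        if i == n:
            best["value"], best["x"] = val, x[:]
            return
        # Dive on x_i = 1 first so a feasible incumbent appears early.
        x[i] = 1
        dfs(i + 1, used + weights[i], val + values[i], x)
        x[i] = 0
        dfs(i + 1, used, val, x)

    dfs(0, 0, 0, [0] * n)
    return best

print(solve([10, 7, 5, 9], [4, 3, 2, 5], 8, budget_s=0.1))
```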
arXiv Detail & Related papers (2022-06-04T00:09:02Z) - Retrospective on the 2021 BASALT Competition on Learning from Human Feedback [92.37243979045817]
The goal of the competition was to promote research towards agents that use learning from human feedback (LfHF) techniques to solve open-world tasks.
Rather than mandating the use of LfHF techniques, we described four tasks in natural language to be accomplished in the video game Minecraft.
Teams developed a diverse range of LfHF algorithms across a variety of possible human feedback types.
arXiv Detail & Related papers (2022-04-14T17:24:54Z) - Competition-Level Code Generation with AlphaCode [74.87216298566942]
We introduce AlphaCode, a system for code generation that can create novel solutions to problems that require deeper reasoning.
In simulated evaluations on recent programming competitions on the Codeforces platform, AlphaCode achieved an average ranking in the top 54.3%.
arXiv Detail & Related papers (2022-02-08T23:16:31Z) - Yordle: An Efficient Imitation Learning for Branch and Bound [1.6758573326215689]
This work presents our solution and insights gained by team qqy in the 2021 NeurIPS Machine Learning for Combinatorial Optimization (ML4CO) competition.
Our solution is a highly efficient imitation learning framework for performance improvement of Branch and Bound (B&B), named Yordle.
In our experiments, Yordle greatly outperforms the baseline algorithm adopted by the competition while requiring significantly less time and data to train the decision model.
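The shape of such an imitation-learning pipeline can be sketched in a few lines: query an expensive expert branching rule offline to build a dataset, then mimic its scores online with a cheap predictor (a 1-nearest-neighbour regressor here). The features, expert rule, and candidates below are toys; Yordle trains a proper model on rich B&B node features.

```python
import random

# Imitation learning for branching, pipeline shape only: log
# (features, expert score) pairs offline, then rank branching
# candidates online with a cheap learned stand-in for the expert.

def features(x):
    """Toy per-candidate features (e.g. fractionality measures)."""
    return (x % 1.0, abs(x - 0.5))

def expert_score(x):
    """Stand-in for an expensive expert rule such as strong branching."""
    return -abs((x % 1.0) - 0.5)

def collect(n=500, seed=0):
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(features(x), expert_score(x)) for x in xs]

def imitator(dataset):
    def predict(x):
        f = features(x)
        # 1-nearest-neighbour regression on the logged expert scores.
        _, score = min(dataset,
                       key=lambda d: sum((a - b) ** 2 for a, b in zip(d[0], f)))
        return score
    return predict

predict = imitator(collect())
candidates = [2.1, 3.5, 4.9]
print("branch on:", max(candidates, key=predict))  # mimics the expert's choice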
arXiv Detail & Related papers (2022-02-02T14:46:30Z) - Adversarial Deep Learning for Online Resource Allocation [12.118811903399951]
We use deep neural networks to learn an online algorithm for a resource allocation and pricing problem from scratch.
Our work is the first to use deep neural networks to design an online algorithm from the perspective of worst-case performance guarantees.
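The worst-case objective can be illustrated with a stripped-down posted-price example: a one-parameter pricing "policy" is hill-climbed against an adversary that always answers with the valuation minimising the competitive ratio. The setup, grids, and numbers are illustrative assumptions, not the paper's network or problem.

```python
import random

# Adversarial training in miniature: the policy is scored by its
# worst-case competitive ratio, i.e. against the adversary's best
# response, rather than by average revenue.

V_GRID = [i / 100 for i in range(50, 101)]  # buyer valuations in [0.5, 1]

def ratio(price, v):
    """Revenue vs. the offline optimum (which always earns v)."""
    return (price / v) if v >= price else 0.0

def worst_case(price):
    """Adversary's move: the valuation minimising the ratio."""
    return min(ratio(price, v) for v in V_GRID)

def train(steps=2000, step_size=0.05, seed=0):
    rng, price = random.Random(seed), 0.9
    for _ in range(steps):
        # Hill-climb the policy against the adversary's best response.
        candidate = min(1.0, max(0.0, price + rng.gauss(0, step_size)))
        if worst_case(candidate) >= worst_case(price):
            price = candidate
    return price

p = train()
print(f"learned price {p:.2f}, worst-case ratio {worst_case(p):.2f}")  # ~0.50
```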
arXiv Detail & Related papers (2021-11-19T15:48:43Z) - Recent Developments in Program Synthesis with Evolutionary Algorithms [1.8047694351309207]
We identify the relevant evolutionary program synthesis approaches and provide an in-depth analysis of their performance.
The most influential approaches we identify are stack-based, grammar-guided, as well as linear genetic programming.
For future work, we encourage researchers to assess the quality of a solution not only by a program's output but also by the way the program arrives at a solution.
arXiv Detail & Related papers (2021-08-27T11:38:27Z) - Combining Reinforcement Learning and Constraint Programming for Combinatorial Optimization [5.669790037378094]
In combinatorial optimization, the goal is to find an optimal solution among a finite set of possibilities.
Deep reinforcement learning (DRL) has shown promise for solving NP-hard optimization problems.
Constraint programming (CP) is a generic tool to solve optimization problems.
In this work, we propose a general and hybrid approach, based on DRL and CP, for solving optimization problems.
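One common shape for such a hybrid, sketched below under stated assumptions: a standard CP-style backtracking search keeps the constraint pruning, while a learned value estimator decides which surviving value to try first. The estimator here is a handcrafted stub; in the paper it would be a DRL-trained value function, and the all-different toy problem is ours, not theirs.

```python
# CP does the pruning; the learned model only reorders value choices.

def value_estimate(var, val, assignment):
    """Stub for a DRL-trained value function (illustrative only)."""
    return -abs(val - var)  # pretend the model prefers val near the var index

def cp_search(n_vars, domain, constraint, assignment=()):
    """Backtracking search with model-guided value ordering."""
    if len(assignment) == n_vars:
        return assignment
    var = len(assignment)
    # CP contribution: discard values that violate the constraints.
    candidates = [v for v in domain if constraint(assignment, var, v)]
    # DRL contribution: try the most promising surviving values first.
    for v in sorted(candidates, key=lambda v: -value_estimate(var, v, assignment)):
        result = cp_search(n_vars, domain, constraint, assignment + (v,))
        if result is not None:
            return result
    return None

# Example: an all-different constraint over a small domain.
all_diff = lambda assignment, var, v: v not in assignment
print(cp_search(4, range(4), all_diff))  # (0, 1, 2, 3)
```

Because the model only changes the search order, any solution returned is still guaranteed feasible by the CP machinery.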
arXiv Detail & Related papers (2020-06-02T13:54:27Z) - Is the Most Accurate AI the Best Teammate? Optimizing AI for Teamwork [54.309495231017344]
We argue that AI systems should be trained in a human-centered manner, directly optimized for team performance.
We study this proposal for a specific type of human-AI teaming, where the human overseer chooses to either accept the AI recommendation or solve the task themselves.
Our experiments with linear and non-linear models on real-world, high-stakes datasets show that the most accurate AI may not lead to the highest team performance.
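A toy simulation makes the finding concrete: under the accept-or-solve-alone protocol, an AI with lower solo accuracy but informative confidence yields higher team accuracy than a more accurate AI whose confidence is uninformative. All accuracies and proportions below are invented for illustration.

```python
import random

# The human sees whether the AI flags a task as high-confidence and
# either accepts the recommendation or solves the task alone.

def simulate(acc_confident, acc_unconfident, p_confident,
             human_acc=0.7, trials=100_000, seed=0):
    rng = random.Random(seed)
    solo = team = 0
    for _ in range(trials):
        confident = rng.random() < p_confident
        ai_correct = rng.random() < (acc_confident if confident else acc_unconfident)
        solo += ai_correct
        # Overseer policy: accept confident recommendations, otherwise
        # the human solves the task themselves.
        team += ai_correct if confident else (rng.random() < human_acc)
    return solo / trials, team / trials

for name, args in [("A (accurate, uninformative confidence)", (0.80, 0.80, 1.0)),
                   ("B (less accurate, calibrated confidence)", (0.95, 0.55, 0.5))]:
    solo, team = simulate(*args)
    print(f"AI {name}: solo {solo:.3f}, team {team:.3f}")
# AI B loses on solo accuracy (~0.75 vs 0.80) yet wins on team accuracy.
```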
arXiv Detail & Related papers (2020-04-27T19:06:28Z)