Online Submission and Evaluation System Design for Competition Operations
- URL: http://arxiv.org/abs/2507.17730v1
- Date: Wed, 23 Jul 2025 17:44:10 GMT
- Title: Online Submission and Evaluation System Design for Competition Operations
- Authors: Zhe Chen, Daniel Harabor, Ryan Hechenberger, Nathan R. Sturtevant,
- Abstract summary: This paper presents an online competition system that automates the submission and evaluation process for a competition. The system has already been used successfully for several competitions, including the Grid-Based Pathfinding Competition and the League of Robot Runners competition.
- Score: 16.589706967125252
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Research communities have developed benchmark datasets across domains to compare the performance of algorithms and techniques. However, tracking the progress in these research areas is not easy, as publications appear in different venues at the same time, and many of them claim to represent the state-of-the-art. To address this, research communities often organise periodic competitions to evaluate the performance of various algorithms and techniques, thereby tracking advancements in the field. However, these competitions pose a significant operational burden: the organisers must manage and evaluate a large volume of submissions. Furthermore, participants typically develop their solutions in diverse environments, leading to compatibility issues during the evaluation of their submissions. This paper presents an online competition system that automates the submission and evaluation process for a competition. The competition system allows organisers to manage large numbers of submissions efficiently, utilising isolated environments to evaluate submissions. This system has already been used successfully for several competitions, including the Grid-Based Pathfinding Competition and the League of Robot Runners competition.
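The abstract only summarises the system; as a rough illustration of the core idea of evaluating each submission in an isolated environment, here is a minimal, hypothetical Python sketch using Docker as the sandbox. The image name, resource limits, and `run.sh` entry point are illustrative assumptions, not details taken from the paper.

```python
import subprocess
from pathlib import Path

def evaluate_submission(submission_dir: Path,
                        image: str = "evaluator:latest",
                        time_limit_s: int = 3600) -> str:
    """Run one submission inside an isolated Docker container and return
    its captured output (hypothetical sketch, not the paper's code)."""
    result = subprocess.run(
        [
            "docker", "run", "--rm",
            "--network", "none",                       # no network access during evaluation
            "--memory", "8g", "--cpus", "4",           # cap resources per submission
            "-v", f"{submission_dir}:/submission:ro",  # mount the code read-only
            image,
            "/bin/sh", "-c", "cd /submission && ./run.sh",
        ],
        capture_output=True, text=True, timeout=time_limit_s,
    )
    if result.returncode != 0:
        raise RuntimeError(f"Evaluation failed:\n{result.stderr}")
    return result.stdout
```

Isolating each run in a container is one standard way to sidestep the compatibility issues the abstract mentions: participants' dependencies travel with the submission rather than with the evaluation host.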
Related papers
- Are we making progress in unlearning? Findings from the first NeurIPS unlearning competition [70.60872754129832]
The first NeurIPS competition on unlearning sought to stimulate the development of novel algorithms.
Nearly 1,200 teams from across the world participated.
We analyze top solutions and delve into discussions on benchmarking unlearning.
arXiv Detail & Related papers (2024-06-13T12:58:00Z)
- Analysis of Systems' Performance in Natural Language Processing Competitions [6.197993866688085]
This manuscript describes an evaluation methodology for statistically analyzing competition results.
The proposed methodology offers several advantages, including off-the-shelf comparisons with correction mechanisms and the inclusion of confidence intervals.
Our analysis shows the potential usefulness of our methodology for effectively evaluating competition results.
arXiv Detail & Related papers (2024-03-07T17:42:40Z)
- Benchmarking Robustness and Generalization in Multi-Agent Systems: A Case Study on Neural MMO [50.58083807719749]
We present the results of the second Neural MMO challenge, hosted at IJCAI 2022, which received 1600+ submissions.
This competition targets robustness and generalization in multi-agent systems.
We will open-source our benchmark including the environment wrapper, baselines, a visualization tool, and selected policies for further research.
arXiv Detail & Related papers (2023-08-30T07:16:11Z)
- Competitions in AI -- Robustly Ranking Solvers Using Statistical Resampling [9.02080113915613]
We show that rankings resulting from the standard interpretation of competition results can be very sensitive to even minor changes in the benchmark instance set used as the basis for assessment.
We introduce a novel approach to statistically meaningful analysis of competition results based on resampling performance data.
Our approach produces confidence intervals of competition scores as well as statistically robust solver rankings with bounded error.
arXiv Detail & Related papers (2023-08-09T16:47:04Z)
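The entry above argues that rankings are sensitive to the benchmark instance set and proposes resampling to quantify this. As a rough illustration only (not the authors' exact procedure), the sketch below bootstraps over benchmark instances, assuming a matrix of per-instance solver scores where lower is better; all names and shapes are assumptions.

```python
import numpy as np

def bootstrap_ranking(scores: np.ndarray, n_resamples: int = 10_000,
                      alpha: float = 0.05, seed: int = 0):
    """Illustrative instance-level bootstrap.

    scores: shape (n_solvers, n_instances), lower is better (e.g. runtime).
    Returns per-solver confidence intervals on the mean score and the
    frequency with which each solver attains each rank across resamples.
    """
    rng = np.random.default_rng(seed)
    n_solvers, n_instances = scores.shape
    means = np.empty((n_resamples, n_solvers))
    for b in range(n_resamples):
        idx = rng.integers(0, n_instances, size=n_instances)  # resample instances
        means[b] = scores[:, idx].mean(axis=1)
    ci = (np.quantile(means, alpha / 2, axis=0),
          np.quantile(means, 1 - alpha / 2, axis=0))
    ranks = means.argsort(axis=1).argsort(axis=1)  # rank of each solver per resample
    rank_freq = np.stack([(ranks == r).mean(axis=0) for r in range(n_solvers)])
    return ci, rank_freq  # rank_freq[r, s]: how often solver s places at rank r
```

If two solvers' rank frequencies overlap heavily, a headline ranking computed from a single instance set should be treated with caution, which is the point the paper makes.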
- EFaR 2023: Efficient Face Recognition Competition [51.77649060180531]
The paper presents a summary of the Efficient Face Recognition Competition (EFaR) held at the 2023 International Joint Conference on Biometrics (IJCB 2023).
The competition received 17 submissions from 6 different teams.
The submitted solutions are ranked based on a weighted score of the achieved verification accuracies on a diverse set of benchmarks, as well as the deployability given by the number of floating-point operations and model size.
arXiv Detail & Related papers (2023-08-08T09:58:22Z) - ICDAR 2023 Competition on Hierarchical Text Detection and Recognition [60.68100769639923]
- ICDAR 2023 Competition on Hierarchical Text Detection and Recognition [60.68100769639923]
The competition aims to promote research into deep learning models and systems that can jointly perform text detection and recognition.
We present details of the proposed competition organization, including tasks, datasets, evaluations, and schedule.
During the competition period (from January 2nd 2023 to April 1st 2023), at least 50 submissions from more than 20 teams were made across the two proposed tasks.
arXiv Detail & Related papers (2023-05-16T18:56:12Z)
- A portfolio-based analysis method for competition results [0.8680676599607126]
I will describe a portfolio-based analysis method which can give complementary insights into the performance of participating solvers in a competition.
The method is demonstrated on the results of the MiniZinc Challenges and new insights gained from the portfolio viewpoint are presented.
arXiv Detail & Related papers (2022-05-30T20:20:45Z)
- Towards robust and domain agnostic reinforcement learning competitions [12.731614722371376]
Reinforcement learning competitions have formed the basis for standard research benchmarks.
Despite this, a majority of challenges suffer from the same fundamental problems.
We present a new framework of competition design that promotes the development of algorithms that overcome these barriers.
arXiv Detail & Related papers (2021-06-07T16:15:46Z)
- The MineRL 2020 Competition on Sample Efficient Reinforcement Learning using Human Priors [62.9301667732188]
We propose a second iteration of the MineRL Competition.
The primary goal of the competition is to foster the development of algorithms which can efficiently leverage human demonstrations.
The competition is structured into two rounds in which competitors are provided several paired versions of the dataset and environment.
At the end of each round, competitors submit containerized versions of their learning algorithms to the AIcrowd platform.
arXiv Detail & Related papers (2021-01-26T20:32:30Z)
- Analysing Affective Behavior in the First ABAW 2020 Competition [49.90617840789334]
The Affective Behavior Analysis in-the-wild (ABAW) 2020 Competition is the first competition aimed at the automatic analysis of the three main behavior tasks.
We describe this competition, to be held in conjunction with the IEEE Conference on Face and Gesture Recognition in May 2020 in Buenos Aires, Argentina.
We outline the evaluation metrics, present the baseline system and the methodologies of the top-3 performing teams per challenge, and report their results.
arXiv Detail & Related papers (2020-01-30T15:41:14Z)