OpenCoderRank: AI-Driven Technical Assessments Made Easy
- URL: http://arxiv.org/abs/2509.06774v1
- Date: Mon, 08 Sep 2025 14:58:10 GMT
- Title: OpenCoderRank: AI-Driven Technical Assessments Made Easy
- Authors: Hridoy Sankar Dutta, Sana Ansari, Swati Kumari, Shounak Ravi Bhalerao,
- Abstract summary: This paper introduces OpenCoderRank, an easy-to-use platform designed to simulate technical assessments. It acts as a bridge between problem setters and problem solvers, helping solvers prepare for time constraints and unfamiliar problems.
- Score: 1.0499611180329802
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Organizations and educational institutions use time-bound assessment tasks to evaluate coding and problem-solving skills. These assessments measure not only the correctness of the solutions, but also their efficiency. Problem setters (educator/interviewer) are responsible for crafting these challenges, carefully balancing difficulty and relevance to create meaningful evaluation experiences. Conversely, problem solvers (student/interviewee) apply coding efficiency and logical thinking to arrive at correct solutions. In the era of Large Language Models (LLMs), LLMs assist problem setters in generating diverse and challenging questions, but they can undermine assessment integrity for problem solvers by providing easy access to solutions. This paper introduces OpenCoderRank, an easy-to-use platform designed to simulate technical assessments. It acts as a bridge between problem setters and problem solvers, helping solvers prepare for time constraints and unfamiliar problems while allowing setters to self-host assessments, offering a no-cost and customizable solution for technical assessments in resource-constrained environments.
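The abstract describes time-bound assessments that score both correctness and efficiency, but the platform's internals are not detailed. As a hedged illustration of that core mechanic only (all names are hypothetical, not OpenCoderRank's actual API), the sketch below grades a submitted solution against test cases with a per-case time limit:

```python
# Hypothetical sketch of a time-bound grading loop (illustrative only; not
# OpenCoderRank's actual API). Both correctness and per-case runtime are
# scored, mirroring assessments that measure efficiency as well as output.
import time

def grade(solution, test_cases, time_limit_s=1.0):
    """Grade a submitted callable; a case passes only if the output is
    correct and the run finishes within the time limit (measured after
    the fact rather than by killing long runs)."""
    passed, total_runtime = 0, 0.0
    for args, expected in test_cases:
        start = time.perf_counter()
        try:
            result = solution(*args)
        except Exception:
            continue  # a crashing case simply does not pass
        elapsed = time.perf_counter() - start
        total_runtime += elapsed
        if result == expected and elapsed <= time_limit_s:
            passed += 1
    return {"passed": passed, "total": len(test_cases), "runtime_s": total_runtime}

# Example: grade a candidate's two-sum submission on two cases.
def two_sum(nums, target):
    seen = {}
    for i, x in enumerate(nums):
        if target - x in seen:
            return [seen[target - x], i]
        seen[x] = i

cases = [(([2, 7, 11, 15], 9), [0, 1]), (([3, 2, 4], 6), [1, 2])]
print(grade(two_sum, cases))  # {'passed': 2, 'total': 2, 'runtime_s': ...}
```

A real self-hosted deployment would run submissions in a sandboxed subprocess rather than in-process, but the scoring structure stays the same.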
Related papers
- FrontierCS: Evolving Challenges for Evolving Intelligence [174.80075821079708]
We introduce FrontierCS, a benchmark of 156 open-ended problems across diverse areas of computer science. For each problem we provide an expert reference solution and an automatic evaluator. We find that frontier reasoning models still lag far behind human experts on both the algorithmic and research tracks.
arXiv Detail & Related papers (2025-12-17T18:52:45Z) - Learning the Boundary of Solvability: Aligning LLMs to Detect Unsolvable Problems [51.62477754641947]
We propose UnsolvableQA and UnsolvableRL to train models to solve feasible problems, detect inherent contradictions, and prudently refuse tasks beyond their capability. Specifically, we construct UnsolvableQA, a dataset of paired solvable and unsolvable instances derived via a dual-track methodology. Building on this dataset, we introduce UnsolvableRL, a reinforcement learning framework with three reward components jointly accounting for accuracy, unsolvability, and difficulty.
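The summary names the three reward components but not their exact form. A minimal sketch, assuming illustrative weights and a normalized difficulty score (neither is from the paper), might combine them like this:

```python
# Hypothetical sketch of a three-part reward in the spirit of UnsolvableRL;
# the weights and shaping are assumptions, not the paper's actual design.
def reward(is_solvable, answer_correct, refused, difficulty,
           w_acc=1.0, w_unsolv=1.0, w_diff=0.5):
    if is_solvable:
        # Accuracy component: reward correct answers; penalize wrong
        # answers and refusals of feasible problems.
        r = w_acc if (answer_correct and not refused) else -w_acc
    else:
        # Unsolvability component: reward detecting the contradiction and
        # refusing; penalize hallucinating an answer.
        r = w_unsolv if refused else -w_unsolv
    # Difficulty component: extra credit for succeeding on harder items
    # (difficulty assumed normalized to [0, 1]).
    if r > 0:
        r += w_diff * difficulty
    return r

print(reward(is_solvable=True, answer_correct=True, refused=False, difficulty=0.8))  # 1.4
```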
arXiv Detail & Related papers (2025-12-01T13:32:59Z) - UQ: Assessing Language Models on Unsolved Questions [149.46593270027697]
We introduce UQ, a testbed of 500 challenging, diverse questions sourced from Stack Exchange. UQ is difficult and realistic by construction: unsolved questions are often hard and naturally arise when humans seek answers. The top model passes UQ-validation on only 15% of questions, and preliminary human verification has already identified correct answers.
arXiv Detail & Related papers (2025-08-25T01:07:59Z) - Beyond Solving Math Quiz: Evaluating the Ability of Large Reasoning Models to Ask for Information [21.562453754113072]
Large Reasoning Models (LRMs) have demonstrated remarkable problem-solving abilities in mathematics. We propose a new dataset consisting of two types of incomplete problems with diverse contexts. Based on this dataset, our systematic evaluation of LRMs reveals their inability to proactively ask for information.
arXiv Detail & Related papers (2025-08-15T06:42:00Z) - BloomWise: Enhancing Problem-Solving capabilities of Large Language Models using Bloom's-Taxonomy-Inspired Prompts [59.83547898874152]
BloomWise is a cognitively-inspired prompting technique for large language models (LLMs). It is designed to enhance LLMs' performance on mathematical problem solving while making their solutions more explainable.
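The paper's actual prompt templates are not reproduced in this summary. The sketch below builds a staged prompt from the standard Bloom's-taxonomy levels; the stage wording is purely an assumption, not BloomWise's own phrasing:

```python
# Hypothetical sketch of a Bloom's-taxonomy-staged prompt in the spirit of
# BloomWise; the instructions per stage are illustrative assumptions.
BLOOM_STAGES = [
    ("Remember",   "Restate the given facts and what is being asked."),
    ("Understand", "Explain the problem in your own words."),
    ("Apply",      "Choose a known method and apply it step by step."),
    ("Analyze",    "Check each step and the relationships between quantities."),
    ("Evaluate",   "Verify the result against the problem's constraints."),
]

def bloomwise_prompt(problem: str) -> str:
    steps = "\n".join(
        f"{i + 1}. {name}: {instruction}"
        for i, (name, instruction) in enumerate(BLOOM_STAGES)
    )
    return (
        "Solve the following problem by working through these stages:\n"
        f"{steps}\n\nProblem: {problem}\nShow your work for every stage."
    )

print(bloomwise_prompt("A train travels 120 km in 1.5 hours. What is its average speed?"))
```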
arXiv Detail & Related papers (2024-10-05T09:27:52Z) - Estimating Difficulty Levels of Programming Problems with Pre-trained Model [18.92661958433282]
The difficulty level of each programming problem serves as an essential reference for guiding students' adaptive learning.
We formulate the problem of automatically estimating the difficulty level of each programming problem, given its textual description and an example solution in code.
For tackling this problem, we propose to couple two pre-trained models, one for text modality and the other for code modality, into a unified model.
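The summary describes the coupling only at a high level. A minimal sketch of the idea, with random tensors standing in for the pooled outputs of the two pre-trained encoders (the fusion head and class count are assumptions), could look like this:

```python
# Hypothetical sketch of fusing a text encoder and a code encoder for
# difficulty-level classification; random tensors stand in for the pooled
# outputs of the pre-trained text and code models.
import torch
import torch.nn as nn

class DifficultyClassifier(nn.Module):
    def __init__(self, text_dim=768, code_dim=768, n_levels=3):
        super().__init__()
        # Fusion head over the concatenated modality embeddings.
        self.head = nn.Sequential(
            nn.Linear(text_dim + code_dim, 256),
            nn.ReLU(),
            nn.Linear(256, n_levels),  # e.g., easy / medium / hard
        )

    def forward(self, text_emb, code_emb):
        return self.head(torch.cat([text_emb, code_emb], dim=-1))

# Usage with stand-in embeddings (replace with real encoder outputs).
model = DifficultyClassifier()
text_emb, code_emb = torch.randn(4, 768), torch.randn(4, 768)
logits = model(text_emb, code_emb)  # shape: (4, 3)
```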
arXiv Detail & Related papers (2024-06-13T05:38:20Z) - Learning Task Decomposition to Assist Humans in Competitive Programming [90.4846613669734]
We introduce a novel objective for learning task decomposition, termed assistive value (AssistV). We collect a dataset of human repair experiences on different decomposed solutions. Under 177 hours of human study, our method enables non-experts to solve 33.3% more problems, speeds them up by 3.3x, and empowers them to match unassisted experts.
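Details of the AssistV objective are not given in this summary. As a toy illustration only of how a learned assistive-value predictor might be used at inference time (the value model and interface here are placeholders, not the paper's):

```python
# Hypothetical use of a learned assistive-value (AssistV) predictor: rank
# candidate decompositions and surface the one predicted to be easiest and
# fastest for a human to repair. `assist_value_model` is a placeholder.
def pick_decomposition(problem, candidates, assist_value_model):
    return max(candidates, key=lambda d: assist_value_model(problem, d))

# Example with a dummy scorer that prefers finer-grained decompositions.
best = pick_decomposition("merge overlapping intervals",
                          [["sort"], ["parse", "sort", "merge"]],
                          lambda p, d: len(d))
print(best)  # ['parse', 'sort', 'merge']
```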
arXiv Detail & Related papers (2024-06-07T03:27:51Z) - Probeable Problems for Beginner-level Programming-with-AI Contests [0.0]
We conduct a 2-hour programming contest for undergraduate Computer Science students from multiple institutions.
Students were permitted to work individually or in groups, and were free to use AI tools.
We analyze the extent to which the code submitted by these groups identifies missing details and identify ways in which Probeable Problems can support learning in formal and informal CS educational contexts.
arXiv Detail & Related papers (2024-05-24T00:39:32Z) - Competition-Level Problems are Effective LLM Evaluators [121.15880285283116]
This paper aims to evaluate the reasoning capacities of large language models (LLMs) in solving recent programming problems in Codeforces.
We first provide a comprehensive evaluation of GPT-4's perceived zero-shot performance on this task, considering various aspects such as problems' release time, difficulties, and types of errors encountered.
Surprisingly, the perceived performance of GPT-4 has experienced a cliff-like decline on problems released after September 2021, consistently across all difficulties and types of problems.
arXiv Detail & Related papers (2023-12-04T18:58:57Z) - Steps Before Syntax: Helping Novice Programmers Solve Problems using the PCDIT Framework [2.768397481213625]
Novice programmers often struggle with problem solving due to the high cognitive loads they face.
Many introductory programming courses do not explicitly teach it, assuming that problem solving skills are acquired along the way.
We present 'PCDIT', a non-linear problem solving framework that provides scaffolding to guide novice programmers through the process of transforming a problem specification into an implemented and tested solution for an imperative programming language.
arXiv Detail & Related papers (2021-09-18T10:31:15Z) - Probably Approximately Correct Constrained Learning [135.48447120228658]
We develop a generalization theory based on the probably approximately correct (PAC) learning framework.
We show that imposing constraints does not make a learning problem harder, in the sense that any PAC learnable class is also PAC constrained learnable.
We analyze the properties of this solution and use it to illustrate how constrained learning can address problems in fair and robust classification.
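In rough terms (the standard constrained-learning form; this notation is illustrative rather than the paper's exact definitions), the learner faces:

```latex
% Sketch of the constrained statistical learning problem (standard form).
\begin{aligned}
\min_{f \in \mathcal{F}} \quad
  & \mathbb{E}_{(x,y) \sim \mathcal{D}_0}\!\left[\ell_0\big(f(x), y\big)\right] \\
\text{s.t.} \quad
  & \mathbb{E}_{(x,y) \sim \mathcal{D}_i}\!\left[\ell_i\big(f(x), y\big)\right] \le c_i,
  \qquad i = 1, \dots, m.
\end{aligned}
```

Informally, the PAC constrained learnability result then says that whenever the unconstrained problem over $\mathcal{F}$ is PAC learnable, a learner can also come within $\epsilon$ of the constrained optimum while approximately satisfying the constraints.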
arXiv Detail & Related papers (2020-06-09T19:59:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.