Related papers: CASET: Complexity Analysis using Simple Execution Traces for CS* submissions

CASET: Complexity Analysis using Simple Execution Traces for CS* submissions

URL: http://arxiv.org/abs/2410.15419v1
Date: Sun, 20 Oct 2024 15:29:50 GMT
Title: CASET: Complexity Analysis using Simple Execution Traces for CS* submissions
Authors: Aaryen Mehta, Gagan Aryan,
Abstract summary: The most common method to auto-grade a student's submission in a CS1 or a CS2 course is to run it against a pre-defined test suite and compare the results against reference results. This technique cannot be used if the correctness of the solution goes beyond simple output, such as the algorithm used to obtain the result. We propose CASET, a novel tool to analyze the time complexity of algorithms using dynamic traces and unsupervised machine learning.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The most common method to auto-grade a student's submission in a CS1 or a CS2 course is to run it against a pre-defined test suite and compare the results against reference results. However, this technique cannot be used if the correctness of the solution goes beyond simple output, such as the algorithm used to obtain the result. There is no convenient method for the graders to identify the kind of algorithm used in solving a problem. They must read the source code and understand the algorithm implemented and its features, which makes the process tedious. We propose CASET(Complexity Analysis using Simple Execution Traces), a novel tool to analyze the time complexity of algorithms using dynamic traces and unsupervised machine learning. CASET makes it convenient for tutors to classify the submissions for a program into time complexity baskets. Thus, tutors can identify the algorithms used by the submissions without necessarily going through the code written by the students. CASET's analysis can be used to improve grading and provide detailed feedback for submissions that try to match the results without a proper algorithm, for example, hard-coding a binary result, pattern-matching the visible or common inputs. We show the effectiveness of CASET by computing the time complexity of many classes of algorithms like sorting, searching and those using dynamic programming paradigm.

Related papers

Discovering Algorithms with Computational Language Processing [0.7062238472483737]
We present a framework automating algorithm discovery by conceptualizing them as sequences of operations, represented as tokens.<n>These computational tokens are chained using a grammar, enabling the formation of increasingly sophisticated procedures.<n>Our ensemble Monte Carlo tree search (MCTS) guided by reinforcement learning (RL) explores token chaining and drives the creation of new tokens.
arXiv Detail & Related papers (2025-07-03T21:45:17Z)
A General Online Algorithm for Optimizing Complex Performance Metrics [5.726378955570775]
We introduce and analyze a general online algorithm that can be used in a straightforward way with a variety of complex performance metrics in binary, multi-class, and multi-label classification problems. The algorithm's update and prediction rules are appealingly simple and computationally efficient without the need to store any past data.
arXiv Detail & Related papers (2024-06-20T21:24:47Z)
Learning Hidden Markov Models Using Conditional Samples [72.20944611510198]
This paper is concerned with the computational complexity of learning the Hidden Markov Model (HMM) In this paper, we consider an interactive access model, in which the algorithm can query for samples from the conditional distributions of the HMMs. Specifically, we obtain efficient algorithms for learning HMMs in settings where we have query access to the exact conditional probabilities.
arXiv Detail & Related papers (2023-02-28T16:53:41Z)
Unified Functional Hashing in Automatic Machine Learning [58.77232199682271]
We show that large efficiency gains can be obtained by employing a fast unified functional hash. Our hash is "functional" in that it identifies equivalent candidates even if they were represented or coded differently. We show dramatic improvements on multiple AutoML domains, including neural architecture search and algorithm discovery.
arXiv Detail & Related papers (2023-02-10T18:50:37Z)
The CLRS Algorithmic Reasoning Benchmark [28.789225199559834]
Learning representations of algorithms is an emerging area of machine learning, seeking to bridge concepts from neural networks with classical algorithms. We propose the CLRS Algorithmic Reasoning Benchmark, covering classical algorithms from the Introduction to Algorithms textbook. Our benchmark spans a variety of algorithmic reasoning procedures, including sorting, searching, dynamic programming, graph algorithms, string algorithms and geometric algorithms.
arXiv Detail & Related papers (2022-05-31T09:56:44Z)
Accelerating ERM for data-driven algorithm design using output-sensitive techniques [26.32088674030797]
We study techniques to develop efficient learning algorithms for data-driven algorithm design. Our approach involves two novel ingredients -- an output-sensitive algorithm for enumerating polytopes induced by a set of hyperplanes. We illustrate our techniques by giving algorithms for pricing problems, linkage-based clustering and dynamic-programming based sequence alignment.
arXiv Detail & Related papers (2022-04-07T17:27:18Z)
Practical, Provably-Correct Interactive Learning in the Realizable Setting: The Power of True Believers [12.09273192079783]
We consider interactive learning in the realizable setting and develop a general framework to handle problems ranging from best arm identification to active classification. We design novel computationally efficient algorithms for the realizable setting that match the minimax lower bound up to logarithmic factors.
arXiv Detail & Related papers (2021-11-09T02:33:36Z)
Machine Learning for Online Algorithm Selection under Censored Feedback [71.6879432974126]
In online algorithm selection (OAS), instances of an algorithmic problem class are presented to an agent one after another, and the agent has to quickly select a presumably best algorithm from a fixed set of candidate algorithms. For decision problems such as satisfiability (SAT), quality typically refers to the algorithm's runtime. In this work, we revisit multi-armed bandit algorithms for OAS and discuss their capability of dealing with the problem. We adapt them towards runtime-oriented losses, allowing for partially censored data while keeping a space- and time-complexity independent of the time horizon.
arXiv Detail & Related papers (2021-09-13T18:10:52Z)
Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem which estimates signals integer from linear models. The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning. We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal estimation for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z)
Strong Generalization and Efficiency in Neural Programs [69.18742158883869]
We study the problem of learning efficient algorithms that strongly generalize in the framework of neural program induction. By carefully designing the input / output interfaces of the neural model and through imitation, we are able to learn models that produce correct results for arbitrary input sizes.
arXiv Detail & Related papers (2020-07-07T17:03:02Z)
Learning to Accelerate Heuristic Searching for Large-Scale Maximum Weighted b-Matching Problems in Online Advertising [51.97494906131859]
Bipartite b-matching is fundamental in algorithm design, and has been widely applied into economic markets, labor markets, etc. Existing exact and approximate algorithms usually fail in such settings due to either requiring intolerable running time or too much computation resource. We propose textttNeuSearcher which leverages the knowledge learned from previously instances to solve new problem instances.
arXiv Detail & Related papers (2020-05-09T02:48:23Z)
Stacked Generalizations in Imbalanced Fraud Data Sets using Resampling Methods [2.741266294612776]
This study uses stacked generalization, which is a two-step process of combining machine learning methods, called meta or super learners, for improving the performance of algorithms. Building a test harness that accounts for all permutations of algorithm sample set pairs demonstrates that the complex, intrinsic data structures are all thoroughly tested.
arXiv Detail & Related papers (2020-04-03T20:38:22Z)

This list is automatically generated from the titles and abstracts of the papers in this site.