Crucible: Graphical Test Cases for Alloy Models
- URL: http://arxiv.org/abs/2307.06922v1
- Date: Thu, 13 Jul 2023 17:43:12 GMT
- Title: Crucible: Graphical Test Cases for Alloy Models
- Authors: Adam G. Emerson, Allison Sullivan
- Abstract summary: This paper introduces Crucible, which allows users to graphically create AUnit test cases.
Crucible provides automated guidance to users to ensure they are creating well-structured, valuable test cases.
- Score: 0.76146285961466
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Alloy is a declarative modeling language that is well suited for verifying
system designs. Alloy models are automatically analyzed using the Analyzer, a
toolset that helps the user understand their system by displaying the
consequences of their properties, helping identify any missing or incorrect
properties, and exploring the impact of modifications to those properties. To
achieve this, the Analyzer invokes off-the-shelf SAT solvers to search for
scenarios, which are assignments to the sets and relations of the model such
that all executed formulas hold. To help write more accurate software models,
Alloy has a unit testing framework, AUnit, which allows users to outline
specific scenarios and check if those scenarios are correctly generated or
prevented by their model. Unfortunately, AUnit currently only supports textual
specifications of scenarios. This paper introduces Crucible, which allows users
to graphically create AUnit test cases. In addition, Crucible provides
automated guidance to users to ensure they are creating well-structured,
valuable test cases. As a result, Crucible eases the burden of adopting AUnit
and brings AUnit test case creation more in line with how users commonly
interact with Alloy scenarios: graphically.
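To make the textual AUnit format concrete, here is a minimal sketch of the kind of test case Crucible lets users assemble graphically. The toy linked-list model and the predicate name twoNodeList are illustrative assumptions rather than examples from the paper; what follows AUnit's actual shape is the pairing of a predicate that fixes an exact valuation of the model's sets and relations with a run command that checks whether the model admits that scenario.

    // Illustrative toy model, assumed for this sketch (not from the paper)
    one sig List { header: lone Node }
    sig Node { link: lone Node }
    fact Acyclic { no n: Node | n in n.^link }

    // AUnit-style test: the predicate pins down one exact scenario
    // (exactly two nodes, header n0, n0 linked to n1), and the run
    // command asks whether the model generates it.
    pred twoNodeList {
      some disj n0, n1: Node {
        Node = n0 + n1
        link = n0 -> n1
        List.header = n0
      }
    }
    run twoNodeList for 2

Crucible's contribution is to let users draw such a scenario as a graph of atoms and edges instead of writing the quantified predicate by hand.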
Related papers
- Structure Editor for Building Software Models [0.5735035463793009]
A recent study of over 93,000 new user models reveals that users have trouble from the very start: nearly a third of the models novices write fail to compile.
We believe that the issue is that Alloy's grammar and type information are passively relayed to the user, even though this information outlines a narrow path for how to compose valid formulas.
In this paper, we outline a proof-of-concept for a structure editor for Alloy in which users build their models using block-based inputs rather than free typing, which by design prevents compilation errors (the sketch below illustrates the kind of error this rules out).
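As a hypothetical illustration (this model is assumed, not drawn from the study), the snippet below contains the kind of ill-typed formula the Analyzer only rejects after it has been typed in full; a block-based editor would never offer the ill-typed join as an option:

    sig Professor { }
    sig Course { taughtBy: one Professor }
    sig Student { enrolled: set Course }

    // Type error: 'enrolled' relates Student to Course, so joining a
    // Professor onto it is provably empty, and the Analyzer rejects
    // the formula before any solving happens.
    check { all p: Professor | some p.enrolled }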
arXiv Detail & Related papers (2024-06-13T18:21:02Z)
- Prompt Optimization with EASE? Efficient Ordering-aware Automated Selection of Exemplars [66.823588073584]
Large language models (LLMs) have shown impressive capabilities in real-world applications.
The quality of the exemplars included in the prompt greatly impacts performance.
Existing methods fail to adequately account for the impact of exemplar ordering on performance.
arXiv Detail & Related papers (2024-05-25T08:23:05Z)
- Right or Wrong -- Understanding How Novice Users Write Software Models [0.6445605125467574]
This paper presents an empirical study of over 97,000 models written by novice users trying to learn Alloy.
We investigate how users write both correct and incorrect models in order to produce a comprehensive benchmark for future use; as the sketch below shows, an incorrect model need not be one that fails to compile.
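As a hypothetical sketch (assumed, not taken from the benchmark), a model can compile cleanly and still be incorrect, for example by being overconstrained:

    sig Node { link: lone Node }

    // Compiles fine, but forcing every node to have a successor means
    // every finite instance must contain a cycle, so the acyclic lists
    // the author presumably intended can never appear.
    fact { all n: Node | some n.link }
    run { some Node } for 3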
arXiv Detail & Related papers (2024-02-09T18:56:57Z)
- Prompting Code Interpreter to Write Better Unit Tests on Quixbugs Functions [0.05657375260432172]
Unit testing is a commonly-used approach in software engineering to test the correctness and robustness of written code.
In this study, we explore the effect of different prompts on the quality of unit tests generated by Code Interpreter.
We find that the quality of the generated unit tests is not sensitive to minor changes in the details of the prompts provided.
arXiv Detail & Related papers (2023-09-30T20:36:23Z)
- Zero-Shot Object Counting with Language-Vision Models [50.1159882903028]
Class-agnostic object counting aims to count object instances of an arbitrary class at test time.
Current methods require human-annotated exemplars as inputs, which are often unavailable for novel categories.
We propose zero-shot object counting (ZSC), a new setting where only the class name is available during test time.
arXiv Detail & Related papers (2023-09-22T14:48:42Z)
- Revisiting and Improving Retrieval-Augmented Deep Assertion Generation [13.373681113601982]
Unit testing has become an essential activity in the software development process.
Yu et al. proposed an integrated approach (integration for short) to generate assertions for a unit test.
Despite promising results, there is still a knowledge gap as to why or where integration works or does not.
arXiv Detail & Related papers (2023-09-19T02:39:02Z)
- Zero-shot Model Diagnosis [80.36063332820568]
A common approach to evaluating deep learning models is to build a labeled test set with attributes of interest and assess how well the model performs on it.
This paper argues that Zero-shot Model Diagnosis (ZOOM) is possible without the need for a test set or labeling.
arXiv Detail & Related papers (2023-03-27T17:59:33Z)
- APANet: Adaptive Prototypes Alignment Network for Few-Shot Semantic Segmentation [56.387647750094466]
Few-shot semantic segmentation aims to segment novel-class objects in a given query image with only a few labeled support images.
Most advanced solutions exploit a metric learning framework that performs segmentation through matching each query feature to a learned class-specific prototype.
We present an adaptive prototype representation by introducing class-specific and class-agnostic prototypes.
arXiv Detail & Related papers (2021-11-24T04:38:37Z)
- Auditing AI models for Verified Deployment under Semantic Specifications [65.12401653917838]
AuditAI bridges the gap between interpretable formal verification and scalability.
We show how AuditAI allows us to obtain controlled variations for verification and certified training while addressing the limitations of verifying using only pixel-space perturbations.
arXiv Detail & Related papers (2021-09-25T22:53:24Z)
- Generating Accurate Assert Statements for Unit Test Cases using Pretrained Transformers [10.846226514357866]
Unit testing represents the foundation of the software testing pyramid.
We present an approach to support developers in writing unit test cases by generating accurate and useful assert statements.
arXiv Detail & Related papers (2020-09-11T19:35:09Z)
- Beyond Accuracy: Behavioral Testing of NLP models with CheckList [66.42971817954806]
CheckList is a task-agnostic methodology for testing NLP models.
CheckList includes a matrix of general linguistic capabilities and test types that facilitate comprehensive test ideation.
In a user study, NLP practitioners with CheckList created twice as many tests and found almost three times as many bugs as users without it.
arXiv Detail & Related papers (2020-05-08T15:48:31Z)