Type-level Property Based Testing
- URL: http://arxiv.org/abs/2407.12726v1
- Date: Wed, 17 Jul 2024 16:43:41 GMT
- Title: Type-level Property Based Testing
- Authors: Thomas Ekström Hansen, Edwin Brady
- Abstract summary: We present an automated framework for solidifying the cohesion between software specifications, their dependently typed models, and implementation at compile time.
We are able to model several interesting systems and network protocols, have the type checker verify that our implementation behaves as specified, and test that our model matches the specification's semantics.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We present an automated framework for solidifying the cohesion between software specifications, their dependently typed models, and implementation at compile time. Model checking and type checking are currently separate techniques for automatically verifying the correctness of programs. Using Property Based Testing (PBT), Indexed State Monads (ISMs), and dependent types, we are able to model several interesting systems and network protocols, have the type checker verify that our implementation behaves as specified, and test that our model matches the specification's semantics. This is a step towards combining model checking and type checking.
Related papers
- Formal Analysis of the Contract Automata Runtime Environment with Uppaal: Modelling, Verification and Testing [0.3807314298073301]
A distributed runtime application called CARE has been introduced to realise service applications specified using a dialect of finite-state automata.
We detail the formal modelling, verification and testing of CARE.
arXiv Detail & Related papers (2025-01-22T15:03:25Z)
- SYNTHEVAL: Hybrid Behavioral Testing of NLP Models with Synthetic CheckLists [59.08999823652293]
We propose SYNTHEVAL to generate a wide range of test types for a comprehensive evaluation of NLP models.
In the last stage, human experts investigate the challenging examples, manually design templates, and identify the types of failures the task-specific models consistently exhibit.
We apply SYNTHEVAL to two classification tasks, sentiment analysis and toxic language detection, and show that our framework is effective in identifying weaknesses of strong models on these tasks.
arXiv Detail & Related papers (2024-08-30T17:41:30Z)
- Formal Verification Of A Shopping Basket Application Model Using PRISM [0.0]
We present the results of a simulation using Prism Model Checker for a Shopping Basket Application Model.
The objective is to simulate the behavior of shoppers as they go through a number of defined states of the shopping process.
arXiv Detail & Related papers (2023-07-16T00:14:40Z)
- Zero-shot Model Diagnosis [80.36063332820568]
A common approach to evaluate deep learning models is to build a labeled test set with attributes of interest and assess how well it performs.
This paper argues that Zero-shot Model Diagnosis (ZOOM) is possible without the need for a test set or labeling.
arXiv Detail & Related papers (2023-03-27T17:59:33Z)
- Language Model Cascades [72.18809575261498]
Repeated interactions at test-time with a single model, or the composition of multiple models together, further expands capabilities.
Cases with control flow and dynamic structure require techniques from probabilistic programming.
We formalize several existing techniques from this perspective, including scratchpads / chain of thought, verifiers, STaR, selection-inference, and tool use.
arXiv Detail & Related papers (2022-07-21T07:35:18Z)
- On the Limits of Evaluating Embodied Agent Model Generalization Using Validation Sets [101.28658250723804]
This paper experiments with augmenting a transformer model with modules that effectively utilize a wider field of view and learn to choose whether the next step requires a navigation or manipulation action.
We observe that the proposed modules resulted in improved, and in fact state-of-the-art, performance on an unseen validation set of a popular benchmark dataset, ALFRED.
We highlight this result as we believe it may be a wider phenomenon in machine learning tasks but primarily noticeable only in benchmarks that limit evaluations on test splits.
arXiv Detail & Related papers (2022-05-18T23:52:21Z)
- Lifted Model Checking for Relational MDPs [12.574454799055026]
pCTL-REBEL is a lifted model checking approach for verifying pCTL properties on relational MDPs.
We show that the pCTL model checking approach is decidable for relational MDPs even for possibly infinite domains.
arXiv Detail & Related papers (2021-06-22T13:12:36Z)
- DirectDebug: Automated Testing and Debugging of Feature Models [55.41644538483948]
Variability models (e.g., feature models) are a common way to represent the variabilities and commonalities of software artifacts.
Complex and often large-scale feature models can become faulty, i.e., they no longer represent the expected variability properties of the underlying software artifact.
arXiv Detail & Related papers (2021-02-11T11:22:20Z)
- Interpretable Entity Representations through Large-Scale Typing [61.4277527871572]
We present an approach to creating entity representations that are human readable and achieve high performance out of the box.
Our representations are vectors whose values correspond to posterior probabilities over fine-grained entity types.
We show that it is possible to reduce the size of our type set in a learning-based way for particular domains.
arXiv Detail & Related papers (2020-04-30T23:58:03Z)
- An Automated Approach for the Discovery of Interoperability [0.0]
We show that exchanging models in standard format does not guarantee the preservation of shape properties.
Our method could be extended to interoperability testing on CAD-to-CAE and/or CAD-to-CAM interactions.
arXiv Detail & Related papers (2020-01-26T06:07:43Z)