Related papers: Trustworthy Distributed Certification of Program Execution

URL: http://arxiv.org/abs/2402.13792v1
Date: Wed, 21 Feb 2024 13:21:37 GMT
Title: Trustworthy Distributed Certification of Program Execution
Authors: Alex Wolf, Marco Eduardo Palma, Pasquale Salza, Harald C. Gall
Abstract summary: We propose an innovative approach that combines a prototype programming language called Mona with a certification protocol OCCP. Our protocol allows for certification of program segments in a distributed, immutable, and trustworthy system without the need for naive re-execution. Our findings demonstrate the efficiency of our approach in reducing the number of program executions compared to existing state-of-the-art methods.
Score: 2.208443815105053
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Verifying the execution of a program is complicated and often limited by the inability to validate the code's correctness. It is a crucial aspect of scientific research, where it is needed to ensure the reproducibility and validity of experimental results. Similarly, in customer software testing, it is difficult for customers to verify that their specific program version was tested or executed at all. Existing state-of-the-art solutions, such as hardware-based approaches, constraint solvers, and verifiable computation systems, do not provide definitive proof of execution, which hinders reliable testing and analysis of program results. In this paper, we propose an innovative approach that combines a prototype programming language called Mona with a certification protocol OCCP to enable the distributed and decentralized re-execution of program segments. Our protocol allows for certification of program segments in a distributed, immutable, and trustworthy system without the need for naive re-execution, resulting in significant improvements in terms of time and computational resources used. We also explore the use of blockchain technology to manage the protocol workflow following other approaches in this space. Our approach offers a promising solution to the challenges of program execution verification and opens up opportunities for further research and development in this area. Our findings demonstrate the efficiency of our approach in reducing the number of program executions compared to existing state-of-the-art methods, thus improving the efficiency of certifying program executions.

Related papers

AI-Driven Tools in Modern Software Quality Assurance: An Assessment of Benefits, Challenges, and Future Directions [0.0]
The research aims to assess the benefits, challenges, and prospects of integrating modern AI-oriented tools into quality assurance processes. The research demonstrates AI's transformative potential for QA but highlights the importance of a strategic approach to implementing these technologies.
arXiv Detail & Related papers (2025-06-19T20:22:47Z)
Training Language Models to Generate Quality Code with Program Analysis Feedback [66.0854002147103]
Code generation with large language models (LLMs) is increasingly adopted in production but fails to ensure code quality. We propose REAL, a reinforcement learning framework that incentivizes LLMs to generate production-quality code.
arXiv Detail & Related papers (2025-05-28T17:57:47Z)
Advancing Embodied Agent Security: From Safety Benchmarks to Input Moderation [52.83870601473094]
Embodied agents exhibit immense potential across a multitude of domains. Existing research predominantly concentrates on the security of general large language models. This paper introduces a novel input moderation framework, meticulously designed to safeguard embodied agents.
arXiv Detail & Related papers (2025-04-22T08:34:35Z)
Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute [61.00662702026523]
We propose a unified Test-Time Compute scaling framework that leverages increased inference-time instead of larger models. Our framework incorporates two complementary strategies: internal TTC and external TTC. We demonstrate our 32B model achieves a 46% issue resolution rate, surpassing significantly larger models such as DeepSeek R1 671B and OpenAI o1.
arXiv Detail & Related papers (2025-03-31T07:31:32Z)
NLP-Based .NET CLR Event Logs Analyzer [0.0]
We present a tool for analyzing .NET CLR event logs based on a novel method inspired by a Natural Language Processing (NLP) approach. We utilize a BERT-based architecture with an enhanced tokenization process customized to event logs. Our experiments demonstrate the efficacy of our approach in compressing event sequences, detecting recurring patterns, and identifying anomalies.
arXiv Detail & Related papers (2025-02-06T17:01:38Z)
Outcome-Refining Process Supervision for Code Generation [28.6680126802249]
Large Language Models struggle with complex programming tasks that require deep algorithmic reasoning. We propose Outcome-Refining Process Supervision, a novel paradigm that treats outcome refinement itself as the process to be supervised. Our approach achieves significant improvements across 5 models and 3 datasets: an average of 26.9% increase in correctness and 42.2% in efficiency.
arXiv Detail & Related papers (2024-12-19T17:59:42Z)
Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification [52.095460362197336]
Large language models (LLMs) struggle with consistent and accurate reasoning. LLMs are trained primarily on correct solutions, reducing their ability to detect and learn from errors. We propose a novel collaborative method integrating Chain-of-Thought (CoT) and Program-of-Thought (PoT) solutions for verification.
arXiv Detail & Related papers (2024-10-05T05:21:48Z)
Step-by-Step Reasoning for Math Problems via Twisted Sequential Monte Carlo [55.452453947359736]
We introduce a novel verification method based on Twisted Sequential Monte Carlo (TSMC). We apply TSMC to Large Language Models by estimating the expected future rewards at partial solutions. This approach results in a more straightforward training target that eliminates the need for step-wise human annotations.
arXiv Detail & Related papers (2024-10-02T18:17:54Z)
AGORA: Open More and Trust Less in Binary Verification Service [16.429846973928512]
We introduce a novel binary verification service, AGORA, scrupulously designed to overcome the challenge. Certain tasks can be delegated to untrusted entities, while the corresponding validators are securely housed within the trusted computing base. Through a novel blockchain-based bounty task manager, it also utilizes crowdsourcing to remove trust in theorem provers.
arXiv Detail & Related papers (2024-07-21T05:29:22Z)
HUWSOD: Holistic Self-training for Unified Weakly Supervised Object Detection [66.42229859018775]
We introduce a unified, high-capacity weakly supervised object detection (WSOD) network called HUWSOD. HUWSOD incorporates a self-supervised proposal generator and an autoencoder proposal generator with a multi-rate re-supervised pyramid to replace traditional object proposals. Our findings indicate that random boxes, although significantly different from well-designed offline object proposals, are effective for WSOD training.
arXiv Detail & Related papers (2024-06-27T17:59:49Z)
Finding Software Vulnerabilities in Open-Source C Projects via Bounded Model Checking [2.9129603096077332]
We advocate that bounded model-checking techniques can efficiently detect vulnerabilities in general software systems. We have developed and evaluated a methodology to verify large software systems using a state-of-the-art bounded model checker.
arXiv Detail & Related papers (2023-11-09T11:25:24Z)
Using Machine Learning To Identify Software Weaknesses From Software Requirement Specifications [49.1574468325115]
This research focuses on finding an efficient machine learning algorithm to identify software weaknesses from requirement specifications. Keywords extracted using latent semantic analysis help map the CWE categories to PROMISE_exp. Naive Bayes, support vector machine (SVM), decision trees, neural network, and convolutional neural network (CNN) algorithms were tested.
arXiv Detail & Related papers (2023-08-10T13:19:10Z)
Benchopt: Reproducible, efficient and collaborative optimization benchmarks [67.29240500171532]
Benchopt is a framework to automate, reproduce and publish optimization benchmarks in machine learning. Benchopt simplifies benchmarking for the community by providing an off-the-shelf tool for running, sharing and extending experiments.
arXiv Detail & Related papers (2022-06-27T16:19:24Z)
Learning from Self-Sampled Correct and Partially-Correct Programs [96.66452896657991]
We propose to let the model perform sampling during training and learn from both self-sampled fully-correct programs and partially-correct programs. We show that our use of self-sampled correct and partially-correct programs can benefit learning and help guide the sampling process. Our proposed method improves the pass@k performance by 3.1% to 12.3% compared to learning from a single reference program with MLE.
arXiv Detail & Related papers (2022-05-28T03:31:07Z)
Part-X: A Family of Stochastic Algorithms for Search-Based Test Generation with Probabilistic Guarantees [3.9119084077397863]
Falsification has proven to be a practical and effective method for discovering erroneous behaviors in Cyber-Physical Systems. Despite the constant improvements in the performance and applicability of falsification methods, they all share a common characteristic. They are best-effort methods which do not provide any guarantees on the absence of erroneous behaviors (falsifiers) when the testing budget is exhausted.
arXiv Detail & Related papers (2021-10-20T19:05:00Z)
Test case prioritization using test case diversification and fault-proneness estimations [0.0]
We propose an approach for TCP that takes into account test case coverage data, bug history, and test case diversification. The diversification of test cases is preserved by incorporating fault-proneness on a clustering-based approach scheme. The experiments show that the proposed methods are superior to coverage-based TCP methods.
arXiv Detail & Related papers (2021-06-19T15:55:24Z)
