Developing a High-Performance Process Mining Library with Java and
Python Bindings in Rust
- URL: http://arxiv.org/abs/2401.14149v1
- Date: Thu, 25 Jan 2024 12:59:13 GMT
- Title: Developing a High-Performance Process Mining Library with Java and
Python Bindings in Rust
- Authors: Aaron Küsters, Wil M.P. van der Aalst
- Abstract summary: Rust emerged as a highly performant, compiled programming language with inherent memory safety.
By facilitating interoperability, our methodology enables researchers or industry to develop novel algorithms in Rust once and make them accessible to the entire community.
- Score: 0.19036571490366497
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The most commonly used open-source process mining software tools today are
ProM and PM4Py, written in Java and Python, respectively. Such high-level,
often interpreted, programming languages trade off performance for memory
safety and ease of use. In contrast, traditional compiled languages, like C or
C++, can achieve top performance but often suffer from instability related to
unsafe memory management. Lately, Rust emerged as a highly performant, compiled
programming language with inherent memory safety. In this paper, we describe
our approach to developing a shared process mining library in Rust with
bindings to both Java and Python, allowing full integration into the existing
ecosystems, like ProM and PM4Py. By facilitating interoperability, our
methodology enables researchers or industry to develop novel algorithms in Rust
once and make them accessible to the entire community while also achieving
superior performance.
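As a concrete illustration of the binding approach described in the abstract, the sketch below exposes one shared Rust function to Python via the PyO3 crate and to Java via the jni crate. All names (`rust_bridge`, `count_events`, `org.example.RustBridge`) are hypothetical, the PyO3 module signature assumes a 0.20-era API, and the paper's actual interfaces are not reproduced here.

```rust
// Hypothetical sketch: one core Rust function, two thin binding layers.
// Requires `pyo3` (~0.20) and `jni` as dependencies and a `cdylib` crate type.
use jni::objects::JClass;
use jni::sys::jlong;
use jni::JNIEnv;
use pyo3::prelude::*;

/// Shared core logic, written once in plain Rust.
fn total_event_count(traces: &[Vec<String>]) -> usize {
    traces.iter().map(|trace| trace.len()).sum()
}

/// Python binding: importable as `rust_bridge.count_events` after a maturin build.
#[pyfunction]
fn count_events(traces: Vec<Vec<String>>) -> usize {
    total_event_count(&traces)
}

#[pymodule]
fn rust_bridge(_py: Python<'_>, m: &PyModule) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(count_events, m)?)?;
    Ok(())
}

/// Java binding: the native method `org.example.RustBridge.countEvents`.
/// Only primitive arguments are shown; real code would marshal the event log.
#[no_mangle]
pub extern "system" fn Java_org_example_RustBridge_countEvents(
    _env: JNIEnv,
    _class: JClass,
    num_events: jlong,
) -> jlong {
    num_events
}
```

The point of the pattern is the one the abstract makes: the algorithm lives once in plain Rust, and each host language only pays for a thin marshalling layer.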
Related papers
- CRUXEval-X: A Benchmark for Multilingual Code Reasoning, Understanding and Execution [50.7413285637879]
The CRUXEVAL-X code reasoning benchmark contains 19 programming languages.
It comprises at least 600 subjects for each language, along with 19K content-consistent tests in total.
Even a model trained solely on Python can achieve at most 34.4% Pass@1 in other languages.
arXiv Detail & Related papers (2024-08-23T11:43:00Z) - VERT: Verified Equivalent Rust Transpilation with Large Language Models as Few-Shot Learners [6.824327908701066]
Rust is a programming language that combines memory safety and low-level control, providing C-like performance.
Existing work falls into two categories: rule-based and large language model (LLM)-based.
We present VERT, a tool that can produce readable Rust transpilations with formal guarantees of correctness.
arXiv Detail & Related papers (2024-04-29T16:45:03Z) - A Study of Undefined Behavior Across Foreign Function Boundaries in Rust Libraries [2.359557447960552]
Rust is frequently used to interoperate with other languages.
Miri is the only dynamic analysis tool capable of validating applications against Rust's aliasing models.
Miri does not support foreign functions, indicating that there may be a critical correctness gap at the heart of the Rust ecosystem (see the illustrative sketch after this list).
arXiv Detail & Related papers (2024-04-17T18:12:05Z) - Towards a Transpiler for C/C++ to Safer Rust [0.10993800728351737]
Rust is a programming language developed by Mozilla that focuses on performance and safety.
Converting an existing C++ code base to Rust is also attracting growing attention.
arXiv Detail & Related papers (2024-01-16T10:35:59Z) - LILO: Learning Interpretable Libraries by Compressing and Documenting Code [71.55208585024198]
We introduce LILO, a neurosymbolic framework that iteratively synthesizes, compresses, and documents code.
LILO combines LLM-guided program synthesis with recent algorithmic advances in automatic refactoring from Stitch.
We find that AutoDoc, LILO's auto-documentation procedure, boosts performance by helping the synthesizer interpret and deploy learned abstractions.
arXiv Detail & Related papers (2023-10-30T17:55:02Z) - Fast Summary-based Whole-program Analysis to Identify Unsafe Memory Accesses in Rust [23.0568924498396]
Rust is one of the most promising systems programming languages to solve the memory safety issues that have plagued low-level software for over forty years.
Unsafe Rust code and directly-linked unsafe foreign libraries may not only introduce memory safety violations themselves but also compromise the entire program, as they run in the same monolithic address space as safe Rust code.
We have prototyped a whole-program analysis for identifying both unsafe heap allocations and memory accesses to those unsafe heap objects.
arXiv Detail & Related papers (2023-10-16T11:34:21Z) - Fixing Rust Compilation Errors using LLMs [2.1781086368581932]
The Rust programming language has established itself as a viable choice for low-level systems programming, ahead of traditional, unsafe alternatives like C/C++.
This paper presents a tool called RustAssistant that leverages the emergent capabilities of Large Language Models (LLMs) to automatically suggest fixes for Rust compilation errors.
RustAssistant is able to achieve an impressive peak accuracy of roughly 74% on real-world compilation errors in popular open-source Rust repositories.
arXiv Detail & Related papers (2023-08-09T18:30:27Z) - InterCode: Standardizing and Benchmarking Interactive Coding with
Execution Feedback [50.725076393314964]
We introduce InterCode, a lightweight, flexible, and easy-to-use framework of interactive coding as a standard reinforcement learning environment.
Our framework is language and platform agnostic and uses self-contained Docker environments to provide safe and reproducible execution.
We demonstrate InterCode's viability as a testbed by evaluating multiple state-of-the-art LLMs configured with different prompting strategies.
arXiv Detail & Related papers (2023-06-26T17:59:50Z) - LongCoder: A Long-Range Pre-trained Language Model for Code Completion [56.813974784131624]
LongCoder employs a sliding window mechanism for self-attention and introduces two types of globally accessible tokens.
Bridge tokens are inserted throughout the input sequence to aggregate local information and facilitate global interaction.
Memory tokens are included to highlight important statements that may be invoked later and need to be memorized.
arXiv Detail & Related papers (2023-06-26T17:59:24Z) - A Static Evaluation of Code Completion by Large Language Models [65.18008807383816]
Execution-based benchmarks have been proposed to evaluate functional correctness of model-generated code on simple programming problems.
However, static analysis tools such as linters, which can detect errors without running the program, have not been well explored for evaluating code generation models.
We propose a static evaluation framework to quantify static errors in Python code completions, by leveraging Abstract Syntax Trees.
arXiv Detail & Related papers (2023-06-05T19:23:34Z) - ReACC: A Retrieval-Augmented Code Completion Framework [53.49707123661763]
We propose a retrieval-augmented code completion framework that leverages both lexical copying and retrieval of semantically similar code.
We evaluate our approach in the code completion task in Python and Java programming languages, achieving a state-of-the-art performance on CodeXGLUE benchmark.
arXiv Detail & Related papers (2022-03-15T08:25:08Z)
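The foreign-function study listed above concerns boundaries like the one sketched below (an illustrative example, not code from that paper): Miri can interpret the safe Rust wrapper, but foreign functions are, in general, outside what it can execute, which is the correctness gap the paper highlights. The example assumes a platform where the C runtime (providing `strlen`) is linked by default.

```rust
use std::ffi::CString;
use std::os::raw::c_char;

extern "C" {
    // Implemented in C; Rust must trust this declared signature.
    fn strlen(s: *const c_char) -> usize;
}

fn c_string_length(text: &str) -> usize {
    let c_text = CString::new(text).expect("no interior NUL bytes");
    // SAFETY: `c_text` is a valid, NUL-terminated C string for the duration of the call.
    unsafe { strlen(c_text.as_ptr()) }
}

fn main() {
    println!("{}", c_string_length("process mining")); // prints 14
}
```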
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.