FORLAPS: An Innovative Data-Driven Reinforcement Learning Approach for Prescriptive Process Monitoring
- URL: http://arxiv.org/abs/2501.10543v1
- Date: Fri, 17 Jan 2025 20:31:35 GMT
- Title: FORLAPS: An Innovative Data-Driven Reinforcement Learning Approach for Prescriptive Process Monitoring
- Authors: Mostafa Abbasi, Maziyar Khadivi, Maryam Ahang, Patricia Lasserre, Yves Lucet, Homayoun Najjaran
- Abstract summary: This study introduces an innovative evaluation model, benchmarking its performance against earlier works using nine publicly available datasets.
The proposed model, FORLAPS, demonstrated exceptional performance, outperforming existing state-of-the-art approaches in suggesting optimal policies or predicting the best next activity within a process trace.
- Score: 3.4437362489150254
- Abstract: We present a novel 5-step framework called Fine-Tuned Offline Reinforcement Learning Augmented Process Sequence Optimization (FORLAPS), which aims to identify optimal execution paths in business processes using reinforcement learning. We applied this approach to real-life event logs from our case study, an energy regulator in Canada, as well as to other real-life event logs, demonstrating the feasibility of the proposed method. Additionally, to compare FORLAPS with existing models (Permutation Feature Importance and a multi-task LSTM-based model), we evaluated its effectiveness in terms of resource savings and process time span reduction. The experimental results on the real-life event logs validate that FORLAPS achieves a 31% saving in resource time spent and a 23% reduction in process time span. Using an innovative data augmentation technique, we propose a fine-tuned reinforcement learning approach that automatically fine-tunes the model by selectively increasing the average estimated Q-value in the sampled batches; the results show a 44% performance improvement compared to the pre-trained model. This study also introduces an innovative evaluation model, benchmarking its performance against earlier works on nine publicly available datasets. Robustness is assessed through experiments using the Damerau-Levenshtein distance as the primary metric. In addition, we discuss the suitability of the datasets, taking their inherent properties into account, for evaluating the performance of different models. The proposed model, FORLAPS, demonstrated exceptional performance, outperforming existing state-of-the-art approaches in suggesting optimal policies and predicting the best next activity within a process trace.
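The abstract names the Damerau-Levenshtein distance as the primary metric for judging how closely a suggested or predicted activity sequence matches a ground-truth trace. Below is a minimal sketch of that comparison, not the authors' implementation; the normalization by the longer trace and the example activity labels are assumptions.

```python
def damerau_levenshtein(a, b):
    """Optimal-string-alignment variant: allowed edits are insertion,
    deletion, substitution, and transposition of adjacent activities."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[m][n]


def trace_similarity(predicted, actual):
    """Normalized similarity in [0, 1]; 1.0 means identical traces."""
    if not predicted and not actual:
        return 1.0
    return 1.0 - damerau_levenshtein(predicted, actual) / max(len(predicted), len(actual))


# Hypothetical activity labels, for illustration only:
print(trace_similarity(["register", "review", "approve", "notify"],
                       ["register", "approve", "review", "notify"]))  # 0.75 (one transposition)
```

In this sketch a score of 1.0 means the two traces are identical, and each insertion, deletion, substitution, or adjacent transposition of an activity lowers the score.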
Related papers
- Dynamic Noise Preference Optimization for LLM Self-Improvement via Synthetic Data [51.62162460809116]
We introduce Dynamic Noise Preference Optimization (DNPO) to ensure consistent improvements across iterations.
In experiments with Zephyr-7B, DNPO consistently outperforms existing methods, showing an average performance boost of 2.6%.
DNPO shows a significant improvement in model-generated data quality, with a 29.4% win-loss rate gap compared to the baseline in GPT-4 evaluations.
arXiv Detail & Related papers (2025-02-08T01:20:09Z)
- The Surprising Effectiveness of Test-Time Training for Abstract Reasoning [64.36534512742736]
We investigate the effectiveness of test-time training (TTT) as a mechanism for improving models' reasoning capabilities.
TTT significantly improves performance on ARC tasks, achieving up to 6x improvement in accuracy compared to base fine-tuned models.
Our findings suggest that explicit symbolic search is not the only path to improved abstract reasoning in neural language models.
arXiv Detail & Related papers (2024-11-11T18:59:45Z)
- Efficient Continual Pre-training by Mitigating the Stability Gap [68.49269649759005]
We study the behavior of Large Language Models (LLMs) during continual pre-training.
We propose three effective strategies to enhance LLM performance within a fixed compute budget.
Our strategies improve the average medical task performance of the OpenLlama-3B model from 36.2% to 40.7% with only 40% of the original training budget.
arXiv Detail & Related papers (2024-06-21T02:28:37Z)
- Parameter-Efficient Active Learning for Foundational models [7.799711162530711]
Foundational vision transformer models have shown impressive few-shot performance on many vision tasks.
This research presents a novel investigation into the application of parameter-efficient fine-tuning methods within an active learning (AL) framework.
arXiv Detail & Related papers (2024-06-13T16:30:32Z)
- Refining 3D Point Cloud Normal Estimation via Sample Selection [13.207964615561261]
We introduce a fundamental framework for normal estimation, enhancing existing models through the incorporation of global information and various constraint mechanisms.
We also utilize existing orientation methods to correct estimated non-oriented normals, achieving state-of-the-art performance in both oriented and non-oriented tasks.
arXiv Detail & Related papers (2024-05-20T02:06:10Z)
- Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning [55.96599486604344]
We introduce an approach aimed at enhancing the reasoning capabilities of Large Language Models (LLMs) through an iterative preference learning process.
We use Monte Carlo Tree Search (MCTS) to iteratively collect preference data, utilizing its look-ahead ability to break down instance-level rewards into more granular step-level signals.
The proposed algorithm employs Direct Preference Optimization (DPO) to update the LLM policy using this newly generated step-level preference data (a minimal sketch of the DPO objective appears below).
arXiv Detail & Related papers (2024-05-01T11:10:24Z)
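The entry above pairs MCTS-collected step-level preferences with DPO. As a hedged illustration only (not that paper's code), the standard DPO objective over a batch of preference pairs could be written as follows; the tensor names and the per-step summed log-probability inputs are assumptions.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp: torch.Tensor,
             policy_rejected_logp: torch.Tensor,
             ref_chosen_logp: torch.Tensor,
             ref_rejected_logp: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Standard DPO loss over a batch of (chosen, rejected) preference pairs.
    Inputs are per-step (or per-sequence) sums of token log-probabilities
    under the current policy and the frozen reference model."""
    chosen_margin = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_margin = beta * (policy_rejected_logp - ref_rejected_logp)
    # Push the policy to rank the preferred step above the dispreferred one.
    return -F.logsigmoid(chosen_margin - rejected_margin).mean()
```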
- Learning Off-policy with Model-based Intrinsic Motivation For Active Online Exploration [15.463313629574111]
This paper investigates how to achieve sample-efficient exploration in continuous control tasks.
We introduce an RL algorithm that incorporates a predictive model and off-policy learning elements.
We derive an intrinsic reward without incurring parameter overhead.
arXiv Detail & Related papers (2024-03-31T11:39:11Z)
- L3 Ensembles: Lifelong Learning Approach for Ensemble of Foundational Language Models [15.726224465017596]
We propose an approach that focuses on extracting meaningful representations from unseen data and constructing a structured knowledge base.
We conducted experiments on various NLP tasks to validate its effectiveness, including benchmarks like GLUE and SuperGLUE.
The proposed L3 ensemble method increases the model accuracy by 4% to 36% compared to the fine-tuned FLM.
arXiv Detail & Related papers (2023-11-11T06:59:50Z)
- Learning Objective-Specific Active Learning Strategies with Attentive Neural Processes [72.75421975804132]
Learning Active Learning (LAL) suggests learning the active learning strategy itself, allowing it to adapt to the given setting.
We propose a novel LAL method for classification that exploits symmetry and independence properties of the active learning problem.
Our approach is based on learning from a myopic oracle, which gives our model the ability to adapt to non-standard objectives.
arXiv Detail & Related papers (2023-09-11T14:16:37Z)
- Posterior Sampling for Deep Reinforcement Learning [0.0]
This paper introduces Posterior Sampling for Deep Reinforcement Learning (PSDRL), the first truly scalable approximation of Posterior Sampling for Reinforcement Learning.
Experiments on the Atari benchmark show that PSDRL significantly outperforms previous state-of-the-art attempts at scaling up posterior sampling.
arXiv Detail & Related papers (2023-04-30T13:23:50Z)
- On Effective Scheduling of Model-based Reinforcement Learning [53.027698625496015]
In this paper, we first theoretically analyze the role of real data in policy training, which suggests that gradually increasing the ratio of real data yields better performance.
Based on this, we propose a framework named AutoMBPO to automatically schedule the real data ratio (a minimal schedule sketch appears below).
arXiv Detail & Related papers (2021-11-16T15:24:59Z)
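The AutoMBPO entry above reports that gradually increasing the ratio of real to model-generated data improves policy training. As a hedged illustration only (AutoMBPO learns this schedule automatically; the fixed linear ramp and the start/end values here are arbitrary assumptions), such a schedule might look like:

```python
def real_data_ratio(epoch: int, total_epochs: int,
                    start: float = 0.1, end: float = 0.9) -> float:
    """Linearly ramp the fraction of real (vs. model-generated) transitions
    used in each policy-training batch as training progresses."""
    progress = epoch / max(total_epochs - 1, 1)
    return start + (end - start) * progress

# Example: ratios over a 5-epoch run -> 0.1, 0.3, 0.5, 0.7, 0.9
print([round(real_data_ratio(e, 5), 2) for e in range(5)])
```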
This list is automatically generated from the titles and abstracts of the papers on this site.