Using Sampling Strategy to Assist Consensus Sequence Analysis
- URL: http://arxiv.org/abs/2008.08300v2
- Date: Thu, 1 Jul 2021 23:03:06 GMT
- Title: Using Sampling Strategy to Assist Consensus Sequence Analysis
- Authors: Zhichao Xu, Shuhong Chen
- Abstract summary: We propose a novel sampling strategy to determine the number of traces necessary to produce a representative consensus sequence.
We show how to estimate the difference between the predefined Expert Model and the real processes carried out.
- Score: 3.983901161231557
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Consensus Sequences of event logs are often used in process mining to quickly
grasp the core sequence of events to be performed in a process, or to represent
the backbone of the process for doing other analyses. However, it is still not
clear how many traces are enough to properly represent the underlying process.
In this paper, we propose a novel sampling strategy to determine the number of
traces necessary to produce a representative consensus sequence. We show how to
estimate the difference between the predefined Expert Model and the real
processes carried out. This difference level can be used as a reference for
domain experts to adjust the Expert Model. In addition, we apply this strategy
to several real-world workflow activity datasets as a case study. We show a
sample curve fitting task to help readers better understand our proposed
methodology.
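The abstract does not spell out the sampling procedure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a position-wise majority vote as the consensus sequence, Levenshtein distance as the difference measure, and a hypothetical `tolerance` threshold; all function names are illustrative. It draws random samples of growing size and reports the smallest size whose consensus stays close to a reference sequence (for example, an Expert Model).

```python
import random
from collections import Counter
from statistics import median


def consensus(traces):
    """Position-wise majority vote, cut at the median trace length (a simplifying assumption)."""
    length = int(median(len(t) for t in traces))
    result = []
    for i in range(length):
        votes = Counter(t[i] for t in traces if len(t) > i)
        result.append(votes.most_common(1)[0][0])
    return result


def edit_distance(a, b):
    """Levenshtein distance between two sequences of activities."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def smallest_sufficient_sample(traces, reference, tolerance=0.5, repeats=30, seed=0):
    """Smallest sample size whose mean consensus-to-reference distance is within tolerance.

    Returns None if no sample size up to len(traces) meets the tolerance.
    """
    rng = random.Random(seed)
    for n in range(1, len(traces) + 1):
        total = 0
        for _ in range(repeats):
            sample = rng.sample(traces, n)  # sample n traces without replacement
            total += edit_distance(consensus(sample), reference)
        if total / repeats <= tolerance:
            return n
    return None
```

Recording the mean distance for each sample size, rather than stopping at the first size that passes, would give the points for a curve fitting task like the one the abstract mentions.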
Related papers
- Scalable Signature-Based Distribution Regression via Reference Sets [1.8980236415886387]
Path signatures are used to leverage the information encoded in paths via signature-based features.
Current state-of-the-art distribution regression (DR) solutions are memory intensive and incur a high cost.
This computational bottleneck limits the application to small sample sizes.
We present a methodology for addressing the above issues; resolving estimation uncertainties.
We also propose a pipeline that enables us to use DR for a wide variety of learning tasks.
arXiv Detail & Related papers (2024-10-11T18:58:28Z) - Mining a Minimal Set of Behavioral Patterns using Incremental Evaluation [3.16536213610547]
Existing approaches to behavioral pattern mining suffer from two limitations.
First, they show limited scalability as incremental computation is incorporated only in the generation of pattern candidates.
Second, process analysis based on mined patterns shows limited effectiveness due to an overwhelmingly large number of patterns obtained in practical application scenarios.
arXiv Detail & Related papers (2024-02-05T11:41:37Z) - Sample and Predict Your Latent: Modality-free Sequential Disentanglement
via Contrastive Estimation [2.7759072740347017]
We introduce a self-supervised sequential disentanglement framework based on contrastive estimation with no external signals.
In practice, we propose a unified, efficient, and easy-to-code sampling strategy for semantically similar and dissimilar views of the data.
Our method presents state-of-the-art results in comparison to existing techniques.
arXiv Detail & Related papers (2023-05-25T10:50:30Z) - Trace Encoding in Process Mining: a survey and benchmarking [0.34410212782758054]
Encoding methods are employed across several process mining tasks, including predictive process monitoring, anomalous case detection, and trace clustering.
Most papers choose existing encoding methods arbitrarily or employ a strategy based on a specific expert knowledge domain.
This work aims at providing a comprehensive survey on event log encoding by comparing 27 methods.
arXiv Detail & Related papers (2023-01-05T17:25:30Z) - FineDiving: A Fine-grained Dataset for Procedure-aware Action Quality
Assessment [93.09267863425492]
We argue that understanding both high-level semantics and internal temporal structures of actions in competitive sports videos is the key to making predictions accurate and interpretable.
We construct a new fine-grained dataset, called FineDiving, developed on diverse diving events with detailed annotations on action procedures.
arXiv Detail & Related papers (2022-04-07T17:59:32Z) - Beyond Farthest Point Sampling in Point-Wise Analysis [52.218037492342546]
We propose a novel data-driven sampler learning strategy for point-wise analysis tasks.
We learn sampling and downstream applications jointly.
Our experiments show that jointly learning the sampler and the task brings remarkable improvement over previous baseline methods.
arXiv Detail & Related papers (2021-07-09T08:08:44Z) - Conditional Meta-Learning of Linear Representations [57.90025697492041]
Standard meta-learning for representation learning aims to find a common representation to be shared across multiple tasks.
In this work we overcome this issue by inferring a conditioning function, mapping the tasks' side information into a representation tailored to the task at hand.
We propose a meta-algorithm capable of leveraging this advantage in practice.
arXiv Detail & Related papers (2021-03-30T12:02:14Z) - Contrastive learning of strong-mixing continuous-time stochastic processes [53.82893653745542]
Contrastive learning is a family of self-supervised methods where a model is trained to solve a classification task constructed from unlabeled data.
We show that a properly constructed contrastive learning task can be used to estimate the transition kernel for small-to-mid-range intervals in the diffusion case.
arXiv Detail & Related papers (2021-03-03T23:06:47Z) - Subtask Analysis of Process Data Through a Predictive Model [5.7668512557707166]
This paper develops a computationally efficient method for exploratory analysis of such process data.
The new approach segments a lengthy individual process into a sequence of short subprocesses to achieve complexity reduction.
We use the process data from PIAAC 2012 to demonstrate how exploratory analysis of process data can be done with the new approach.
arXiv Detail & Related papers (2020-08-29T21:11:01Z) - Efficiently Sampling Functions from Gaussian Process Posteriors [76.94808614373609]
We propose an easy-to-use and general-purpose approach for fast posterior sampling.
We demonstrate how decoupled sample paths accurately represent Gaussian process posteriors at a fraction of the usual cost.
arXiv Detail & Related papers (2020-02-21T14:03:16Z) - CONSAC: Robust Multi-Model Fitting by Conditional Sample Consensus [62.86856923633923]
We present a robust estimator for fitting multiple parametric models of the same form to noisy measurements.
In contrast to previous works, which resorted to hand-crafted search strategies for multiple model detection, we learn the search strategy from data.
For self-supervised learning of the search, we evaluate the proposed algorithm on multi-homography estimation and demonstrate an accuracy that is superior to state-of-the-art methods.
arXiv Detail & Related papers (2020-01-08T17:37:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.