US-Rule: Discovering Utility-driven Sequential Rules
- URL: http://arxiv.org/abs/2111.15020v1
- Date: Mon, 29 Nov 2021 23:38:28 GMT
- Title: US-Rule: Discovering Utility-driven Sequential Rules
- Authors: Gengsen Huang, Wensheng Gan, Jian Weng, and Philip S. Yu
- Abstract summary: We propose a faster algorithm, called US-Rule, to efficiently mine high-utility sequential rules.
Four tighter upper bounds (LEEU, REEU, LERSU, RERSU) and their corresponding pruning strategies are proposed.
US-Rule can achieve better performance in terms of execution time, memory consumption and scalability.
- Score: 52.68017415747925
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Utility-driven mining is an important task in data science and has many
applications in real life. High utility sequential pattern mining (HUSPM) is
one kind of utility-driven mining. HUSPM aims to discover all sequential
patterns with high utility. However, existing HUSPM algorithms cannot
provide an accurate probability for scenarios such as prediction or
recommendation. High-utility sequential rule mining (HUSRM) was proposed to
discover all sequential rules with high utility and high confidence. However,
only one algorithm has been proposed for HUSRM, and it is not efficient enough. In this
paper, we propose a faster algorithm, called US-Rule, to efficiently mine
high-utility sequential rules. It uses the rule estimated utility co-occurrence
pruning strategy (REUCP) to avoid meaningless computation. To improve
efficiency on dense and long sequence datasets, four tighter upper bounds
(LEEU, REEU, LERSU, RERSU) and their corresponding pruning strategies (LEEUP,
REEUP, LERSUP, RERSUP) are proposed. In addition, US-Rule introduces the rule estimated
utility recomputing pruning strategy (REURP) to deal with sparse datasets.
Finally, extensive experiments on different datasets against the
state-of-the-art algorithm demonstrate that US-Rule can achieve better
performance in terms of execution time, memory consumption and scalability.
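As a rough illustration of what "high utility and high confidence" means for a sequential rule, the sketch below computes both measures for one rule on a toy database. It is not the US-Rule algorithm and uses none of its upper bounds or pruning strategies. The definitions are assumptions taken from the usual HUSRM setting: a rule X -> Y occurs in a sequence when all items of X appear before all items of Y, each item appears at most once per sequence, the rule's utility is the summed utility of its items over the sequences where it occurs, and confidence is the number of sequences containing the rule divided by the number containing X. The database `DB` and the function names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Itemset = Dict[str, int]      # item -> utility of that item in this itemset
Sequence = List[Itemset]      # ordered list of itemsets


def rule_occurs(seq: Sequence, antecedent: FrozenSet[str],
                consequent: FrozenSet[str]) -> bool:
    """True if all antecedent items appear before all consequent items."""
    seen = set()
    # Stop one itemset early so that something always remains after the split.
    for k in range(len(seq) - 1):
        seen.update(seq[k])
        if antecedent <= seen:
            # Earliest split point leaves the largest possible remainder.
            rest = set()
            for itemset in seq[k + 1:]:
                rest.update(itemset)
            return consequent <= rest
    return False


def rule_utility_and_confidence(db: List[Sequence],
                                antecedent: FrozenSet[str],
                                consequent: FrozenSet[str]) -> Tuple[int, float]:
    """Brute-force utility and confidence of antecedent -> consequent."""
    items = antecedent | consequent
    utility = 0
    rule_count = 0
    antecedent_count = 0
    for seq in db:
        present = set()
        for itemset in seq:
            present.update(itemset)
        if antecedent <= present:
            antecedent_count += 1
        if rule_occurs(seq, antecedent, consequent):
            rule_count += 1
            utility += sum(u for itemset in seq
                           for item, u in itemset.items() if item in items)
    confidence = rule_count / antecedent_count if antecedent_count else 0.0
    return utility, confidence


# Hypothetical toy database with four sequences.
DB: List[Sequence] = [
    [{"a": 2}, {"b": 3, "c": 1}, {"d": 4}],
    [{"a": 1}, {"d": 5}],
    [{"b": 2}, {"a": 3}],
    [{"a": 2, "c": 2}, {"b": 1}],
]

utility, confidence = rule_utility_and_confidence(
    DB, frozenset({"a"}), frozenset({"b"}))
print(utility, confidence)
```

A rule would be reported only if its utility and confidence both reach user-chosen minimum thresholds. Evaluating every candidate rule this way is exponential in the number of items, which is why algorithms in this area rely on upper bounds to prune the search.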
Related papers
- HUSP-SP: Faster Utility Mining on Sequence Data [48.0426095077918]
High-utility sequential pattern mining (HUSPM) has emerged as an important topic due to its wide application and considerable popularity.
We design a compact structure called sequence projection (seqPro) and propose an efficient algorithm, namely discovering high-utility sequential patterns with the seqPro structure (HUSP-SP).
Experimental results on both synthetic and real-life datasets show that HUSP-SP can significantly outperform the state-of-the-art algorithms in terms of running time, memory usage, search space pruning efficiency, and scalability.
arXiv Detail & Related papers (2022-12-29T10:56:17Z)
- Towards Sequence Utility Maximization under Utility Occupancy Measure [53.234101208024335]
Although utility is a flexible criterion for each pattern in a database, it is a more absolute criterion because it neglects utility sharing.
We first define utility occupancy on sequence data and raise the problem of High Utility-Occupancy Sequential Pattern Mining.
An algorithm called Sequence Utility Maximization with Utility occupancy measure (SUMU) is proposed.
arXiv Detail & Related papers (2022-12-20T17:28:53Z)
- Towards Correlated Sequential Rules [4.743965372344134]
High-utility sequential rule mining (HUSRM) is designed to explore the confidence or probability of predicting the occurrence of consequence sequential patterns.
The existing algorithm, known as HUSRM, is limited to extracting all eligible rules while neglecting the correlation between the generated sequential rules.
We propose a novel algorithm called correlated high-utility sequential rule miner (CoUSR) to integrate the concept of correlation into HUSRM.
arXiv Detail & Related papers (2022-10-27T17:27:23Z)
- Totally-ordered Sequential Rules for Utility Maximization [49.57003933142011]
We propose two novel algorithms, called TotalSR and TotalSR+, which aim to identify all high utility totally-ordered sequential rules (HTSRs).
TotalSR creates a utility table that can efficiently calculate antecedent support and a utility prefix sum list that can compute the remaining utility in O(1) time for a sequence.
Extensive experiments on both real and synthetic datasets demonstrate that TotalSR is significantly more efficient than algorithms with fewer pruning strategies.
arXiv Detail & Related papers (2022-09-27T16:17:58Z)
- A Generic Algorithm for Top-K On-Shelf Utility Mining [47.729883172648876]
On-shelf utility mining (OSUM) is an emerging research direction in data mining.
It aims to discover itemsets that have high relative utility in their selling time period.
It is hard to define a minimum threshold minutil for mining the right amount of on-shelf high utility itemsets.
We propose a generic algorithm named TOIT for mining Top-k On-shelf hIgh-utility paTterns.
arXiv Detail & Related papers (2022-08-27T03:08:00Z)
- Towards Target Sequential Rules [52.4562332499155]
We propose an efficient algorithm, called targeted sequential rule mining (TaSRM).
It is shown that the novel algorithm TaSRM and its variants can achieve better experimental performance compared to the existing baseline algorithm.
arXiv Detail & Related papers (2022-06-09T18:59:54Z)
- TaSPM: Targeted Sequential Pattern Mining [53.234101208024335]
We propose a generic framework, namely TaSPM, based on the fast CM-SPAM algorithm.
We also propose several pruning strategies to reduce meaningless operations in mining processes.
Experiments show that the novel targeted mining algorithm TaSPM can achieve faster running time and less memory consumption.
arXiv Detail & Related papers (2022-02-26T17:49:47Z)
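
The TotalSR entry in the list above mentions a utility prefix sum list that yields the remaining utility of a sequence in O(1) time. The sketch below shows the generic prefix-sum idea only. It assumes a sequence is flattened into a list of per-position utilities, and the function names are hypothetical; the actual TotalSR data structure may differ.

```python
from itertools import accumulate
from typing import List


def build_prefix_sums(utilities: List[int]) -> List[int]:
    """prefix[i] holds the total utility of the first i positions."""
    return [0] + list(accumulate(utilities))


def remaining_utility(prefix: List[int], i: int) -> int:
    """Utility of all positions strictly after 0-based index i, in O(1)."""
    return prefix[-1] - prefix[i + 1]


# Hypothetical per-position utilities of one flattened sequence.
utilities = [2, 3, 1, 4]
prefix = build_prefix_sums(utilities)
print(remaining_utility(prefix, 1))
```

Building the list costs one linear pass per sequence, after which every remaining-utility query is a single subtraction instead of a scan over the rest of the sequence.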
This list is automatically generated from the titles and abstracts of the papers in this site.