Towards Sequence Utility Maximization under Utility Occupancy Measure
- URL: http://arxiv.org/abs/2212.10452v1
- Date: Tue, 20 Dec 2022 17:28:53 GMT
- Title: Towards Sequence Utility Maximization under Utility Occupancy Measure
- Authors: Gengsen Huang, Wensheng Gan, and Philip S. Yu
- Abstract summary: Although utility is a flexible criterion for each pattern in a database, it is an absolute measure because it neglects utility sharing.
We first define utility occupancy on sequence data and raise the problem of High Utility-Occupancy Sequential Pattern Mining.
An algorithm called Sequence Utility Maximization with Utility occupancy measure (SUMU) is proposed.
- Score: 53.234101208024335
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The discovery of utility-driven patterns is a useful and difficult research
topic. It can extract significant and interesting information from specific and
varied databases, increasing the value of the services provided. In practice,
the measure of utility is often used to demonstrate the importance, profit, or
risk of an object or a pattern. Although utility is a flexible criterion for
each pattern in a database, it is an absolute measure because it neglects
utility sharing. As a result, the derived patterns capture only partial
and local knowledge from a database. Utility occupancy is a recently proposed
model that addresses the problem of patterns with high utility but low occupancy.
However, existing studies concentrate on itemsets, which do not reveal the
temporal relationship of object occurrences. Therefore, this paper works towards
sequence utility maximization. We first define utility occupancy on sequence
data and raise the problem of High Utility-Occupancy Sequential Pattern Mining
(HUOSPM). Three dimensions, including frequency, utility, and occupancy, are
comprehensively evaluated in HUOSPM. An algorithm called Sequence Utility
Maximization with Utility occupancy measure (SUMU) is proposed. Furthermore,
two data structures for storing related information about a pattern,
Utility-Occupancy-List-Chain (UOL-Chain) and Utility-Occupancy-Table (UO-Table)
with six associated upper bounds, are designed to improve efficiency. Empirical
experiments are carried out to evaluate the novel algorithm's efficiency and
effectiveness. The influence of different upper bounds and pruning strategies
is analyzed and discussed. The comprehensive results suggest that our
algorithm is intelligent and effective.
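The three dimensions named in the abstract (frequency, utility, and occupancy) can be illustrated with a small sketch. The abstract does not give the formal definitions, so the code below rests on assumptions borrowed from the itemset-based utility-occupancy setting: the occupancy of a pattern in one sequence is taken as the utility of its best occurrence divided by the total utility of that sequence, and the overall occupancy is the mean over supporting sequences. The names QSequence, best_occurrence_utility, and evaluate are hypothetical, and this is not the SUMU algorithm or its UOL-Chain and UO-Table structures.

```python
from typing import Dict, List, Optional, Tuple

# Toy representation (an assumption for illustration): a sequence is a list of
# itemsets, and each itemset maps an item name to its utility in that itemset.
QSequence = List[Dict[str, float]]


def sequence_utility(seq: QSequence) -> float:
    """Total utility of all items in the sequence."""
    return sum(sum(itemset.values()) for itemset in seq)


def best_occurrence_utility(pattern: List[str], seq: QSequence) -> Optional[float]:
    """Highest utility among occurrences of `pattern` in `seq`, or None if absent.

    The pattern is a list of single items that must appear in strictly
    increasing itemset positions (a simplification of general sequences).
    """
    # best[j] = best utility found so far for matching the first j pattern items.
    best: List[Optional[float]] = [0.0] + [None] * len(pattern)
    for itemset in seq:
        # Go right to left so one itemset matches at most one pattern item.
        for j in range(len(pattern), 0, -1):
            item = pattern[j - 1]
            prev = best[j - 1]
            if item in itemset and prev is not None:
                candidate = prev + itemset[item]
                if best[j] is None or candidate > best[j]:
                    best[j] = candidate
    return best[len(pattern)]


def evaluate(pattern: List[str], database: List[QSequence]) -> Tuple[int, float, float]:
    """Return (support, utility, occupancy) under the assumptions stated above.

    Assumes every supporting sequence has a positive total utility.
    """
    support = 0
    total_utility = 0.0
    occupancy_sum = 0.0
    for seq in database:
        utility = best_occurrence_utility(pattern, seq)
        if utility is None:
            continue  # the sequence does not contain the pattern
        support += 1
        total_utility += utility
        occupancy_sum += utility / sequence_utility(seq)
    occupancy = occupancy_sum / support if support else 0.0
    return support, total_utility, occupancy


# Made-up toy database; each sequence has a total utility of 10.
toy_db: List[QSequence] = [
    [{"a": 2, "b": 3}, {"c": 5}],
    [{"a": 1}, {"b": 4}, {"c": 1, "a": 4}],
    [{"b": 6}, {"d": 4}],
]
print(evaluate(["a", "c"], toy_db))
```

On this toy data the pattern a, c is supported by two sequences with utilities 7 and 2, so its occupancy is roughly 0.45. A pattern can therefore clear a utility threshold while still accounting for only a small share of the sequences it appears in, which is the situation the occupancy dimension is meant to expose.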
Related papers
- Scalable Sampling for High Utility Patterns [1.2154569665167423]
We propose a novel high utility pattern sampling algorithm and its on-disk version for large quantitative databases.
Our approach ensures both the interactivity required for user-centered methods and strong statistical guarantees through random sampling.
To demonstrate the interest of our approach, we present a compelling use case involving archaeological knowledge graph sub-profiles discovery.
arXiv Detail & Related papers (2024-10-30T12:22:54Z)
- ACE : Off-Policy Actor-Critic with Causality-Aware Entropy Regularization [52.5587113539404]
We introduce a causality-aware entropy term that effectively identifies and prioritizes actions with high potential impacts for efficient exploration.
Our proposed algorithm, ACE: Off-policy Actor-critic with Causality-aware Entropy regularization, demonstrates a substantial performance advantage across 29 diverse continuous control tasks.
arXiv Detail & Related papers (2024-02-22T13:22:06Z)
- HUSP-SP: Faster Utility Mining on Sequence Data [48.0426095077918]
High-utility sequential pattern mining (HUSPM) has emerged as an important topic due to its wide application and considerable popularity.
We design a compact structure called sequence projection (seqPro) and propose an efficient algorithm, namely discovering high-utility sequential patterns with the seqPro structure (HUSP-SP).
Experimental results on both synthetic and real-life datasets show that HUSP-SP can significantly outperform the state-of-the-art algorithms in terms of running time, memory usage, search space pruning efficiency, and scalability.
arXiv Detail & Related papers (2022-12-29T10:56:17Z)
- A Generic Algorithm for Top-K On-Shelf Utility Mining [47.729883172648876]
On-shelf utility mining (OSUM) is an emerging research direction in data mining.
It aims to discover itemsets that have high relative utility in their selling time period.
It is hard to define a minimum threshold minutil for mining the right amount of on-shelf high utility itemsets.
We propose a generic algorithm named TOIT for mining Top-k On-shelf hIgh-utility paTterns.
arXiv Detail & Related papers (2022-08-27T03:08:00Z)
- Itemset Utility Maximization with Correlation Measure [8.581840054840335]
High utility itemset mining (HUIM) is used to find interesting but hidden information (e.g., profit and risk).
In this paper, we propose a novel algorithm called the Itemset Utility Maximization with Correlation Measure (CoIUM).
Two upper bounds and four pruning strategies are utilized to effectively prune the search space. And a concise array-based structure named utility-bin is used to calculate and store the adopted upper bounds in linear time and space.
arXiv Detail & Related papers (2022-08-26T10:06:24Z)
- Temporal Fuzzy Utility Maximization with Remaining Measure [1.642022526257133]
We propose a novel one-phase temporal fuzzy utility itemset mining approach called TFUM.
TFUM revises temporal fuzzy-lists to maintain less but major information about potential high temporal fuzzy utility itemsets in memory.
It then discovers a complete set of real interesting patterns in a short time.
arXiv Detail & Related papers (2022-08-26T05:09:56Z)
- Compactness Score: A Fast Filter Method for Unsupervised Feature Selection [66.84571085643928]
We propose a fast unsupervised feature selection method, named Compactness Score (CSUFS), to select desired features.
Our proposed algorithm seems to be more accurate and efficient compared with existing algorithms.
arXiv Detail & Related papers (2022-01-31T13:01:37Z)
- US-Rule: Discovering Utility-driven Sequential Rules [52.68017415747925]
We propose a faster algorithm, called US-Rule, to efficiently mine high-utility sequential rules.
Four tighter upper bounds (LEEU, REEU, LERSU, RERSU) and their corresponding pruning strategies are proposed.
US-Rule can achieve better performance in terms of execution time, memory consumption and scalability.
arXiv Detail & Related papers (2021-11-29T23:38:28Z)
- Flexible Pattern Discovery and Analysis [2.075126998649103]
We introduce an algorithm for the mining of flexible high utility-occupancy patterns.
The proposed algorithm can effectively control the length of the derived patterns, for both real-world and synthetic datasets.
arXiv Detail & Related papers (2021-11-24T01:25:15Z)