Related papers: Towards Sequence Utility Maximization under Utility Occupancy Measure

URL: http://arxiv.org/abs/2212.10452v1
Date: Tue, 20 Dec 2022 17:28:53 GMT
Title: Towards Sequence Utility Maximization under Utility Occupancy Measure
Authors: Gengsen Huang, Wensheng Gan, and Philip S. Yu
Abstract summary: In the database, although utility is a flexible criterion for each pattern, it is a more absolute criterion due to the neglect of utility sharing. We first define utility occupancy on sequence data and raise the problem of High Utility-Occupancy Sequential Pattern Mining. An algorithm called Sequence Utility Maximization with Utility occupancy measure (SUMU) is proposed.
Score: 53.234101208024335
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The discovery of utility-driven patterns is a useful and difficult research topic. It can extract significant and interesting information from specific and varied databases, increasing the value of the services provided. In practice, the measure of utility is often used to demonstrate the importance, profit, or risk of an object or a pattern. In the database, although utility is a flexible criterion for each pattern, it is a more absolute criterion due to the neglect of utility sharing. As a result, the derived patterns explore only partial and local knowledge from a database. Utility occupancy is a recently proposed model that considers the problem of mining with high utility but low occupancy. However, existing studies concentrate on itemsets, which do not reveal the temporal relationship of object occurrences. Therefore, this paper works towards sequence utility maximization. We first define utility occupancy on sequence data and raise the problem of High Utility-Occupancy Sequential Pattern Mining (HUOSPM). Three dimensions, including frequency, utility, and occupancy, are comprehensively evaluated in HUOSPM. An algorithm called Sequence Utility Maximization with Utility occupancy measure (SUMU) is proposed. Furthermore, two data structures for storing related information about a pattern, Utility-Occupancy-List-Chain (UOL-Chain) and Utility-Occupancy-Table (UO-Table) with six associated upper bounds, are designed to improve efficiency. Empirical experiments are carried out to evaluate the novel algorithm's efficiency and effectiveness. The influence of different upper bounds and pruning strategies is analyzed and discussed. The comprehensive results suggest that our algorithm is intelligent and effective.
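
The abstract evaluates a pattern along three dimensions: frequency, utility, and occupancy. As a rough illustration only, the sketch below computes these for the simpler itemset setting that the abstract attributes to existing studies, not for the sequence setting or the SUMU algorithm itself. It assumes that occupancy is the pattern's share of each supporting transaction's total utility, averaged over the supporting transactions. The names pattern_measures and example_db and the toy data are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical representation: a transaction maps each item to its utility
# (for example, quantity times unit profit) in that transaction.
Transaction = Dict[str, float]


def pattern_measures(
    pattern: FrozenSet[str], db: List[Transaction]
) -> Tuple[int, float, float]:
    """Return (support, utility, utility occupancy) of an itemset pattern.

    Assumed itemset-style definitions, not the paper's sequence definitions:
      support   = number of transactions containing every item of the pattern
      utility   = sum of the pattern's item utilities over those transactions
      occupancy = mean share of each supporting transaction's total utility
                  that the pattern accounts for
    """
    support = 0
    utility = 0.0
    share_sum = 0.0
    for transaction in db:
        if not pattern.issubset(transaction.keys()):
            continue  # this transaction does not support the pattern
        pattern_utility = sum(transaction[item] for item in pattern)
        total_utility = sum(transaction.values())
        support += 1
        utility += pattern_utility
        # Guard against a transaction whose total utility is zero.
        share_sum += pattern_utility / total_utility if total_utility else 0.0
    occupancy = share_sum / support if support else 0.0
    return support, utility, occupancy


# Toy data: {"a"} takes 4/10 of the first transaction and 3/4 of the second;
# the third transaction does not contain "a" and is ignored.
example_db: List[Transaction] = [
    {"a": 4.0, "b": 6.0},
    {"a": 3.0, "c": 1.0},
    {"b": 5.0},
]

support, utility, occupancy = pattern_measures(frozenset({"a"}), example_db)
print(support, utility, round(occupancy, 3))
```

A pattern can therefore have high total utility yet low occupancy when it accounts for only a small share of the transactions in which it appears, which is the situation the utility occupancy model is meant to expose.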

Related papers

Efficient Conformance Checking of Rich Data-Aware Declare Specifications (Extended) [49.46686813437884]
We show that it is possible to compute data-aware optimal alignments in a rich setting with general data types and data conditions. This is achieved by carefully combining the two best-known approaches to deal with control flow and data dependencies.
arXiv Detail & Related papers (2025-06-30T10:16:21Z)
Scalable Sampling for High Utility Patterns [1.2154569665167423]
We propose a novel high utility pattern sampling algorithm and its on-disk version for large quantitative databases. Our approach ensures both the interactivity required for user-centered methods and strong statistical guarantees through random sampling. To demonstrate the interest of our approach, we present a compelling use case involving archaeological knowledge graph sub-profiles discovery.
arXiv Detail & Related papers (2024-10-30T12:22:54Z)
ACE: Off-Policy Actor-Critic with Causality-Aware Entropy Regularization [52.5587113539404]
We introduce a causality-aware entropy term that effectively identifies and prioritizes actions with high potential impacts for efficient exploration. Our proposed algorithm, ACE: Off-policy Actor-critic with Causality-aware Entropy regularization, demonstrates a substantial performance advantage across 29 diverse continuous control tasks.
arXiv Detail & Related papers (2024-02-22T13:22:06Z)
HUSP-SP: Faster Utility Mining on Sequence Data [48.0426095077918]
High-utility sequential pattern mining (HUSPM) has emerged as an important topic due to its wide application and considerable popularity. We design a compact structure called sequence projection (seqPro) and propose an efficient algorithm, namely discovering high-utility sequential patterns with the seqPro structure (HUSP-SP). Experimental results on both synthetic and real-life datasets show that HUSP-SP can significantly outperform the state-of-the-art algorithms in terms of running time, memory usage, search space pruning efficiency, and scalability.
arXiv Detail & Related papers (2022-12-29T10:56:17Z)
A Generic Algorithm for Top-K On-Shelf Utility Mining [47.729883172648876]
On-shelf utility mining (OSUM) is an emerging research direction in data mining. It aims to discover itemsets that have high relative utility in their selling time period. It is hard to define a minimum threshold minutil for mining the right amount of on-shelf high utility itemsets. We propose a generic algorithm named TOIT for mining Top-k On-shelf hIgh-utility paTterns.
arXiv Detail & Related papers (2022-08-27T03:08:00Z)
Itemset Utility Maximization with Correlation Measure [8.581840054840335]
High utility itemset mining (HUIM) is used to find interesting but hidden information (e.g., profit and risk). In this paper, we propose a novel algorithm called the Itemset Utility Maximization with Correlation Measure (CoIUM). Two upper bounds and four pruning strategies are utilized to effectively prune the search space. A concise array-based structure named utility-bin is used to calculate and store the adopted upper bounds in linear time and space.
arXiv Detail & Related papers (2022-08-26T10:06:24Z)
Temporal Fuzzy Utility Maximization with Remaining Measure [1.642022526257133]
We propose a novel one-phase temporal fuzzy utility itemset mining approach called TFUM. TFUM revises temporal fuzzy-lists to maintain less but major information about potential high temporal fuzzy utility itemsets in memory. It then discovers a complete set of real interesting patterns in a short time.
arXiv Detail & Related papers (2022-08-26T05:09:56Z)
Compactness Score: A Fast Filter Method for Unsupervised Feature Selection [66.84571085643928]
We propose a fast unsupervised feature selection method, named Compactness Score (CSUFS), to select desired features. Our proposed algorithm seems to be more accurate and efficient compared with existing algorithms.
arXiv Detail & Related papers (2022-01-31T13:01:37Z)
US-Rule: Discovering Utility-driven Sequential Rules [52.68017415747925]
We propose a faster algorithm, called US-Rule, to efficiently mine high-utility sequential rules. Four tighter upper bounds (LEEU, REEU, LERSU, RERSU) and their corresponding pruning strategies are proposed. US-Rule can achieve better performance in terms of execution time, memory consumption and scalability.
arXiv Detail & Related papers (2021-11-29T23:38:28Z)
Flexible Pattern Discovery and Analysis [2.075126998649103]
We introduce an algorithm for the mining of flexible high utility-occupancy patterns. The proposed algorithm can effectively control the length of the derived patterns, for both real-world and synthetic datasets.
arXiv Detail & Related papers (2021-11-24T01:25:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.