Related papers: Towards Target High-Utility Itemsets

URL: http://arxiv.org/abs/2206.06157v1
Date: Thu, 9 Jun 2022 18:42:58 GMT
Title: Towards Target High-Utility Itemsets
Authors: Jinbao Miao, Wensheng Gan, Shicheng Wan, Yongdong Wu, Philippe Fournier-Viger
Abstract summary: In applied intelligence, utility-driven pattern discovery algorithms can identify insightful and useful patterns in databases. Targeted high-utility itemset mining has emerged as a key research topic. We propose THUIM (Targeted High-Utility Itemset Mining), which can quickly match high-utility itemsets during the mining process to select the targeted patterns.
Score: 2.824395407508717
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: For applied intelligence, utility-driven pattern discovery algorithms can identify insightful and useful patterns in databases. However, in these techniques for pattern discovery, the number of patterns can be huge, and the user is often only interested in a few of those patterns. Hence, targeted high-utility itemset mining has emerged as a key research topic, where the aim is to find a subset of patterns that meet a targeted pattern constraint instead of all patterns. This is a challenging task because efficiently finding tailored patterns in a very large search space requires a targeted mining algorithm. A first algorithm called TargetUM has been proposed, which adopts an approach similar to post-processing using a tree structure, but the running time and memory consumption are unsatisfactory in many situations. In this paper, we address this issue by proposing a novel list-based algorithm with pattern matching mechanism, named THUIM (Targeted High-Utility Itemset Mining), which can quickly match high-utility itemsets during the mining process to select the targeted patterns. Extensive experiments were conducted on different datasets to compare the performance of the proposed algorithm with state-of-the-art algorithms. Results show that THUIM performs very well in terms of runtime and memory consumption, and has good scalability compared to TargetUM.
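To make the task in the abstract concrete, here is a minimal brute-force sketch of targeted high-utility itemset mining. It is not THUIM or TargetUM: it enumerates every superset of the target and keeps those whose utility reaches a threshold. It assumes the usual utility definition (purchased quantity times unit profit, summed over the transactions that contain the whole itemset) and assumes the target constraint means "contains every target item". The toy data and the names `utility`, `targeted_huis`, and `minutil` are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 2, "d": 1},
    {"a": 1, "b": 1, "d": 2},
]
# Hypothetical unit profit per item.
unit_profit = {"a": 5, "b": 2, "c": 1, "d": 4}


def utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over the transactions containing every item."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def targeted_huis(transactions, unit_profit, target, minutil):
    """Brute force: test every superset of `target` against `minutil`."""
    target = frozenset(target)
    items = sorted(set().union(*transactions))
    others = [item for item in items if item not in target]
    result = {}
    for k in range(len(others) + 1):
        for extra in combinations(others, k):
            candidate = target | frozenset(extra)
            u = utility(candidate, transactions, unit_profit)
            if u >= minutil:
                result[candidate] = u
    return result


if __name__ == "__main__":
    found = targeted_huis(transactions, unit_profit, target={"a"}, minutil=15)
    for itemset, u in sorted(found.items(), key=lambda kv: sorted(kv[0])):
        print(sorted(itemset), u)
```

On this toy data, {a, d} has utility 13 and fails the threshold while its superset {a, b, d} reaches 15. Utility is not anti-monotone, so a low-utility itemset cannot simply be pruned together with its supersets, and exhaustive enumeration like this grows exponentially with the number of items.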

Related papers

Nearly Optimal Active Preference Learning and Its Application to LLM Alignment [68.56793807995417]
Aligning large language models depends on high-quality datasets of human preference labels. Many existing approaches adopt classical experimental design criteria such as G- or D-optimality. In this work, we identify a simple intuition specific to preference learning that calls into question the suitability of these existing design objectives.
arXiv Detail & Related papers (2026-02-02T03:21:29Z)
Machine Learning Techniques for Pattern Recognition in High-Dimensional Data Mining [0.0]
This paper proposes a frequent pattern data mining algorithm based on a support vector machine (SVM). By converting the frequent pattern mining task into a classification problem, the SVM model is introduced to improve the accuracy and robustness of pattern extraction. The experiment shows that the SVM model has excellent performance advantages in an environment with high data sparsity and a large number of transactions.
arXiv Detail & Related papers (2024-12-20T06:32:05Z)
Towards Sequence Utility Maximization under Utility Occupancy Measure [53.234101208024335]
In the database, although utility is a flexible criterion for each pattern, it is a more absolute criterion due to neglect of utility sharing. We first define utility occupancy on sequence data and raise the problem of High Utility-Occupancy Sequential Pattern Mining. An algorithm called Sequence Utility Maximization with Utility occupancy measure (SUMU) is proposed.
arXiv Detail & Related papers (2022-12-20T17:28:53Z)
Towards Correlated Sequential Rules [4.743965372344134]
High-utility sequential rule mining (HUSRM) is designed to explore the confidence or probability of predicting the occurrence of consequence sequential patterns. The existing algorithm, known as HUSRM, is limited to extracting all eligible rules while neglecting the correlation between the generated sequential rules. We propose a novel algorithm called correlated high-utility sequential rule miner (CoUSR) to integrate the concept of correlation into HUSRM.
arXiv Detail & Related papers (2022-10-27T17:27:23Z)
Efficient Non-Parametric Optimizer Search for Diverse Tasks [93.64739408827604]
We present the first efficient, scalable, and general framework that can directly search on the tasks of interest. Inspired by the innate tree structure of the underlying math expressions, we re-arrange the spaces into a super-tree. We adopt an adaptation of the Monte Carlo method to tree search, equipped with rejection sampling and equivalent-form detection.
arXiv Detail & Related papers (2022-09-27T17:51:31Z)
A Generic Algorithm for Top-K On-Shelf Utility Mining [47.729883172648876]
On-shelf utility mining (OSUM) is an emerging research direction in data mining. It aims to discover itemsets that have high relative utility in their selling time period. It is hard to define a minimum threshold minutil for mining the right amount of on-shelf high utility itemsets. We propose a generic algorithm named TOIT for mining Top-k On-shelf hIgh-utility paTterns.
arXiv Detail & Related papers (2022-08-27T03:08:00Z)
TaSPM: Targeted Sequential Pattern Mining [53.234101208024335]
We propose a generic framework namely TaSPM, based on the fast CM-SPAM algorithm. We also propose several pruning strategies to reduce meaningless operations in mining processes. Experiments show that the novel targeted mining algorithm TaSPM can achieve faster running time and less memory consumption.
arXiv Detail & Related papers (2022-02-26T17:49:47Z)
Flexible Pattern Discovery and Analysis [2.075126998649103]
We introduce an algorithm for the mining of flexible high utility-occupancy patterns. The proposed algorithm can effectively control the length of the derived patterns, for both real-world and synthetic datasets.
arXiv Detail & Related papers (2021-11-24T01:25:15Z)
Pre-Clustering Point Clouds of Crop Fields Using Scalable Methods [14.06711982797654]
We show a similarity between the current state-of-the-art for this problem and a commonly used density-based clustering algorithm, Quickshift. We propose a number of novel, application-specific algorithms with the goal of producing a general and scalable plant segmentation algorithm. When incorporated into field-scale phenotyping systems, the proposed algorithms should work as a drop-in replacement that can greatly improve the accuracy of results.
arXiv Detail & Related papers (2021-07-22T22:47:22Z)
Automated Decision-based Adversarial Attacks [48.01183253407982]
We consider the practical and challenging decision-based black-box adversarial setting. Under this setting, the attacker can only acquire the final classification labels by querying the target model. We propose to automatically discover decision-based adversarial attack algorithms.
arXiv Detail & Related papers (2021-05-09T13:15:10Z)
Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem, which estimates integer signals from linear models. The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning. We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal estimation for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z)

This list is automatically generated from the titles and abstracts of the papers on this site.