Towards Target High-Utility Itemsets
- URL: http://arxiv.org/abs/2206.06157v1
- Date: Thu, 9 Jun 2022 18:42:58 GMT
- Title: Towards Target High-Utility Itemsets
- Authors: Jinbao Miao, Wensheng Gan, Shicheng Wan, Yongdong Wu, Philippe
Fournier-Viger
- Abstract summary: In applied intelligence, utility-driven pattern discovery algorithms can identify insightful and useful patterns in databases.
Targeted high-utility itemset mining has emerged as a key research topic.
We propose THUIM (Targeted High-Utility Itemset Mining), which can quickly match high-utility itemsets during the mining process to select the targeted patterns.
- Score: 2.824395407508717
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For applied intelligence, utility-driven pattern discovery algorithms can
identify insightful and useful patterns in databases. However, in these
techniques for pattern discovery, the number of patterns can be huge, and the
user is often only interested in a few of those patterns. Hence, targeted
high-utility itemset mining has emerged as a key research topic, where the aim
is to find a subset of patterns that meet a targeted pattern constraint instead
of all patterns. This is a challenging task because efficiently finding
tailored patterns in a very large search space requires a targeted mining
algorithm. A first algorithm called TargetUM has been proposed, which adopts an
approach similar to post-processing using a tree structure, but the running
time and memory consumption are unsatisfactory in many situations. In this
paper, we address this issue by proposing a novel list-based algorithm with a
pattern matching mechanism, named THUIM (Targeted High-Utility Itemset Mining),
which can quickly match high-utility itemsets during the mining process to
select the targeted patterns. Extensive experiments were conducted on different
datasets to compare the performance of the proposed algorithm with
state-of-the-art algorithms. Results show that THUIM performs very well in
terms of runtime and memory consumption, and has good scalability compared to
TargetUM.
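To make the task in the abstract concrete, the following is a minimal brute-force sketch of targeted high-utility itemset mining: it keeps only itemsets that contain a target pattern and whose utility reaches a minimum threshold. It is not THUIM or TargetUM; the abstract does not describe their list or tree structures in enough detail to reproduce them. The toy database, the function names `utility` and `targeted_huis`, and the threshold value are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy database (illustrative values only): each transaction maps
# an item to its utility in that transaction, e.g. quantity x unit profit.
TRANSACTIONS = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3, "d": 8},
    {"a": 5, "b": 2, "d": 4},
]


def utility(itemset, transactions):
    """Sum the utilities of `itemset` over every transaction containing all its items."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] for item in itemset)
    return total


def targeted_huis(transactions, target, minutil):
    """Brute force: enumerate supersets of `target`, keep those with utility >= minutil.

    This enumerates the full search space, so it is exponential in the number
    of items and only suitable for tiny examples.
    """
    target = frozenset(target)
    items = sorted({item for t in transactions for item in t})
    others = [item for item in items if item not in target]
    result = {}
    for k in range(len(others) + 1):
        for extra in combinations(others, k):
            candidate = target | frozenset(extra)
            u = utility(candidate, transactions)
            if u >= minutil:
                result[candidate] = u
    return result


if __name__ == "__main__":
    # Itemsets that contain "a" and have utility of at least 15.
    for itemset, u in targeted_huis(TRANSACTIONS, {"a"}, 15).items():
        print(sorted(itemset), u)
```

On this toy data, the target `{"a"}` with a threshold of 15 yields `{a}` (utility 20) and `{a, c}` (utility 22). A practical algorithm would avoid enumerating every superset, which is the cost the abstract says targeted mining aims to reduce.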
Related papers
- Towards Sequence Utility Maximization under Utility Occupancy Measure [53.234101208024335]
Although utility is a flexible criterion for each pattern in a database, it is a more absolute criterion because it neglects utility sharing.
We first define utility occupancy on sequence data and raise the problem of High Utility-Occupancy Sequential Pattern Mining.
An algorithm called Sequence Utility Maximization with Utility occupancy measure (SUMU) is proposed.
arXiv Detail & Related papers (2022-12-20T17:28:53Z)
- Towards Correlated Sequential Rules [4.743965372344134]
High-utility sequential rule mining (HUSRM) is designed to explore the confidence or probability of predicting the occurrence of consequence sequential patterns.
The existing algorithm, known as HUSRM, is limited to extracting all eligible rules while neglecting the correlation between the generated sequential rules.
We propose a novel algorithm called correlated high-utility sequential rule miner (CoUSR) to integrate the concept of correlation into HUSRM.
arXiv Detail & Related papers (2022-10-27T17:27:23Z)
- Efficient Non-Parametric Optimizer Search for Diverse Tasks [93.64739408827604]
We present the first efficient, scalable, and general framework that can directly search on the tasks of interest.
Inspired by the innate tree structure of the underlying math expressions, we re-arrange the spaces into a super-tree.
We adopt an adaptation of the Monte Carlo method to tree search, equipped with rejection sampling and equivalent-form detection.
arXiv Detail & Related papers (2022-09-27T17:51:31Z)
- A Generic Algorithm for Top-K On-Shelf Utility Mining [47.729883172648876]
On-shelf utility mining (OSUM) is an emerging research direction in data mining.
It aims to discover itemsets that have high relative utility in their selling time period.
It is hard to define a minimum threshold minutil for mining the right amount of on-shelf high utility itemsets.
We propose a generic algorithm named TOIT for mining Top-k On-shelf hIgh-utility paTterns.
arXiv Detail & Related papers (2022-08-27T03:08:00Z)
- TaSPM: Targeted Sequential Pattern Mining [53.234101208024335]
We propose a generic framework, namely TaSPM, based on the fast CM-SPAM algorithm.
We also propose several pruning strategies to reduce meaningless operations in mining processes.
Experiments show that the novel targeted mining algorithm TaSPM can achieve faster running time and less memory consumption.
arXiv Detail & Related papers (2022-02-26T17:49:47Z)
- Flexible Pattern Discovery and Analysis [2.075126998649103]
We introduce an algorithm for the mining of flexible high utility-occupancy patterns.
The proposed algorithm can effectively control the length of the derived patterns, for both real-world and synthetic datasets.
arXiv Detail & Related papers (2021-11-24T01:25:15Z)
- Pre-Clustering Point Clouds of Crop Fields Using Scalable Methods [14.06711982797654]
We show a similarity between the current state-of-the-art for this problem and a commonly used density-based clustering algorithm, Quickshift.
We propose a number of novel, application-specific algorithms with the goal of producing a general and scalable plant segmentation algorithm.
When incorporated into field-scale phenotyping systems, the proposed algorithms should work as a drop-in replacement that can greatly improve the accuracy of results.
arXiv Detail & Related papers (2021-07-22T22:47:22Z)
- Automated Decision-based Adversarial Attacks [48.01183253407982]
We consider the practical and challenging decision-based black-box adversarial setting.
Under this setting, the attacker can only acquire the final classification labels by querying the target model.
We propose to automatically discover decision-based adversarial attack algorithms.
arXiv Detail & Related papers (2021-05-09T13:15:10Z)
- Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem, which estimates integer signals from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal heuristic for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.