Mining Weighted Sequential Patterns in Incremental Uncertain Databases
- URL: http://arxiv.org/abs/2404.00746v1
- Date: Sun, 31 Mar 2024 17:32:08 GMT
- Title: Mining Weighted Sequential Patterns in Incremental Uncertain Databases
- Authors: Kashob Kumar Roy, Md Hasibul Haque Moon, Md Mahmudur Rahman, Chowdhury Farhan Ahmed, Carson Kai-Sang Leung,
- Abstract summary: We have developed an algorithm to mine frequent sequences in an uncertain database.
We have proposed two new techniques for mining when the database is incremental.
- Score: 2.668038211242538
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Due to the rapid development of science and technology, the importance of imprecise, noisy, and uncertain data is increasing at an exponential rate. Thus, mining patterns in uncertain databases have drawn the attention of researchers. Moreover, frequent sequences of items from these databases need to be discovered for meaningful knowledge with great impact. In many real cases, weights of items and patterns are introduced to find interesting sequences as a measure of importance. Hence, a constraint of weight needs to be handled while mining sequential patterns. Besides, due to the dynamic nature of databases, mining important information has become more challenging. Instead of mining patterns from scratch after each increment, incremental mining algorithms utilize previously mined information to update the result immediately. Several algorithms exist to mine frequent patterns and weighted sequences from incremental databases. However, these algorithms are confined to mine the precise ones. Therefore, we have developed an algorithm to mine frequent sequences in an uncertain database in this work. Furthermore, we have proposed two new techniques for mining when the database is incremental. Extensive experiments have been conducted for performance evaluation. The analysis showed the efficiency of our proposed framework.
Related papers
- State-Space Modeling in Long Sequence Processing: A Survey on Recurrence in the Transformer Era [59.279784235147254]
This survey provides an in-depth summary of the latest approaches that are based on recurrent models for sequential data processing.
The emerging picture suggests that there is room for thinking of novel routes, constituted by learning algorithms which depart from the standard Backpropagation Through Time.
arXiv Detail & Related papers (2024-06-13T12:51:22Z) - Regularization-Based Methods for Ordinal Quantification [49.606912965922504]
We study the ordinal case, i.e., the case in which a total order is defined on the set of n>2 classes.
We propose a novel class of regularized OQ algorithms, which outperforms existing algorithms in our experiments.
arXiv Detail & Related papers (2023-10-13T16:04:06Z) - Towards Sequence Utility Maximization under Utility Occupancy Measure [53.234101208024335]
In the database, although utility is a flexible criterion for each pattern, it is a more absolute criterion due to neglect of utility sharing.
We first define utility occupancy on sequence data and raise the problem of High Utility-Occupancy Sequential Pattern Mining.
An algorithm called Sequence Utility Maximization with Utility occupancy measure (SUMU) is proposed.
arXiv Detail & Related papers (2022-12-20T17:28:53Z) - Research Trends and Applications of Data Augmentation Algorithms [77.34726150561087]
We identify the main areas of application of data augmentation algorithms, the types of algorithms used, significant research trends, their progression over time and research gaps in data augmentation literature.
We expect readers to understand the potential of data augmentation, as well as identify future research directions and open questions within data augmentation research.
arXiv Detail & Related papers (2022-07-18T11:38:32Z) - Data Augmentation techniques in time series domain: A survey and
taxonomy [0.20971479389679332]
Deep neural networks used to work with time series heavily depend on the size and consistency of the datasets used in training.
This work systematically reviews the current state-of-the-art in the area to provide an overview of all available algorithms.
The ultimate aim of this study is to provide a summary of the evolution and performance of areas that produce better results to guide future researchers in this field.
arXiv Detail & Related papers (2022-06-25T17:09:00Z) - Towards Target High-Utility Itemsets [2.824395407508717]
In applied intelligence, utility-driven pattern discovery algorithms can identify insightful and useful patterns in databases.
Targeted high-utility itemset mining has emerged as a key research topic.
We propose THUIM (Targeted High-Utility Itemset Mining), which can quickly match high-utility itemsets during the mining process to select the targeted patterns.
arXiv Detail & Related papers (2022-06-09T18:42:58Z) - Incremental Mining of Frequent Serial Episodes Considering Multiple
Occurrence [11.387440344044315]
One of the fundamental research directions is to mine sequential patterns over data streams.
The pattern over a window of itemsets stream and their multiple occurrences provides additional capability to recognize the essential characteristics of the patterns.
We propose a corresponding efficient sequential miner with novel strategies to prune search space efficiently.
arXiv Detail & Related papers (2022-01-27T17:10:16Z) - Frequent Itemset-driven Search for Finding Minimum Node Separators in
Complex Networks [61.2383572324176]
We propose a frequent itemset-driven search approach, which integrates the concept of frequent itemset mining in data mining into the well-known memetic search framework.
It iteratively employs the frequent itemset recombination operator to generate promising offspring solution based on itemsets that frequently occur in high-quality solutions.
In particular, it discovers 29 new upper bounds and matches 18 previous best-known bounds.
arXiv Detail & Related papers (2022-01-18T11:16:40Z) - Parallel algorithms for mining of frequent itemsets [0.0]
We develop a parallel method for mining of frequent itemsets on a distributed memory parallel computer.
Our method achieves speedup of 6 on 10 processors.
arXiv Detail & Related papers (2021-08-11T05:22:52Z) - Multi-source Data Mining for e-Learning [3.8673630752805432]
Pattern mining involves extracting interesting frequent patterns from data.
With the increase in the amount of data, multi-source and heterogeneous data has become an emerging challenge in this domain.
This challenge is the main focus of our work where we propose to mine multi-source data in order to extract interesting frequent patterns.
arXiv Detail & Related papers (2020-09-17T15:39:45Z) - Faster Secure Data Mining via Distributed Homomorphic Encryption [108.77460689459247]
Homomorphic Encryption (HE) is receiving more and more attention recently for its capability to do computations over the encrypted field.
We propose a novel general distributed HE-based data mining framework towards one step of solving the scaling problem.
We verify the efficiency and effectiveness of our new framework by testing over various data mining algorithms and benchmark data-sets.
arXiv Detail & Related papers (2020-06-17T18:14:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.