Motiflets -- Simple and Accurate Detection of Motifs in Time Series
- URL: http://arxiv.org/abs/2206.03735v2
- Date: Tue, 16 Apr 2024 20:38:33 GMT
- Title: Motiflets -- Simple and Accurate Detection of Motifs in Time Series
- Authors: Patrick Schäfer, Ulf Leser,
- Abstract summary: A time series motif intuitively is a short time series that repeats itself approximately the same within a larger time series.
Motif discovery (MD) is the task of finding such motifs in a given input series.
We present exact and approximate algorithms for finding k-Motiflets and analyze their complexity.
- Score: 3.6463708995502273
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A time series motif intuitively is a short time series that repeats itself approximately the same within a larger time series. Such motifs often represent concealed structures, such as heart beats in an ECG recording, the riff in a pop song, or sleep spindles in EEG sleep data. Motif discovery (MD) is the task of finding such motifs in a given input series. As there are varying definitions of what exactly a motif is, a number of different algorithms exist. As central parameters they all take the length l of the motif and the maximal distance r between the motif's occurrences. In practice, however, especially suitable values for r are very hard to determine upfront, and found motifs show a high variability even for very similar r values. Accordingly, finding an interesting motif requires extensive trial-and-error. In this paper, we present a different approach to the MD problem. We define k-Motiflets as the set of exactly k occurrences of a motif of length l, whose maximum pairwise distance is minimal. This turns the MD problem upside-down: The central parameter of our approach is not the distance threshold r, but the desired number of occurrence k of the motif, which we show is considerably more intuitive and easier to set. Based on this definition, we present exact and approximate algorithms for finding k-Motiflets and analyze their complexity. To further ease the use of our method, we describe statistical tools to automatically determine meaningful values for its input parameters. By evaluation on several real-world data sets and comparison to four SotA MD algorithms, we show that our proposed algorithm is both quantitatively superior to its competitors, finding larger motif sets at higher similarity, and qualitatively better, leading to clearer and easier to interpret motifs without any need for manual tuning.
Related papers
- TIMEST: Temporal Information Motif Estimator Using Sampling Trees [5.114632024458956]
We present TIMEST: a general, fast, and accurate estimation algorithm to count temporal motifs of arbitrary sizes in temporal networks.<n>We show that TIMEST is both faster and more accurate than previous algorithms.<n>For example, TIMEST can count the number of instances of a financial temporal motif in four minutes with 0.6% error, while exact methods take more than two days.
arXiv Detail & Related papers (2025-07-27T23:31:55Z) - Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute [57.16286134405821]
We propose Fractional Reasoning, a framework that enables continuous control over reasoning intensity at inference time.<n>Our method operates by extracting the latent steering vector associated with deeper reasoning and reapplying it with a tunable scaling factor.<n> Experiments on GSM8K, MATH500, and GPQA demonstrate that Fractional Reasoning consistently improves performance across diverse reasoning tasks and models.
arXiv Detail & Related papers (2025-06-18T21:15:59Z) - Critical Thinking: Which Kinds of Complexity Govern Optimal Reasoning Length? [72.70486097967124]
We formalize a framework using deterministic finite automata (DFAs)
We show that there exists an optimal amount of reasoning tokens such that the probability of producing a correct solution is maximized.
We then demonstrate an implication of these findings: being able to predict the optimal number of reasoning tokens for new problems and filtering out non-optimal length answers results in consistent accuracy improvements.
arXiv Detail & Related papers (2025-04-02T17:45:58Z) - A Top-down Graph-based Tool for Modeling Classical Semantic Maps: A Crosslinguistic Case Study of Supplementary Adverbs [50.982315553104975]
Semantic map models (SMMs) construct a network-like conceptual space from cross-linguistic instances or forms.
Most SMMs are manually built by human experts using bottom-up procedures.
We propose a novel graph-based algorithm that automatically generates conceptual spaces and SMMs in a top-down manner.
arXiv Detail & Related papers (2024-12-02T12:06:41Z) - Discovering Leitmotifs in Multidimensional Time Series [3.6463708995502273]
We present the novel, efficient and highly effective leitmotif discovery algorithm LAMA for MDTS.
LAMA rests on two core principals: (a) a leitmotif manifests solely given a yet unknown number of sub-dimensions - neither too few, nor too many, and (b) the set of sub-dimensions are not independent from the best pattern found.
arXiv Detail & Related papers (2024-10-16T06:50:45Z) - Representation Learning for Frequent Subgraph Mining [64.32430554934021]
Subgraph Pattern Miner (SPMiner) is a novel neural approach for finding frequent subgraphs in a large target graph.
For 5- and 6-node motifs, SPMiner can almost perfectly identify the most frequent motifs while being 100x faster than exact enumeration methods.
SPMiner can also reliably identify frequent 10-node motifs, which is well beyond the size limit of exact enumeration approaches.
arXiv Detail & Related papers (2024-02-22T08:11:22Z) - LoCoMotif: Discovering time-warped motifs in time series [7.265353600305124]
Time Series Motif Discovery (TSMD) refers to the task of identifying patterns that occur multiple times in a time series.
Existing methods for TSMD have one or more of the following limitations.
We present a new method, LoCoMotif, that has none of these limitations.
arXiv Detail & Related papers (2023-11-29T12:18:46Z) - Complexity-Based Prompting for Multi-Step Reasoning [72.0057198610614]
We study the task of prompting large-scale language models to perform multi-step reasoning.
A central question is which reasoning examples make the most effective prompts.
We propose complexity-based prompting, a simple and effective example selection scheme for multi-step reasoning.
arXiv Detail & Related papers (2022-10-03T05:33:27Z) - Multi-scale Anomaly Detection for Big Time Series of Industrial Sensors [50.6434162489902]
We propose a reconstruction-based anomaly detection method, MissGAN, iteratively learning to decode and encode naturally smooth time series.
MissGAN does not need labels or only needs labels of normal instances, making it widely applicable.
arXiv Detail & Related papers (2022-04-18T04:34:15Z) - OPP-Miner: Order-preserving sequential pattern mining [26.997138010841347]
This paper proposes an Order-Preserving sequential Pattern (OPP) mining method, which represents patterns based on the order relationships of the time series data.
Experiments validate that OPP-Miner is not only efficient and scalable but can also discover similar sub-sequences in time series.
arXiv Detail & Related papers (2022-01-09T11:06:26Z) - odeN: Simultaneous Approximation of Multiple Motif Counts in Large
Temporal Networks [7.025709586759655]
One of the main complications in studying temporal motifs is the large number of motifs that can be built.
We propose odeN, a sampling-based algorithm that provides an accurate approximation of all the counts of the motifs.
arXiv Detail & Related papers (2021-08-19T15:06:05Z) - Determinantal Beam Search [75.84501052642361]
Beam search is a go-to strategy for decoding neural sequence models.
In use-cases that call for multiple solutions, a diverse or representative set is often desired.
By posing iterations in beam search as a series of subdeterminant problems, we can turn the algorithm into a diverse subset selection process.
arXiv Detail & Related papers (2021-06-14T13:01:46Z) - When Liebig's Barrel Meets Facial Landmark Detection: A Practical Model [87.25037167380522]
We propose a model that is accurate, robust, efficient, generalizable, and end-to-end trainable.
In order to achieve a better accuracy, we propose two lightweight modules.
DQInit dynamically initializes the queries of decoder from the inputs, enabling the model to achieve as good accuracy as the ones with multiple decoder layers.
QAMem is designed to enhance the discriminative ability of queries on low-resolution feature maps by assigning separate memory values to each query rather than a shared one.
arXiv Detail & Related papers (2021-05-27T13:51:42Z) - Finding Geometric Models by Clustering in the Consensus Space [61.65661010039768]
We propose a new algorithm for finding an unknown number of geometric models, e.g., homographies.
We present a number of applications where the use of multiple geometric models improves accuracy.
These include pose estimation from multiple generalized homographies; trajectory estimation of fast-moving objects.
arXiv Detail & Related papers (2021-03-25T14:35:07Z) - Motif Difference Field: A Simple and Effective Image Representation of
Time Series for Classification [3.419406971620478]
The motif-based time series clustering is used for the discovery of higher-order patterns or structures in time series data.
Inspired by the convolutional neural network (CNN) classifier based on the image representations of time series, motif difference field (MDF) is proposed.
arXiv Detail & Related papers (2020-01-21T14:48:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.