Related papers: Improved Multi-objective Data Stream Clustering with Time and Memory Optimization

URL: http://arxiv.org/abs/2201.05079v1
Date: Thu, 13 Jan 2022 17:05:56 GMT
Title: Improved Multi-objective Data Stream Clustering with Time and Memory Optimization
Authors: Mohammed Oualid Attaoui, Hanene Azzag, Mustapha Lebbah, and Nabil Keskes
Abstract summary: This paper introduces a new data stream clustering method (IMOC-Stream). It uses two different objective functions to capture different aspects of the data. The experiments show the ability of the method to partition the data stream into arbitrarily shaped, compact, and well-separated clusters.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The analysis of data streams has received considerable attention over the past few decades due to sensors, social media, etc. It aims to recognize patterns in an unordered, infinite, and evolving stream of observations. Clustering this type of data requires some restrictions in time and memory. This paper introduces a new data stream clustering method (IMOC-Stream). This method, unlike the other clustering algorithms, uses two different objective functions to capture different aspects of the data. The goals of IMOC-Stream are to: 1) reduce computation time by using idle times to apply genetic operations and enhance the solution; 2) reduce memory allocation by introducing a new tree synopsis; and 3) find arbitrarily shaped clusters by using a multi-objective framework. We conducted an experimental study with high-dimensional stream datasets and compared our method to well-known stream clustering techniques. The experiments show the ability of our method to partition the data stream into arbitrarily shaped, compact, and well-separated clusters while optimizing time and memory. Our method also outperformed most of the stream algorithms in terms of the NMI and ARAND measures.
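Example (not from the paper): the abstract reports clustering quality with NMI and ARAND. Assuming ARAND denotes the adjusted Rand index, the sketch below shows how that score can be computed from two label assignments using only the Python standard library. The function name `adjusted_rand_index` and the toy labels are illustrative assumptions, not the authors' evaluation code.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two flat label assignments.

    Hypothetical helper for illustration; assumes 'ARAND' in the
    abstract refers to the adjusted Rand index.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")

    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        # Fewer than two points: the partitions agree trivially.
        return 1.0

    # Contingency counts: how many points share each (true, predicted) pair.
    joint = Counter(zip(labels_true, labels_pred))
    true_sizes = Counter(labels_true)
    pred_sizes = Counter(labels_pred)

    # Number of point pairs placed together in both / each partition.
    together_both = sum(comb(c, 2) for c in joint.values())
    together_true = sum(comb(c, 2) for c in true_sizes.values())
    together_pred = sum(comb(c, 2) for c in pred_sizes.values())

    # Agreement expected by chance, and the maximum attainable agreement.
    expected = together_true * together_pred / total_pairs
    maximum = (together_true + together_pred) / 2

    if maximum == expected:
        # Degenerate case, e.g. both partitions put everything in one cluster.
        return 1.0
    return (together_both - expected) / (maximum - expected)


if __name__ == "__main__":
    truth = [0, 0, 1, 1]
    print(adjusted_rand_index(truth, [1, 1, 0, 0]))  # same partition, renamed labels
    print(adjusted_rand_index(truth, [0, 1, 0, 1]))  # partition that splits every true pair
```

The score is invariant to how cluster labels are named, which matters for stream clustering because cluster identifiers produced by an algorithm rarely match the ground-truth class names.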

Related papers

TNStream: Applying Tightest Neighbors to Micro-Clusters to Define Multi-Density Clusters in Streaming Data [1.2016321065590192]
This paper proposes a clustering algorithm based on the novel concept of Tightest Neighbors and introduces a data stream clustering theory based on the Skeleton Set. Based on these theories, this paper develops a new method, TNStream, a fully online algorithm. Experimental results demonstrate its effectiveness in improving clustering quality for multi-density data and validate the proposed data stream clustering theory.
arXiv Detail & Related papers (2025-05-01T07:15:20Z)
Implementing Streaming algorithm and k-means clusters to RAG [2.5251537417183028]
We propose a new approach integrating a streaming algorithm with k-means clustering into RAG. Our approach applies a streaming algorithm to update the index dynamically and reduce memory consumption. We conducted comparative experiments on four methods, and the results indicated that RAG with a streaming algorithm and k-means clusters outperforms traditional RAG in accuracy and memory.
arXiv Detail & Related papers (2024-07-31T03:00:59Z)
An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks. The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions. We propose a new algorithm that substantially accelerates model inference by avoiding the need to explicitly compute these covariance matrices.
arXiv Detail & Related papers (2023-09-30T15:57:14Z)
Contrastive Continual Multi-view Clustering with Filtered Structural Fusion [57.193645780552565]
Multi-view clustering thrives in applications where views are collected in advance. It overlooks scenarios where data views are collected sequentially, i.e., real-time data. Some methods are proposed to handle it but are trapped in a stability-plasticity dilemma. We propose Contrastive Continual Multi-view Clustering with Filtered Structural Fusion.
arXiv Detail & Related papers (2023-09-26T14:18:29Z)
Clustering Method for Time-Series Images Using Quantum-Inspired Computing Technology [0.0]
Time-series clustering serves as a powerful data mining technique for time-series data in the absence of prior knowledge about clusters. This study proposes a novel time-series clustering method that leverages an annealing machine.
arXiv Detail & Related papers (2023-05-26T05:58:14Z)
Rethinking k-means from manifold learning perspective [122.38667613245151]
We present a new clustering algorithm that directly detects clusters of data without mean estimation. Specifically, we construct a distance matrix between data points using a Butterworth filter. To better exploit the complementary information embedded in different views, we leverage the tensor Schatten p-norm regularization.
arXiv Detail & Related papers (2023-05-12T03:01:41Z)
Clustering Plotted Data by Image Segmentation [12.443102864446223]
Clustering algorithms are one of the main analytical methods to detect patterns in unlabeled data. In this paper, we present a wholly different way of clustering points in 2-dimensional space, inspired by how humans cluster data. Our approach, Visual Clustering, has several advantages over traditional clustering algorithms.
arXiv Detail & Related papers (2021-10-06T06:19:30Z)
StreaMRAK a Streaming Multi-Resolution Adaptive Kernel Algorithm [60.61943386819384]
Existing implementations of kernel ridge regression (KRR) require that all the data is stored in the main memory. We propose StreaMRAK - a streaming version of KRR. We present a showcase study on two synthetic problems and the prediction of the trajectory of a double pendulum.
arXiv Detail & Related papers (2021-08-23T21:03:09Z)
Rethinking Space-Time Networks with Improved Memory Coverage for Efficient Video Object Segmentation [68.45737688496654]
We establish correspondences directly between frames without re-encoding the mask features for every object. With the correspondences, every node in the current query frame is inferred by aggregating features from the past in an associative fashion. We validated that every memory node now has a chance to contribute, and experimentally showed that such diversified voting is beneficial to both memory efficiency and inference accuracy.
arXiv Detail & Related papers (2021-06-09T16:50:57Z)
Data Stream Clustering: A Review [0.0]
Clustering is one of the most suitable methods for real-time data stream processing. We review recent data stream clustering algorithms and analyze them in terms of the base clustering technique, computational complexity and clustering accuracy. We indicate popular data stream repositories and datasets, stream processing tools and platforms.
arXiv Detail & Related papers (2020-07-16T20:35:09Z)
Learnable Subspace Clustering [76.2352740039615]
We develop a learnable subspace clustering paradigm to efficiently solve the large-scale subspace clustering problem. The key idea is to learn a parametric function to partition the high-dimensional subspaces into their underlying low-dimensional subspaces. To the best of our knowledge, this paper is the first work to efficiently cluster millions of data points among the subspace clustering methods.
arXiv Detail & Related papers (2020-04-09T12:53:28Z)
A Novel Incremental Clustering Technique with Concept Drift Detection [2.790947019327459]
Traditional static clustering algorithms are not suitable for dynamic datasets. We propose an efficient incremental clustering algorithm called UIClust. We evaluate the performance of UIClust by comparing it with a recently published, high-quality incremental clustering algorithm.
arXiv Detail & Related papers (2020-03-30T05:20:35Z)

This list is automatically generated from the titles and abstracts of the papers on this site.