Related papers: Efficient $k$-NN Search in IoT Data: Overlap Optimization in Tree-Based Indexing Structures

URL: http://arxiv.org/abs/2408.16036v1
Date: Wed, 28 Aug 2024 16:16:55 GMT
Title: Efficient $k$-NN Search in IoT Data: Overlap Optimization in Tree-Based Indexing Structures
Authors: Ala-Eddine Benrazek, Zineddine Kouahla, Brahim Farou, Hamid Seridi, Ibtissem Kemouguette
Abstract summary: The proliferation of interconnected devices in the Internet of Things (IoT) has led to an exponential increase in data. Efficient retrieval of this heterogeneous data demands a robust indexing mechanism for effective organization. We propose three innovative heuristics designed to quantify and strategically reduce data space partition overlap.
Score: 0.6990493129893112
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The proliferation of interconnected devices in the Internet of Things (IoT) has led to an exponential increase in data, commonly known as Big IoT Data. Efficient retrieval of this heterogeneous data demands a robust indexing mechanism for effective organization. However, a significant challenge remains: the overlap in data space partitions during index construction. This overlap increases node access during search and retrieval, resulting in higher resource consumption and performance bottlenecks, and impeding system scalability. To address this issue, we propose three innovative heuristics designed to quantify and strategically reduce data space partition overlap. The volume-based method (VBM) offers a detailed assessment by calculating the intersection volume between partitions, providing deeper insights into spatial relationships. The distance-based method (DBM) enhances efficiency by using the distance between partition centers and radii to evaluate overlap, offering a streamlined yet accurate approach. Finally, the object-based method (OBM) provides a practical solution by counting objects across multiple partitions, delivering an intuitive understanding of data space dynamics. Experimental results demonstrate the effectiveness of these methods in reducing search time, underscoring their potential to improve data space partitioning and enhance overall system performance.
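
The three overlap measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each partition is a ball given as a (center, radius) pair, restricts the volume-based measure to 3D spheres using the standard sphere-sphere lens formula, and uses hypothetical function names (`vbm_overlap_3d`, `dbm_overlap`, `obm_overlap`) chosen only for this example. The exact scoring used in the paper may differ.

```python
import math

# Assumption: a partition is a ball described by (center, radius).
# All function names below are hypothetical, chosen for illustration.

def dbm_overlap(c1, r1, c2, r2):
    """Distance-based: how far the two balls penetrate each other.

    Returns 0.0 when the balls are disjoint or merely touching.
    """
    d = math.dist(c1, c2)
    return max(0.0, r1 + r2 - d)

def vbm_overlap_3d(c1, r1, c2, r2):
    """Volume-based (3D only): volume of the intersection of two spheres."""
    d = math.dist(c1, c2)
    if d >= r1 + r2:
        return 0.0  # disjoint or touching
    if d <= abs(r1 - r2):
        # One sphere lies entirely inside the other.
        return 4.0 / 3.0 * math.pi * min(r1, r2) ** 3
    # Standard lens-volume formula for two partially overlapping spheres.
    return (math.pi * (r1 + r2 - d) ** 2
            * (d * d + 2 * d * (r1 + r2) - 3 * (r1 - r2) ** 2)
            / (12 * d))

def obm_overlap(objects, partitions):
    """Object-based: number of objects covered by more than one partition."""
    shared = 0
    for obj in objects:
        hits = sum(1 for center, radius in partitions
                   if math.dist(obj, center) <= radius)
        if hits > 1:
            shared += 1
    return shared

if __name__ == "__main__":
    a = ((0.0, 0.0, 0.0), 1.0)
    b = ((1.0, 0.0, 0.0), 1.0)
    points = [(0.5, 0.0, 0.0), (-0.9, 0.0, 0.0), (3.0, 0.0, 0.0)]
    print(dbm_overlap(*a, *b))
    print(vbm_overlap_3d(*a, *b))
    print(obm_overlap(points, [a, b]))
```

The distance-based check needs only one distance computation per pair of partitions, while the object-based count scans every object against every partition, which reflects the cost trade-off the abstract hints at.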

Related papers

Edge-device Collaborative Computing for Multi-view Classification [9.047284788663776]
We explore collaborative inference at the edge, in which edge nodes and end devices share correlated data and the inference computational burden. We introduce selective schemes that decrease bandwidth resource consumption by effectively reducing data redundancy. Experimental results highlight that selective collaborative schemes can achieve different trade-offs between the above performance metrics.
arXiv Detail & Related papers (2024-09-24T11:07:33Z)
Hierarchical Features Matter: A Deep Exploration of GAN Priors for Improved Dataset Distillation [51.44054828384487]
We propose a novel parameterization method dubbed Hierarchical Generative Latent Distillation (H-GLaD). This method systematically explores hierarchical layers within the generative adversarial networks (GANs). In addition, we introduce a novel class-relevant feature distance metric to alleviate the computational burden associated with synthetic dataset evaluation.
arXiv Detail & Related papers (2024-06-09T09:15:54Z)
Efficient Architecture Search via Bi-level Data Pruning [70.29970746807882]
This work pioneers an exploration into the critical role of dataset characteristics for DARTS bi-level optimization. We introduce a new progressive data pruning strategy that utilizes supernet prediction dynamics as the metric. Comprehensive evaluations on the NAS-Bench-201 search space, DARTS search space, and MobileNet-like search space validate that BDP reduces search costs by over 50%.
arXiv Detail & Related papers (2023-12-21T02:48:44Z)
PREM: A Simple Yet Effective Approach for Node-Level Graph Anomaly Detection [65.24854366973794]
Node-level graph anomaly detection (GAD) plays a critical role in identifying anomalous nodes from graph-structured data in domains such as medicine, social networks, and e-commerce. We introduce a simple method termed PREprocessing and Matching (PREM for short) to improve the efficiency of GAD. Our approach streamlines GAD, reducing time and memory consumption while maintaining powerful anomaly detection capabilities.
arXiv Detail & Related papers (2023-10-18T02:59:57Z)
Learned spatial data partitioning [7.342228103959199]
We first study learned spatial data partitioning, which effectively assigns groups of big spatial data to computers based on locations of data. We formalize spatial data partitioning in the context of reinforcement learning and develop a novel deep reinforcement learning algorithm. Our method efficiently finds partitions for accelerating distance join queries and reduces the workload run time by up to 59.4%.
arXiv Detail & Related papers (2023-06-08T00:42:10Z)
Towards Sequence Utility Maximization under Utility Occupancy Measure [53.234101208024335]
In the database, although utility is a flexible criterion for each pattern, it is a more absolute criterion due to neglect of utility sharing. We first define utility occupancy on sequence data and raise the problem of High Utility-Occupancy Sequential Pattern Mining. An algorithm called Sequence Utility Maximization with Utility occupancy measure (SUMU) is proposed.
arXiv Detail & Related papers (2022-12-20T17:28:53Z)
UNETR++: Delving into Efficient and Accurate 3D Medical Image Segmentation [93.88170217725805]
We propose a 3D medical image segmentation approach, named UNETR++, that offers both high-quality segmentation masks as well as efficiency in terms of parameters, compute cost, and inference speed. The core of our design is the introduction of a novel efficient paired attention (EPA) block that efficiently learns spatial and channel-wise discriminative features. Our evaluations on five benchmarks, Synapse, BTCV, ACDC, BRaTs, and Decathlon-Lung, reveal the effectiveness of our contributions in terms of both efficiency and accuracy.
arXiv Detail & Related papers (2022-12-08T18:59:57Z)
Segmentation-guided Domain Adaptation for Efficient Depth Completion [3.441021278275805]
We propose an efficient depth completion model based on a vgg05-like CNN architecture and a semi-supervised domain adaptation approach. In order to boost spatial coherence, we guide the learning process using segmentations as an additional source of information. Our approach improves on previous efficient and low parameter state of the art approaches while having a noticeably lower computational footprint.
arXiv Detail & Related papers (2022-10-14T13:01:25Z)
Efficient Joint-Dimensional Search with Solution Space Regularization for Real-Time Semantic Segmentation [27.94898516315886]
We search for an optimal network structure that can run in real-time for this problem. A novel Solution Space Regularization (SSR) loss is first proposed to effectively encourage the supernet to converge to its discrete one. A new Hierarchical and Progressive Solution Space Shrinking method is presented to further achieve high efficiency of searching.
arXiv Detail & Related papers (2022-08-10T11:07:33Z)
DANCE: DAta-Network Co-optimization for Efficient Segmentation Model Training and Inference [85.02494022662505]
DANCE is an automated simultaneous data-network co-optimization for efficient segmentation model training and inference. It integrates automated data slimming which adaptively downsamples/drops input images and controls their corresponding contribution to the training loss guided by the images' spatial complexity. Experiments and ablation studies demonstrate that DANCE can achieve "all-win" towards efficient segmentation.
arXiv Detail & Related papers (2021-07-16T04:58:58Z)
Spatial-Spectral Residual Network for Hyperspectral Image Super-Resolution [82.1739023587565]
We propose a novel spectral-spatial residual network for hyperspectral image super-resolution (SSRNet). Our method can effectively explore spatial-spectral information by using 3D convolution instead of 2D convolution, which enables the network to better extract potential information. In each unit, we employ spatial and temporal separable 3D convolution to extract spatial and spectral information, which not only reduces unaffordable memory usage and high computational cost, but also makes the network easier to train.
arXiv Detail & Related papers (2020-01-14T03:34:55Z)

This list is automatically generated from the titles and abstracts of the papers on this site.