Implementing Streaming algorithm and k-means clusters to RAG
- URL: http://arxiv.org/abs/2407.21300v3
- Date: Sat, 7 Sep 2024 15:02:48 GMT
- Title: Implementing Streaming algorithm and k-means clusters to RAG
- Authors: Haoyu Kang, Yuzhou Zhu, Yukun Zhong, Ke Wang,
- Abstract summary: We propose a new approach integrating a streaming algorithm with k-means clustering into RAG.
Our approach applied a streaming algorithm to update the index dynamically and reduce memory consumption.
We conducted comparative experiments on four methods, and the results indicated that RAG with streaming algorithm and k-means clusters outperforms traditional RAG in accuracy and memory.
- Score: 2.5251537417183028
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Retrieval-augmented generation (RAG) has achieved significant success in information retrieval to assist large language models LLMs because it builds an external knowledge database. However, it also has many problems, it consumes a lot of memory because of the enormous database, and it cannot update the established index database in time when confronted with massive streaming data. To reduce the memory required for building the database and maintain accuracy simultaneously, we proposed a new approach integrating a streaming algorithm with k-means clustering into RAG. Our approach applied a streaming algorithm to update the index dynamically and reduce memory consumption. Additionally, the k-means algorithm clusters highly similar documents, and the query time would be shortened. We conducted comparative experiments on four methods, and the results indicated that RAG with streaming algorithm and k-means clusters outperforms traditional RAG in accuracy and memory, particularly when dealing with large-scale streaming data.
Related papers
- An Enhanced Model-based Approach for Short Text Clustering [58.60681789677676]
Short text clustering has become increasingly important with the popularity of social media like Twitter, Google+, and Facebook.<n>Existing methods can be broadly categorized into two paradigms: topic model-based approaches and deep representation learning-based approaches.<n>We propose a collapsed Gibbs Sampling algorithm for the Dirichlet Multinomial Mixture model (GSDMM), which effectively handles the sparsity and high dimensionality of short texts.<n>Based on several aspects of GSDMM that warrant further refinement, we propose an improved approach, GSDMM+, designed to further optimize its performance.
arXiv Detail & Related papers (2025-07-18T10:07:42Z) - Log-Augmented Generation: Scaling Test-Time Reasoning with Reusable Computation [80.69067017594709]
Large language models (LLMs) and their agentic counterparts struggle to retain reasoning from previous tasks.<n>We propose a novel framework, log-augmented generation (LAG) that directly reuses prior computation and reasoning from past logs at test time.<n>Our method significantly outperforms standard agentic systems that do not utilize logs.
arXiv Detail & Related papers (2025-05-20T14:14:38Z) - Leveraging Approximate Caching for Faster Retrieval-Augmented Generation [1.3450852784287828]
Retrieval-augmented generation (RAG) enhances the reliability of large language model (LLM) answers by integrating external knowledge.
RAG increases the end-to-end inference time since looking for relevant documents from large vector databases is computationally expensive.
We introduce Proximity, an approximate key-value cache that optimize the RAG workflow by leveraging similarities in user queries.
arXiv Detail & Related papers (2025-03-07T15:54:04Z) - Investigating the Scalability of Approximate Sparse Retrieval Algorithms to Massive Datasets [8.1990111961557]
We investigate the behavior of state-of-the-art retrieval algorithms on massive datasets.<n>We compare and contrast the recently-proposed Seismic and graph-based solutions adapted from dense retrieval.<n>We extensively evaluate Splade embeddings of 138M passages from MsMarco-v2 and report indexing time and other efficiency and effectiveness metrics.
arXiv Detail & Related papers (2025-01-20T17:59:21Z) - Efficient Long Context Language Model Retrieval with Compression [57.09163579304332]
Long Context Language Models (LCLMs) have emerged as a new paradigm to perform Information Retrieval (IR)<n>We propose a new compression approach tailored for LCLM retrieval, which is trained to maximize the retrieval performance while minimizing the length of the compressed passages.<n>We show that CoLoR improves the retrieval performance by 6% while compressing the in-context size by a factor of 1.91.
arXiv Detail & Related papers (2024-12-24T07:30:55Z) - SALSA-CLRS: A Sparse and Scalable Benchmark for Algorithmic Reasoning [20.706469085872516]
We introduce an extension to the CLRS algorithmic learning benchmark, prioritizing scalability and the utilization of sparse representations.
Our approach includes adapted algorithms from the original CLRS benchmark and introduces new problems from distributed and randomized algorithms.
arXiv Detail & Related papers (2023-09-21T16:57:09Z) - Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data.
The first strategy performs a local neighborhood sampling to reduce the dataset size in each without violating neighborhood relationships.
A second strategy leverages a novel Re-Ranking technique, which has a lower time upper bound complexity and reduces the memory complexity from O(n2) to O(kn) with k n.
arXiv Detail & Related papers (2023-07-26T16:19:19Z) - Influence of Swarm Intelligence in Data Clustering Mechanisms [0.0]
Nature inspired Swarm based algorithms are used for data clustering to cope with larger datasets with lack and inconsistency of data.
This paper reviews the performances of these new approaches and compares which is best for certain problematic situation.
arXiv Detail & Related papers (2023-05-07T08:40:50Z) - How to Use K-means for Big Data Clustering? [2.1165011830664677]
K-means is the simplest and most widely used algorithm under the Euclidean Minimum Sum-of-Squares Clustering (MSSC) model.
We propose a new parallel scheme of using K-means and K-means++ algorithms for big data clustering.
arXiv Detail & Related papers (2022-04-14T08:18:01Z) - Asynchronous Parallel Incremental Block-Coordinate Descent for
Decentralized Machine Learning [55.198301429316125]
Machine learning (ML) is a key technique for big-data-driven modelling and analysis of massive Internet of Things (IoT) based intelligent and ubiquitous computing.
For fast-increasing applications and data amounts, distributed learning is a promising emerging paradigm since it is often impractical or inefficient to share/aggregate data.
This paper studies the problem of training an ML model over decentralized systems, where data are distributed over many user devices.
arXiv Detail & Related papers (2022-02-07T15:04:15Z) - Improved Multi-objective Data Stream Clustering with Time and Memory
Optimization [0.0]
This paper introduces a new data stream clustering method (IMOC-Stream)
It uses two different objective functions to capture different aspects of the data.
The experiments show the ability of our method to partition the data stream in arbitrarily shaped, compact, and well-separated clusters.
arXiv Detail & Related papers (2022-01-13T17:05:56Z) - Meta Clustering Learning for Large-scale Unsupervised Person
Re-identification [124.54749810371986]
We propose a "small data for big task" paradigm dubbed Meta Clustering Learning (MCL)
MCL only pseudo-labels a subset of the entire unlabeled data via clustering to save computing for the first-phase training.
Our method significantly saves computational cost while achieving a comparable or even better performance compared to prior works.
arXiv Detail & Related papers (2021-11-19T04:10:18Z) - SreaMRAK a Streaming Multi-Resolution Adaptive Kernel Algorithm [60.61943386819384]
Existing implementations of KRR require that all the data is stored in the main memory.
We propose StreaMRAK - a streaming version of KRR.
We present a showcase study on two synthetic problems and the prediction of the trajectory of a double pendulum.
arXiv Detail & Related papers (2021-08-23T21:03:09Z) - Structured Inverted-File k-Means Clustering for High-Dimensional Sparse
Data [2.487445341407889]
This paper presents an architecture-friendly k-means clustering algorithm called SIVF for a large-scale and high-dimensional sparse data set.
Our performance analysis reveals that SIVF achieves the higher speed by suppressing performance degradation factors of the number of cache misses and branch mispredictions.
arXiv Detail & Related papers (2021-03-30T07:54:02Z) - Coded Stochastic ADMM for Decentralized Consensus Optimization with Edge
Computing [113.52575069030192]
Big data, including applications with high security requirements, are often collected and stored on multiple heterogeneous devices, such as mobile devices, drones and vehicles.
Due to the limitations of communication costs and security requirements, it is of paramount importance to extract information in a decentralized manner instead of aggregating data to a fusion center.
We consider the problem of learning model parameters in a multi-agent system with data locally processed via distributed edge nodes.
A class of mini-batch alternating direction method of multipliers (ADMM) algorithms is explored to develop the distributed learning model.
arXiv Detail & Related papers (2020-10-02T10:41:59Z) - FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity
to Non-IID Data [59.50904660420082]
Federated Learning (FL) has become a popular paradigm for learning from distributed data.
To effectively utilize data at different devices without moving them to the cloud, algorithms such as the Federated Averaging (FedAvg) have adopted a "computation then aggregation" (CTA) model.
arXiv Detail & Related papers (2020-05-22T23:07:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.