Related papers: Bullion: A Column Store for Machine Learning

Bullion: A Column Store for Machine Learning

URL: http://arxiv.org/abs/2404.08901v3
Date: Tue, 19 Nov 2024 07:04:06 GMT
Title: Bullion: A Column Store for Machine Learning
Authors: Gang Liao, Ye Liu, Jianjun Chen, Daniel J. Abadi,
Abstract summary: This paper presents Bullion, a columnar storage system tailored for machine learning workloads. Bundy addresses the complexities of data compliance, optimize the encoding of long sequence sparse features, efficiently manages wide-table projections, introduces feature quantization in storage, and provides a comprehensive cascading encoding framework. Preliminary experimental results and theoretical analysis demonstrate Bullion's improved ability to deliver strong performance in the face of the unique demands of machine learning workloads.
Score: 4.096087402737292
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The past two decades have witnessed significant success in applying columnar storage to data warehousing and analytics. However, the rapid growth of machine learning poses new challenges. This paper presents Bullion, a columnar storage system tailored for machine learning workloads. Bullion addresses the complexities of data compliance, optimizes the encoding of long sequence sparse features, efficiently manages wide-table projections, introduces feature quantization in storage, enables quality-aware sequential reads for multimodal training data, and provides a comprehensive cascading encoding framework that unifies diverse encoding schemes through modular, composable interfaces. By aligning with the evolving requirements of ML applications, Bullion facilitates the application of columnar storage and processing to modern application scenarios such as those within advertising, recommendation systems, and Generative AI. Preliminary experimental results and theoretical analysis demonstrate Bullion's improved ability to deliver strong performance in the face of the unique demands of machine learning workloads compared to existing columnar storage solutions. Bullion significantly reduces I/O costs for deletion compliance, achieves substantial storage savings with its optimized encoding scheme for sparse features, and improves metadata parsing speed for wide-table projections. These advancements enable Bullion to become an important component in the future of machine learning infrastructure, enabling organizations to efficiently manage and process the massive volumes of data required for training and inference in modern AI applications.

Related papers

Operationalization of Machine Learning with Serverless Architecture: An Industrial Operationalization of Machine Learning with Serverless Architecture: An Industrial Implementation for Harmonized System Code Prediction [0.0]
This paper presents a serverless MLOps framework orchestrating the complete ML lifecycle from data ingestion, training, deployment, monitoring, and retraining to using event-driven pipelines and managed services.<n>We demonstrate practical applicability through an industrial implementation for Harmonized System (HS) code prediction, a compliance-critical task where short, unstructured product descriptions are mapped to standardized codes used by customs authorities in global trade.<n>Our solution uses a custom text embedding multiple deep learning architectures, with Text-CNN achieving 98 percent accuracy on ground truth data.
arXiv Detail & Related papers (2026-02-19T05:59:55Z)
Scaling LLM Speculative Decoding: Non-Autoregressive Forecasting in Large-Batch Scenarios [76.85739138203014]
We present SpecFormer, a novel architecture that accelerates unidirectional and attention mechanisms.<n>We demonstrate that SpecFormer achieves lower training demands and reduced computational costs.
arXiv Detail & Related papers (2025-11-25T14:20:08Z)
Federated Learning with Ad-hoc Adapter Insertions: The Case of Soft-Embeddings for Training Classifier-as-Retriever [0.6524460254566904]
We propose a novel encoder in which token embeddings of a new corpus learn to produce enhanced soft embeddings for it, while requiring less compute power to update than full finetuning.<n>We adopt federated learning (FL) and differential privacy (DP) to achieve an efficient, privacy-constrained, and product-grade training solution.
arXiv Detail & Related papers (2025-09-20T03:07:03Z)
Compressive Meta-Learning [49.300635370079874]
Compressive learning is a framework that enables efficient processing by using random, non-linear features.<n>We propose a framework that meta-learns both the encoding and decoding stages of compressive learning methods.<n>We explore multiple applications -- including neural network-based compressive PCA, compressive ridge regression, compressive k-means, and autoencoders.
arXiv Detail & Related papers (2025-08-14T22:08:06Z)
GENIAL: Generative Design Space Exploration via Network Inversion for Low Power Algorithmic Logic Units [1.5845117761091052]
We introduce GENIAL, a machine learning-based framework for the automatic generation and optimization of arithmetic units.<n>We show that GENIAL is consistently more sample efficient than other methods, and converges faster towards optimized designs.<n>We also demonstrate the versatility of our approach by achieving significant improvements on Finite State Machines.
arXiv Detail & Related papers (2025-07-25T06:34:59Z)
dreaMLearning: Data Compression Assisted Machine Learning [12.875773186735396]
This paper introduces dreaMLearning, a framework that enables learning from compressed data without decompression.<n>Experiments demonstrate that dreaMLearning accelerates training by up to 8.8x, reduces memory usage by 10x, and cuts storage by 42%.<n>These advancements enhance diverse ML applications, including distributed and federated learning, and tinyML on resource-constrained edge devices.
arXiv Detail & Related papers (2025-06-27T12:57:22Z)
Scaling Test-Time Inference with Policy-Optimized, Dynamic Retrieval-Augmented Generation via KV Caching and Decoding [0.0]
We present a framework for enhancing Retrieval-Augmented Generation (RAG) systems through dynamic retrieval strategies and reinforcement fine-tuning. Our framework integrates two complementary techniques: Policy-d RetrievalAugmented Generation (PORAG) and Adaptive Token-Layer Attention Scoring (ATLAS) Our framework reduces hallucinations, strengthens domain-specific reasoning, and achieves significant efficiency and scalability gains over traditional RAG systems.
arXiv Detail & Related papers (2025-04-02T01:16:10Z)
Exploring Training and Inference Scaling Laws in Generative Retrieval [50.82554729023865]
We investigate how model size, training data scale, and inference-time compute jointly influence generative retrieval performance. Our experiments show that n-gram-based methods demonstrate strong alignment with both training and inference scaling laws. We find that LLaMA models consistently outperform T5 models, suggesting a particular advantage for larger decoder-only models in generative retrieval.
arXiv Detail & Related papers (2025-03-24T17:59:03Z)
A Universal Framework for Compressing Embeddings in CTR Prediction [68.27582084015044]
We introduce a Model-agnostic Embedding Compression (MEC) framework that compresses embedding tables by quantizing pre-trained embeddings. Our approach consists of two stages: first, we apply popularity-weighted regularization to balance code distribution between high- and low-frequency features. Experiments on three datasets reveal that our method reduces memory usage by over 50x while maintaining or improving recommendation performance.
arXiv Detail & Related papers (2025-02-21T10:12:34Z)
Dynamic Optimization of Storage Systems Using Reinforcement Learning Techniques [40.13303683102544]
This paper introduces RL-Storage, a reinforcement learning-based framework designed to dynamically optimize storage system configurations. RL-Storage learns from real-time I/O patterns and predicts optimal storage parameters, such as cache size, queue depths, and readahead settings. It achieves throughput gains of up to 2.6x and latency reductions of 43% compared to baselines.
arXiv Detail & Related papers (2024-12-29T17:41:40Z)
Dynamic Adaptation in Data Storage: Real-Time Machine Learning for Enhanced Prefetching [40.13303683102544]
This study explores the application of streaming machine learning to revolutionize data prefetching within multi-tiered storage systems. Unlike traditional batch-trained models, streaming machine learning offers adaptability, real-time insights, and computational efficiency.
arXiv Detail & Related papers (2024-12-29T17:39:37Z)
Reprogramming Foundational Large Language Models(LLMs) for Enterprise Adoption for Spatio-Temporal Forecasting Applications: Unveiling a New Era in Copilot-Guided Cross-Modal Time Series Representation Learning [0.0]
patio-temporal forecasting plays a crucial role in various sectors such as transportation systems, logistics, and supply chain management. We introduce a hybrid approach that combines the strengths of open-source large and small-scale language models (LLMs and LMs) with traditional forecasting methods.
arXiv Detail & Related papers (2024-08-26T16:11:53Z)
MaxMind: A Memory Loop Network to Enhance Software Productivity based on Large Language Models [13.839564855350295]
This paper addresses the importance of converting real-time task experiences into system memory. We show that the accumulation and recycling of task memories lead to a steady enhancement in task success rate. The inclusion of memory recycling can also boost the system's task execution efficiency by up to 25%.
arXiv Detail & Related papers (2024-08-07T15:27:22Z)
Online Adaptation of Language Models with a Memory of Amortized Contexts [82.02369596879817]
Memory of Amortized Contexts (MAC) is an efficient and effective online adaptation framework for large language models. We show how MAC can be combined with and improve the performance of popular alternatives such as retrieval augmented generations.
arXiv Detail & Related papers (2024-03-07T08:34:57Z)
Curated LLM: Synergy of LLMs and Data Curation for tabular augmentation in low-data regimes [57.62036621319563]
We introduce CLLM, which leverages the prior knowledge of Large Language Models (LLMs) for data augmentation in the low-data regime. We demonstrate the superior performance of CLLM in the low-data regime compared to conventional generators.
arXiv Detail & Related papers (2023-12-19T12:34:46Z)
In Situ Framework for Coupling Simulation and Machine Learning with Application to CFD [51.04126395480625]
Recent years have seen many successful applications of machine learning (ML) to facilitate fluid dynamic computations. As simulations grow, generating new training datasets for traditional offline learning creates I/O and storage bottlenecks. This work offers a solution by simplifying this coupling and enabling in situ training and inference on heterogeneous clusters.
arXiv Detail & Related papers (2023-06-22T14:07:54Z)
Performance Embeddings: A Similarity-based Approach to Automatic Performance Optimization [71.69092462147292]
Performance embeddings enable knowledge transfer of performance tuning between applications. We demonstrate this transfer tuning approach on case studies in deep neural networks, dense and sparse linear algebra compositions, and numerical weather prediction stencils.
arXiv Detail & Related papers (2023-03-14T15:51:35Z)
Retrieval-Enhanced Machine Learning [110.5237983180089]
We describe a generic retrieval-enhanced machine learning framework, which includes a number of existing models as special cases. REML challenges information retrieval conventions, presenting opportunities for novel advances in core areas, including optimization. REML research agenda lays a foundation for a new style of information access research and paves a path towards advancing machine learning and artificial intelligence.
arXiv Detail & Related papers (2022-05-02T21:42:45Z)
SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines. This approach however does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
Scalable Deep-Learning-Accelerated Topology Optimization for Additively Manufactured Materials [4.221095652322005]
Topology optimization (TO) is a popular and powerful computational approach for designing novel structures, materials, and devices. To address these issues, we propose a general scalable deep-learning (DL) based TO framework, referred to as SDL-TO. Our framework accelerates TO by learning the iterative history data and simultaneously training on the mapping between the given design and its gradient.
arXiv Detail & Related papers (2020-11-28T17:38:31Z)
Hardware Acceleration of Sparse and Irregular Tensor Computations of ML Models: A Survey and Insights [18.04657939198617]
This paper provides a comprehensive survey on the efficient execution of sparse and irregular tensor computations of machine learning models on hardware accelerators. It analyzes different hardware designs and acceleration techniques and analyzes them in terms of hardware and execution costs. The takeaways from this paper include: understanding the key challenges in accelerating sparse, irregular-shaped, and quantized tensors.
arXiv Detail & Related papers (2020-07-02T04:08:40Z)

This list is automatically generated from the titles and abstracts of the papers in this site.