TraceMesh: Scalable and Streaming Sampling for Distributed Traces
- URL: http://arxiv.org/abs/2406.06975v1
- Date: Tue, 11 Jun 2024 06:13:58 GMT
- Title: TraceMesh: Scalable and Streaming Sampling for Distributed Traces
- Authors: Zhuangbin Chen, Zhihan Jiang, Yuxin Su, Michael R. Lyu, Zibin Zheng,
- Abstract summary: TraceMesh is a scalable and streaming sampler for distributed traces.
It accommodates previously unseen trace features in a unified and streamlined way.
TraceMesh outperforms state-of-the-art methods by a significant margin in both sampling accuracy and efficiency.
- Score: 51.08892669409318
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Distributed tracing serves as a fundamental element in the monitoring of cloud-based and datacenter systems. It provides visibility into the full lifecycle of a request or operation across multiple services, which is essential for understanding system dependencies and performance bottlenecks. To mitigate computational and storage overheads, most tracing frameworks adopt a uniform sampling strategy, which inevitably captures overlapping and redundant information. More advanced methods employ learning-based approaches to bias the sampling toward more informative traces. However, existing methods fall short of considering the high-dimensional and dynamic nature of trace data, which is essential for the production deployment of trace sampling. To address these practical challenges, in this paper we present TraceMesh, a scalable and streaming sampler for distributed traces. TraceMesh employs Locality-Sensitivity Hashing (LSH) to improve sampling efficiency by projecting traces into a low-dimensional space while preserving their similarity. In this process, TraceMesh accommodates previously unseen trace features in a unified and streamlined way. Subsequently, TraceMesh samples traces through evolving clustering, which dynamically adjusts the sampling decision to avoid over-sampling of recurring traces. The proposed method is evaluated with trace data collected from both open-source microservice benchmarks and production service systems. Experimental results demonstrate that TraceMesh outperforms state-of-the-art methods by a significant margin in both sampling accuracy and efficiency.
Related papers
- Stochastic Localization via Iterative Posterior Sampling [2.1383136715042417]
We consider a general localization framework and introduce an explicit class of observation processes, associated with flexible denoising schedules.
We provide a complete methodology, $textitStochastic localization via Iterative Posterior Sampling$ (SLIPS), to obtain approximate samples of this dynamics, and as a byproduct, samples from the target distribution.
We illustrate the benefits and applicability of SLIPS on several benchmarks of multi-modal distributions, including mixtures in increasing dimensions, logistic regression and high-dimensional field system from statistical-mechanics.
arXiv Detail & Related papers (2024-02-16T15:28:41Z) - Diffusion Generative Flow Samplers: Improving learning signals through
partial trajectory optimization [87.21285093582446]
Diffusion Generative Flow Samplers (DGFS) is a sampling-based framework where the learning process can be tractably broken down into short partial trajectory segments.
Our method takes inspiration from the theory developed for generative flow networks (GFlowNets)
arXiv Detail & Related papers (2023-10-04T09:39:05Z) - DAS: Densely-Anchored Sampling for Deep Metric Learning [43.81322638018864]
We propose a Densely-Anchored Sampling (DAS) scheme that exploits the anchor's nearby embedding space to densely produce embeddings without data points.
Our method is effortlessly integrated into existing DML frameworks and improves them without bells and whistles.
arXiv Detail & Related papers (2022-07-30T02:07:46Z) - A Bayesian Detect to Track System for Robust Visual Object Tracking and
Semi-Supervised Model Learning [1.7268829007643391]
We ad-dress problems in a Bayesian tracking and detection framework parameterized by neural network outputs.
We propose a particle filter-based approximate sampling algorithm for tracking object state estimation.
Based on our particle filter inference algorithm, a semi-supervised learn-ing algorithm is utilized for learning tracking network on intermittent labeled frames.
arXiv Detail & Related papers (2022-05-05T00:18:57Z) - An Efficient Anomaly Detection Approach using Cube Sampling with
Streaming Data [2.0515785954568626]
Anomaly detection is critical in various fields, including intrusion detection, health monitoring, fault diagnosis, and sensor network event detection.
The isolation forest (or iForest) approach is a well-known technique for detecting anomalies.
We propose an efficient iForest based approach for anomaly detection using cube sampling that is effective on streaming data.
arXiv Detail & Related papers (2021-10-05T04:23:00Z) - Beyond Farthest Point Sampling in Point-Wise Analysis [52.218037492342546]
We propose a novel data-driven sampler learning strategy for point-wise analysis tasks.
We learn sampling and downstream applications jointly.
Our experiments show that jointly learning of the sampler and task brings remarkable improvement over previous baseline methods.
arXiv Detail & Related papers (2021-07-09T08:08:44Z) - WSSOD: A New Pipeline for Weakly- and Semi-Supervised Object Detection [75.80075054706079]
We propose a weakly- and semi-supervised object detection framework (WSSOD)
An agent detector is first trained on a joint dataset and then used to predict pseudo bounding boxes on weakly-annotated images.
The proposed framework demonstrates remarkable performance on PASCAL-VOC and MSCOCO benchmark, achieving a high performance comparable to those obtained in fully-supervised settings.
arXiv Detail & Related papers (2021-05-21T11:58:50Z) - Learning to Count in the Crowd from Limited Labeled Data [109.2954525909007]
We focus on reducing the annotation efforts by learning to count in the crowd from limited number of labeled samples.
Specifically, we propose a Gaussian Process-based iterative learning mechanism that involves estimation of pseudo-ground truth for the unlabeled data.
arXiv Detail & Related papers (2020-07-07T04:17:01Z) - Cascaded Regression Tracking: Towards Online Hard Distractor
Discrimination [202.2562153608092]
We propose a cascaded regression tracker with two sequential stages.
In the first stage, we filter out abundant easily-identified negative candidates.
In the second stage, a discrete sampling based ridge regression is designed to double-check the remaining ambiguous hard samples.
arXiv Detail & Related papers (2020-06-18T07:48:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.