Related papers: Client Error Clustering Approaches in Content Delivery Networks (CDN)

URL: http://arxiv.org/abs/2210.05314v1
Date: Tue, 11 Oct 2022 10:14:07 GMT
Title: Client Error Clustering Approaches in Content Delivery Networks (CDN)
Authors: Ermiyas Birihanu, Jiyan Mahmud, Péter Kiss, Adolf Kamuzora, Wadie Skaf, Tomáš Horváth, Tamás Jursonovics, Peter Pogrzeba and Imre Lendák
Abstract summary: CDN operators face a significant challenge when analyzing billions of web server and proxy logs generated by their systems. This study analyzed the applicability of various clustering methods in CDN error log analysis. Our experiments were run on a dataset consisting of proxy logs collected over a 7-day period from a single, physical CDN server.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Content delivery networks (CDNs) are the backbone of the Internet and are key in delivering high quality video on demand (VoD), web content and file services to billions of users. CDNs usually consist of hierarchically organized content servers positioned as close to the customers as possible. CDN operators face a significant challenge when analyzing billions of web server and proxy logs generated by their systems. The main objective of this study was to analyze the applicability of various clustering methods in CDN error log analysis. We worked with real-life CDN proxy logs, identified key features included in the logs (e.g., content type, HTTP status code, time-of-day, host) and clustered the log lines corresponding to different host types offering live TV, video on demand, file caching and web content. Our experiments were run on a dataset consisting of proxy logs collected over a 7-day period from a single, physical CDN server running multiple types of services (VoD, live TV, file). The dataset consisted of 2.2 billion log lines. Our analysis showed that CDN error clustering is a viable approach towards identifying recurring errors and improving overall quality of service.
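Illustrative sketch: the abstract names the features used for clustering (content type, HTTP status code, time-of-day, host) but not a specific algorithm. As a rough illustration only, the snippet below keeps client-error (4xx) log lines and groups them with a tiny k-modes routine (Hamming distance on categorical features). The record fields, the sample values and the choice of k-modes are all assumptions made for this sketch, not the authors' method or data.

```python
from collections import Counter

# Hypothetical, hand-made records; real CDN proxy logs carry many more fields.
LOGS = [
    {"status": 404, "content_type": "video", "period": "evening", "host": "vod"},
    {"status": 200, "content_type": "video", "period": "evening", "host": "vod"},
    {"status": 404, "content_type": "video", "period": "evening", "host": "vod"},
    {"status": 403, "content_type": "file", "period": "morning", "host": "file"},
    {"status": 404, "content_type": "video", "period": "night", "host": "vod"},
    {"status": 403, "content_type": "file", "period": "morning", "host": "file"},
    {"status": 200, "content_type": "file", "period": "morning", "host": "file"},
    {"status": 403, "content_type": "file", "period": "afternoon", "host": "file"},
]
KEYS = ("status", "content_type", "period", "host")


def is_client_error(record):
    """HTTP 4xx status codes denote client errors."""
    return 400 <= record["status"] < 500


def hamming(a, b):
    """Number of positions at which two feature tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def k_modes(points, k, max_iters=10):
    """Cluster categorical tuples; centers are per-feature modes."""
    # Deterministic initialisation: the first k distinct points.
    centers = []
    for p in points:
        if p not in centers:
            centers.append(p)
        if len(centers) == k:
            break
    labels = []
    for _ in range(max_iters):
        # Assign each point to its nearest center (ties go to the lower index).
        labels = [
            min(range(len(centers)), key=lambda c: hamming(p, centers[c]))
            for p in points
        ]
        # Recompute each center as the most common value per feature.
        new_centers = []
        for c, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if not members:
                new_centers.append(old)
                continue
            new_centers.append(
                tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))
            )
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


errors = [tuple(r[key] for key in KEYS) for r in LOGS if is_client_error(r)]
labels, centers = k_modes(errors, k=2)
for center, size in zip(centers, Counter(labels).values()):
    print(center, size)
```

On this toy input the two cluster centers separate the 404 errors on the VoD host from the 403 errors on the file host, which is the kind of recurring-error grouping the abstract describes.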

Related papers

Detecting Distributed Denial of Service Attacks Using Logistic Regression and SVM Methods [0.0]
The goal of this paper is to detect DDoS attacks among all service requests and classify them according to DDoS classes. Two machine learning approaches, SVM and Logistic Regression, are applied to the dataset to detect and classify DDoS attacks. Logistic Regression and SVM both achieve 98.65% classification accuracy, the highest accuracy among previous experiments with the same dataset.
arXiv Detail & Related papers (2024-11-21T13:15:26Z)
Exploring QUIC Dynamics: A Large-Scale Dataset for Encrypted Traffic Analysis [7.795761092358769]
We introduce VisQUIC, a publicly available dataset of over 100,000 labeled QUIC traces with corresponding SSL keys. By generating visual representations of the traces, we facilitate advanced machine learning (ML) applications and in-depth analysis of encrypted QUIC traffic. Our dataset enables comprehensive studies on QUIC and HTTP/3 protocols and supports the development of tools for encrypted traffic analysis.
arXiv Detail & Related papers (2024-09-30T10:50:12Z)
Unveiling the Bandwidth Nightmare: CDN Compression Format Conversion Attacks [20.374230089231766]
We present a novel HTTP amplification attack, CDN Compression Format Convert (CDN-Convet) Attacks. It allows attackers to massively exhaust not only the outgoing bandwidth of the origin servers deployed behind CDNs but also the bandwidth of CDN surrogate nodes. We examined the CDN-Convet attacks on 11 popular CDNs to evaluate the feasibility and real-world impacts.
arXiv Detail & Related papers (2024-09-01T13:03:47Z)
Sparse-DySta: Sparsity-Aware Dynamic and Static Scheduling for Sparse Multi-DNN Workloads [65.47816359465155]
Running multiple deep neural networks (DNNs) in parallel has become an emerging workload on edge devices. We propose Dysta, a novel scheduler that utilizes both static sparsity patterns and dynamic sparsity information for sparse multi-DNN scheduling. Our proposed approach outperforms state-of-the-art methods with up to a 10% decrease in latency constraint violation rate and a nearly 4X reduction in average normalized turnaround time.
arXiv Detail & Related papers (2023-10-17T09:25:17Z)
Timely Asynchronous Hierarchical Federated Learning: Age of Convergence [59.96266198512243]
We consider an asynchronous hierarchical federated learning setting with a client-edge-cloud framework. The clients exchange the trained parameters with their corresponding edge servers, which update the locally aggregated model. The goal of each client is to converge to the global model, while maintaining timeliness of the clients.
arXiv Detail & Related papers (2023-06-21T17:39:16Z)
DATA: Domain-Aware and Task-Aware Pre-training [94.62676913928831]
We present DATA, a simple yet effective NAS approach specialized for self-supervised learning (SSL). Our method achieves promising results across a wide range of computation costs on downstream tasks, including image classification, object detection and semantic segmentation.
arXiv Detail & Related papers (2022-03-17T02:38:49Z)
Multi-Perspective Content Delivery Networks Security Framework Using Optimized Unsupervised Anomaly Detection [9.102485917295587]
We propose a multi-perspective unsupervised learning framework for anomaly detection in CDNs. In the proposed framework, a multi-perspective feature engineering approach, an optimized unsupervised anomaly detection model, and a multi-perspective validation method, are developed. Experimental results are presented based on the analytics of eight days of real-world CDN log data provided by a major CDN operator.
arXiv Detail & Related papers (2021-07-24T02:43:23Z)
Analyzing Machine Learning Approaches for Online Malware Detection in Cloud [0.0]
We present online malware detection based on process-level performance metrics and analyze the effectiveness of different machine learning models. Our analysis concludes that neural network models can most accurately detect the impact malware has on the process-level features of virtual machines in the cloud.
arXiv Detail & Related papers (2021-05-19T17:28:12Z)
Graph Prototypical Networks for Few-shot Learning on Attributed Networks [72.31180045017835]
We propose a graph meta-learning framework, Graph Prototypical Networks (GPN). GPN is able to perform meta-learning on an attributed network and derive a highly generalizable model for handling the target classification task.
arXiv Detail & Related papers (2020-06-23T04:13:23Z)
DiagNet: towards a generic, Internet-scale root cause analysis solution [0.0]
We show how different machine learning techniques can be used for Internet-scale root cause analysis. Our solution, DiagNet, adapts concepts from image processing research to handle network and system metrics. We demonstrate promising root cause analysis capabilities, with a recall of 73.9% including causes only being introduced at inference time.
arXiv Detail & Related papers (2020-04-07T13:21:32Z)
Rethinking Object Detection in Retail Stores [55.359582952686175]
We propose a new task, simultaneous object localization and counting, abbreviated as Locount. Locount requires algorithms to localize groups of objects of interest together with their number of instances. We collect a large-scale object localization and counting dataset with rich annotations in retail stores.
arXiv Detail & Related papers (2020-03-18T14:01:54Z)
Neural Data Server: A Large-Scale Search Engine for Transfer Learning Data [78.74367441804183]
We introduce Neural Data Server (NDS), a large-scale search engine for finding the most useful transfer learning data for the target domain. NDS consists of a dataserver which indexes several large popular image datasets and aims to recommend data to a client. We show the effectiveness of NDS in various transfer learning scenarios, demonstrating state-of-the-art performance on several target datasets.
arXiv Detail & Related papers (2020-01-09T01:21:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.