Related papers: Morphis: SLO-Aware Resource Scheduling for Microservices with Time-Varying Call Graphs

URL: http://arxiv.org/abs/2602.01044v2
Date: Tue, 03 Feb 2026 03:56:21 GMT
Title: Morphis: SLO-Aware Resource Scheduling for Microservices with Time-Varying Call Graphs
Authors: Yu Tang, Hailiang Zhao, Chuansheng Lu, Yifei Zhang, Kingsum Chow, Shuiguang Deng, Rui Shi
Abstract summary: We propose Morphis, a dependency-aware framework that unifies pattern-aware trace analysis with global optimization. Our evaluations on the TrainTicket benchmark demonstrate that Morphis reduces CPU consumption by 35-38% compared to state-of-the-art baselines.
Score: 26.269214281433364
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Modern microservice systems exhibit continuous structural evolution in their runtime call graphs due to workload fluctuations, fault responses, and deployment activities. Despite this complexity, our analysis of over 500,000 production traces from ByteDance reveals a latent regularity: execution paths concentrate around a small set of recurring invocation patterns. However, existing resource management approaches fail to exploit this structure. Industrial autoscalers like Kubernetes HPA ignore inter-service dependencies, while recent academic methods often assume static topologies, rendering them ineffective under dynamic execution contexts. In this work, we propose Morphis, a dependency-aware provisioning framework that unifies pattern-aware trace analysis with global optimization. It introduces structural fingerprinting that decomposes traces into a stable execution backbone and interpretable deviation subgraphs. Then, resource allocation is formulated as a constrained optimization problem over predicted pattern distributions, jointly minimizing aggregate CPU usage while satisfying end-to-end tail-latency SLOs. Our extensive evaluations on the TrainTicket benchmark demonstrate that Morphis reduces CPU consumption by 35-38% compared to state-of-the-art baselines while maintaining 98.8% SLO compliance.
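The optimization step described in the abstract can be illustrated with a toy sketch. This is not the authors' formulation: the service names, the work figures, the pattern probabilities, the inverse-CPU latency model, and the brute-force search are all assumptions made for illustration. It only shows the general shape of the problem: choose per-service CPU so that every predicted invocation pattern meets an end-to-end latency bound while total CPU is minimized.

```python
from itertools import product

# Hypothetical toy inputs; none of these values come from the paper.
WORK = {"gateway": 20.0, "order": 60.0, "payment": 40.0}  # ms of work at 1 CPU
PATTERNS = {  # predicted pattern -> (probability, services on its call path)
    "browse": (0.7, ["gateway", "order"]),
    "checkout": (0.3, ["gateway", "order", "payment"]),
}
SLO_MS = 65.0  # assumed end-to-end latency bound
MAX_CPU = 4    # assumed per-service CPU cap (integer units)


def latency(path, alloc):
    """Assumed model: each service's latency scales inversely with its CPU."""
    return sum(WORK[s] / alloc[s] for s in path)


def cheapest_allocation(min_prob=0.05):
    """Exhaustively search integer allocations for the lowest total CPU
    that keeps every sufficiently likely pattern within the SLO."""
    services = sorted(WORK)
    best = None
    for cpus in product(range(1, MAX_CPU + 1), repeat=len(services)):
        alloc = dict(zip(services, cpus))
        feasible = all(
            latency(path, alloc) <= SLO_MS
            for prob, path in PATTERNS.values()
            if prob >= min_prob
        )
        if feasible and (best is None or sum(cpus) < sum(best.values())):
            best = alloc
    return best


if __name__ == "__main__":
    best = cheapest_allocation()
    print(best, "total CPU:", sum(best.values()))
```

Exhaustive search is only workable for a handful of services; a real system would need a proper solver and a tail-latency model rather than the simple sum used here.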

Related papers

The Curious Case of In-Training Compression of State Space Models [49.819321766705514]
State Space Models (SSMs) tackle long sequence modeling tasks efficiently, offering both parallelizable training and fast inference. A key design challenge is striking the right balance between maximizing expressivity and limiting this computational burden. Our approach, CompreSSM, applies to Linear Time-Invariant SSMs such as Linear Recurrent Units, but is also extendable to selective models.
arXiv Detail & Related papers (2025-10-03T09:02:33Z)
Semantic-Aware Scheduling for GPU Clusters with Large Language Models [60.14838697778884]
We propose SchedMate, a framework that bridges the semantic gap between schedulers and jobs they manage. SchedMate extracts deep insights from overlooked, unstructured data sources: source code, runtime logs, and historical jobs. We show SchedMate reduces average job completion times by up to 1.91x, substantially enhancing the scheduling performance.
arXiv Detail & Related papers (2025-10-02T02:01:02Z)
Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel Execution [48.7788770680643]
Flash-Searcher is a novel parallel agent reasoning framework. It decomposes complex tasks into subtasks with explicit dependencies, enabling concurrent execution of independent reasoning paths. It achieves 67.7% accuracy on BrowseComp and 83% on xbench-DeepSearch, while reducing agent execution steps by up to 35% compared to current frameworks.
arXiv Detail & Related papers (2025-09-29T17:39:30Z)
CSGO: Generalized Optimization for Cold Start in Wireless Collaborative Edge LLM Systems [62.24576366776727]
We propose a latency-aware scheduling framework to minimize total inference latency. We show that the proposed method significantly reduces cold-start latency compared to baseline strategies.
arXiv Detail & Related papers (2025-08-15T07:49:22Z)
Learning Unified System Representations for Microservice Tail Latency Prediction [8.532290784939967]
Microservice architectures have become the de facto standard for building scalable cloud-native applications. Traditional approaches often rely on per-request latency metrics, which are highly sensitive to transient noise. We propose USRFNet, a deep learning network that explicitly separates and models traffic-side and resource-side features.
arXiv Detail & Related papers (2025-08-03T07:46:23Z)
RLHGNN: Reinforcement Learning-driven Heterogeneous Graph Neural Network for Next Activity Prediction in Business Processes [14.031370458128068]
Next activity prediction is a challenge for optimizing business processes in service-oriented architectures. We introduce RLHGNN, a novel framework that transforms event logs into heterogeneous process graphs. We show that RLHGNN consistently outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2025-07-03T15:01:08Z)
Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute [61.00662702026523]
We propose a unified Test-Time Compute (TTC) scaling framework that leverages increased inference-time computation instead of larger models. Our framework incorporates two complementary strategies: internal TTC and external TTC. We demonstrate that our 32B model achieves a 46% issue resolution rate, surpassing significantly larger models such as DeepSeek R1 671B and OpenAI o1.
arXiv Detail & Related papers (2025-03-31T07:31:32Z)
Large Language Models as Realistic Microservice Trace Generators [48.730974361862366]
This paper proposes a first-of-a-kind approach that relies on training a large language model (LLM) to generate synthetic workload traces. We show that TraceLLM produces diverse, realistic traces under varied conditions, outperforming existing approaches in both accuracy and validity. TraceLLM adapts to downstream trace-related tasks, such as predicting key trace features and infilling missing data.
arXiv Detail & Related papers (2024-12-16T12:48:04Z)
Research on the Application of Spark Streaming Real-Time Data Analysis System and large language model Intelligent Agents [1.4582633500696451]
This study explores the integration of Agent AI with LangGraph to enhance real-time data analysis systems in big data environments. The proposed framework overcomes limitations of static, inefficient stateful computations, and lack of human intervention. System architecture incorporates Apache Spark Streaming, Kafka, and LangGraph to create a high-performance sentiment analysis system.
arXiv Detail & Related papers (2024-12-10T05:51:11Z)
A Microservices Identification Method Based on Spectral Clustering for Industrial Legacy Systems [5.255685751491305]
We propose an automated microservice decomposition method for extracting microservice candidates based on spectral graph theory. We show that our method can yield favorable results even without the involvement of domain experts.
arXiv Detail & Related papers (2023-12-20T07:47:01Z)