Extracting Overlapping Microservices from Monolithic Code via Deep Semantic Embeddings and Graph Neural Network-Based Soft Clustering
- URL: http://arxiv.org/abs/2508.07486v1
- Date: Sun, 10 Aug 2025 21:07:20 GMT
- Title: Extracting Overlapping Microservices from Monolithic Code via Deep Semantic Embeddings and Graph Neural Network-Based Soft Clustering
- Authors: Morteza Ziabakhsh, Kiyan Rezaee, Sadegh Eskandari, Seyed Amir Hossein Tabatabaei, Mohammad M. Ghassemi
- Abstract summary: Mo2oM is a framework that formulates microservice extraction as a soft clustering problem. Mo2oM achieves improvements of up to 40.97% in structural modularity (balancing cohesion and coupling), 58% in inter-service call percentage (communication overhead), and 26.16% in interface number.
- Score: 1.6580952309590864
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Modern software systems are increasingly shifting from monolithic architectures to microservices to enhance scalability, maintainability, and deployment flexibility. Existing microservice extraction methods typically rely on hard clustering, assigning each software component to a single microservice. This approach often increases inter-service coupling and reduces intra-service cohesion. We propose Mo2oM (Monolithic to Overlapping Microservices), a framework that formulates microservice extraction as a soft clustering problem, allowing components to belong probabilistically to multiple microservices. This approach is inspired by expert-driven decompositions, where practitioners intentionally replicate certain software components across services to reduce communication overhead. Mo2oM combines deep semantic embeddings with structural dependencies extracted from method-call graphs to capture both functional and architectural relationships. A graph neural network-based soft clustering algorithm then generates the final set of microservices. We evaluate Mo2oM on four open-source monolithic benchmarks and compare it against eight state-of-the-art baselines. Our results demonstrate that Mo2oM achieves improvements of up to 40.97% in structural modularity (balancing cohesion and coupling), 58% in inter-service call percentage (communication overhead), 26.16% in interface number (modularity and decoupling), and 38.96% in non-extreme distribution (service size balance) across all benchmarks.
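The contrast between hard and soft (overlapping) assignment described in the abstract can be illustrated with a small sketch. This is not Mo2oM's algorithm: the membership probabilities, the threshold rule, the toy call list, and the function names `overlapping_assignment` and `inter_service_call_fraction` are all hypothetical, and treating a call as intra-service whenever caller and callee share at least one service is an assumption made only for this example.

```python
def overlapping_assignment(membership, threshold):
    """Map each component to a set of services.

    A component joins every service whose membership probability is at
    least `threshold`, and always joins its single most likely service.
    (Hypothetical rule, chosen only for illustration.)
    """
    services = []
    for probs in membership:
        chosen = {k for k, p in enumerate(probs) if p >= threshold}
        chosen.add(max(range(len(probs)), key=probs.__getitem__))
        services.append(chosen)
    return services


def inter_service_call_fraction(calls, services):
    """Fraction of calls whose caller and callee share no service.

    Assumption: a replicated component can be called locally, so a call
    only crosses a boundary when the two service sets are disjoint.
    """
    if not calls:
        return 0.0
    crossing = sum(1 for a, b in calls if not (services[a] & services[b]))
    return crossing / len(calls)


# Made-up membership probabilities for three components and two services.
membership = [
    [0.9, 0.1],  # component 0: clearly service 0
    [0.5, 0.5],  # component 1: shared utility used by both sides
    [0.1, 0.9],  # component 2: clearly service 1
]
calls = [(0, 1), (2, 1), (0, 2)]  # (caller, callee) pairs

# A threshold above 1.0 keeps only the top service, i.e. hard clustering.
hard = overlapping_assignment(membership, threshold=1.1)
soft = overlapping_assignment(membership, threshold=0.4)

print("hard:", hard, inter_service_call_fraction(calls, hard))
print("soft:", soft, inter_service_call_fraction(calls, soft))
```

In this toy setup, replicating the shared component in both services removes one of the two cross-service calls, which is the intuition behind the communication-overhead argument; the actual metrics and thresholds used in the paper may differ.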
Related papers
- From Monolith to Microservices: A Comparative Evaluation of Decomposition Frameworks [1.516795490965608]
This work presents a unified evaluation of state-of-the-art microservice decomposition approaches spanning static, dynamic, and hybrid techniques. We assess the decomposition quality across widely used benchmark systems (JPetStore, AcmeAir, DayTrader, and Plants) using Structural Modularity (SM), Interface Number (IFN), Inter-partition Communication (ICP), Non-Extreme Distribution (NED), and related indicators. Findings indicate that the hierarchical clustering-based methods, particularly HDBScan, produce the most consistently balanced decompositions across benchmarks.
arXiv Detail & Related papers (2026-01-30T16:28:47Z)
- FMIP: Joint Continuous-Integer Flow For Mixed-Integer Linear Programming [52.52020895303244]
Mixed-Integer Linear Programming (MILP) is a foundational tool for complex decision-making problems. We propose Joint Continuous-Integer Flow for Mixed-Integer Linear Programming (FMIP), which is the first generative framework that models joint distribution of both integer and continuous variables for MILP solutions. FMIP is fully compatible with arbitrary backbone networks and various downstream solvers, making it well-suited for a broad range of real-world MILP applications.
arXiv Detail & Related papers (2025-07-31T10:03:30Z)
- Centrality Change Proneness: an Early Indicator of Microservice Architectural Degradation [48.55946052680251]
The study of temporal networks has emerged as a way to describe and analyze evolving networks. Previous research has explored how software metrics such as size, complexity, and quality are related to microservice centrality. This study investigates whether temporal centrality metrics can provide insight into the early detection of architectural degradation.
arXiv Detail & Related papers (2025-06-09T12:22:12Z)
- MONO2REST: Identifying and Exposing Microservices: a Reusable RESTification Approach [0.7499722271664147]
Many organizations are pursuing the migration of legacy monolithic systems to a microservices architectural style. This process is challenging, risky, time-intensive, and prone to failure, and several organizations lack the necessary financial resources, time, or expertise to set up this migration process. We propose exposing a legacy system as a microservice application without having to migrate it.
arXiv Detail & Related papers (2025-03-27T14:10:33Z)
- Reinforced Model Merging [53.84354455400038]
We present an innovative framework termed Reinforced Model Merging (RMM), which encompasses an environment and agent tailored for merging tasks. By utilizing data subsets during the evaluation process, we addressed the bottleneck in the reward feedback phase, thereby accelerating RMM by up to 100 times.
arXiv Detail & Related papers (2025-03-27T08:52:41Z)
- Network Centrality as a New Perspective on Microservice Architecture [48.55946052680251]
The adoption of Microservice Architecture has led to the identification of various patterns and anti-patterns, such as Nano/Mega/Hub services. This study investigates whether centrality metrics (CMs) can provide new insights into MSA quality and facilitate the detection of architectural anti-patterns.
arXiv Detail & Related papers (2025-01-23T10:13:57Z)
- Migration to Microservices: A Comparative Study of Decomposition Strategies and Analysis Metrics [0.5076419064097734]
We present a novel clustering method to identify potential microservices in a given monolithic application.
Our approach employs a density-based clustering algorithm considering static analysis, structural, and semantic relationships between classes.
arXiv Detail & Related papers (2024-02-13T14:15:00Z)
- A Microservices Identification Method Based on Spectral Clustering for Industrial Legacy Systems [5.255685751491305]
We propose an automated microservice decomposition method for extracting microservice candidates based on spectral graph theory.
We show that our method can yield favorable results even without the involvement of domain experts.
arXiv Detail & Related papers (2023-12-20T07:47:01Z)
- From Kubernetes to Knactor: A Data-Centric Rethink of Service Composition [5.250111701278031]
Microservices are increasingly used in modern applications, leading to a growing need for effective service composition solutions.
Traditional API-centric composition mechanisms introduce rigid code-level coupling, scatter logic, and hinder visibility into cross-service data exchanges.
We propose a rethinking of service composition and present Knactor, a new data-centric framework to restore the modularity that microservices were intended to offer.
arXiv Detail & Related papers (2023-09-04T20:46:05Z)
- Reclaimer: A Reinforcement Learning Approach to Dynamic Resource Allocation for Cloud Microservices [4.397680391942813]
We introduce Reclaimer, a deep learning model that adapts to runtime changes in the number and behavior of microservices in order to minimize CPU core allocation while meeting requirements.
When evaluated with two micro-service-based applications, Reclaimer reduces the mean CPU core allocation by 38.4% to 74.4% relative to the industry-standard scaling solution.
arXiv Detail & Related papers (2023-04-17T01:44:05Z)
- Efficient Micro-Structured Weight Unification and Pruning for Neural Network Compression [56.83861738731913]
Deep Neural Network (DNN) models are essential for practical applications, especially for resource limited devices.
Previous unstructured or structured weight pruning methods can hardly truly accelerate inference.
We propose a generalized weight unification framework at a hardware-compatible micro-structured level to achieve a high amount of compression and acceleration.
arXiv Detail & Related papers (2021-06-15T17:22:59Z)
- Clustered Federated Learning via Generalized Total Variation Minimization [83.26141667853057]
We study optimization methods to train local (or personalized) models for local datasets with a decentralized network structure.
Our main conceptual contribution is to formulate federated learning as generalized total variation (GTV) minimization.
Our main algorithmic contribution is a fully decentralized federated learning algorithm.
arXiv Detail & Related papers (2021-05-26T18:07:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.