Cascaded Transformer for Robust and Scalable SLA Decomposition via Amortized Optimization
- URL: http://arxiv.org/abs/2601.11859v1
- Date: Sat, 17 Jan 2026 01:01:53 GMT
- Title: Cascaded Transformer for Robust and Scalable SLA Decomposition via Amortized Optimization
- Authors: Cyril Shih-Huan Hsu,
- Abstract summary: 6G networks increasingly rely on network slicing to provide tailored, End-to-End (E2E) logical networks over shared physical infrastructures. Current solutions handle the decomposition of E2E SLAs through computationally intensive, iterative optimization processes that incur substantial latency and complexity. We introduce Casformer, a cascaded Transformer architecture designed for fast, optimization-free SLA decomposition.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The evolution toward 6G networks increasingly relies on network slicing to provide tailored, End-to-End (E2E) logical networks over shared physical infrastructures. A critical challenge is effectively decomposing E2E Service Level Agreements (SLAs) into domain-specific SLAs, which current solutions handle through computationally intensive, iterative optimization processes that incur substantial latency and complexity. To address this, we introduce Casformer, a cascaded Transformer architecture designed for fast, optimization-free SLA decomposition. Casformer leverages historical domain feedback encoded through domain-specific Transformer encoders in its first layer, and integrates cross-domain dependencies using a Transformer-based aggregator in its second layer. The model is trained under a learning paradigm inspired by Domain-Informed Neural Networks (DINNs), incorporating risk-informed modeling and amortized optimization to learn a stable, forward-only SLA decomposition policy. Extensive evaluations demonstrate that Casformer achieves improved SLA decomposition quality against state-of-the-art optimization-based frameworks, while exhibiting enhanced scalability and robustness under volatile and noisy network conditions. In addition, its forward-only design reduces runtime complexity and simplifies deployment and maintenance. These insights reveal the potential of combining amortized optimization with Transformer-based sequence modeling to advance network automation, providing a scalable and efficient solution suitable for real-time SLA management in advanced 5G-and-beyond network environments.
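The abstract outlines a two-layer design: domain-specific Transformer encoders over historical feedback, followed by a Transformer aggregator that mixes the per-domain summaries before emitting the decomposition. A minimal PyTorch sketch of that cascade follows; the module sizes, mean-pooling, and softmax budget-splitting head are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class CasformerSketch(nn.Module):
    """Illustrative two-layer cascade: per-domain encoders feed a cross-domain
    aggregator that splits an end-to-end SLA budget (hypothetical sketch)."""

    def __init__(self, num_domains: int, feat_dim: int, d_model: int = 64):
        super().__init__()

        def encoder() -> nn.TransformerEncoder:
            layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
            return nn.TransformerEncoder(layer, num_layers=2)

        self.embed = nn.Linear(feat_dim, d_model)
        # Layer 1: one Transformer encoder per domain over its feedback history.
        self.domain_encoders = nn.ModuleList(encoder() for _ in range(num_domains))
        # Layer 2: a Transformer aggregator over the per-domain summary tokens.
        self.aggregator = encoder()
        self.head = nn.Linear(d_model, 1)  # one decomposition weight per domain

    def forward(self, histories: torch.Tensor, e2e_sla: torch.Tensor) -> torch.Tensor:
        # histories: (batch, num_domains, seq_len, feat_dim) of past domain feedback
        # e2e_sla:   (batch,) end-to-end budget (e.g., latency) to split across domains
        summaries = [enc(self.embed(histories[:, d])).mean(dim=1)  # pool each history
                     for d, enc in enumerate(self.domain_encoders)]
        tokens = torch.stack(summaries, dim=1)   # (batch, num_domains, d_model)
        mixed = self.aggregator(tokens)          # model cross-domain dependencies
        weights = self.head(mixed).squeeze(-1).softmax(dim=-1)
        return weights * e2e_sla.unsqueeze(-1)   # per-domain SLA budgets
```

At inference, a single forward pass maps feedback histories and an E2E budget to per-domain budgets, which is what makes such a policy optimization-free at run time.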
Related papers
- SpanNorm: Reconciling Training Stability and Performance in Deep Transformers [55.100133502295996]
We propose SpanNorm, a novel technique designed to resolve the stability-performance dilemma by integrating the strengths of both paradigms. We provide a theoretical analysis demonstrating that SpanNorm, combined with a principled scaling strategy, maintains bounded signal variance throughout the network. Empirically, SpanNorm consistently outperforms standard normalization schemes in both dense and Mixture-of-Experts (MoE) scenarios.
arXiv Detail & Related papers (2026-01-30T05:21:57Z)
- Rethinking Autoregressive Models for Lossless Image Compression via Hierarchical Parallelism and Progressive Adaptation [75.58269386927076]
Autoregressive (AR) models are often dismissed as impractical due to prohibitive computational cost. This work rethinks this paradigm, introducing a framework built on hierarchical parallelism and progressive adaptation. Experiments on diverse datasets (natural, satellite, medical) validate that our method achieves new state-of-the-art compression.
arXiv Detail & Related papers (2025-11-14T06:27:58Z)
- CollaPipe: Adaptive Segment-Optimized Pipeline Parallelism for Collaborative LLM Training in Heterogeneous Edge Networks [57.95170323315603]
We introduce CollaPipe, a distributed learning framework that integrates collaborative pipeline parallelism with federated aggregation to support self-evolving networks. In CollaPipe, the encoder part is adaptively partitioned into variable-sized segments and deployed across mobile devices for pipeline-parallel training, while the decoder is deployed on edge servers to handle generative tasks. To enhance training efficiency, we formulate a joint optimization problem that adaptively allocates model segments, micro-batches, bandwidth, and transmission power.
arXiv Detail & Related papers (2025-09-24T07:54:01Z)
- Revisiting the Privacy Risks of Split Inference: A GAN-Based Data Reconstruction Attack via Progressive Feature Optimization [49.32786615205064]
Split Inference (SI) partitions computation between edge devices and the cloud to reduce latency and protect user privacy. Recent advances in Data Reconstruction Attacks (DRAs) reveal that intermediate features exchanged in SI can be exploited to recover sensitive input data. Existing DRAs are typically effective only on shallow models and fail to fully leverage semantic priors. We propose a novel GAN-based DRA framework with Progressive Feature Optimization (PFO), which decomposes the generator into hierarchical blocks and incrementally refines intermediate representations to enhance the semantic fidelity of reconstructed images.
arXiv Detail & Related papers (2025-08-28T10:00:39Z)
- Transformer-Empowered Actor-Critic Reinforcement Learning for Sequence-Aware Service Function Chain Partitioning [1.9120720496423733]
We introduce a Transformer-empowered actor-critic framework specifically designed for sequence-aware SFC partitioning. Our approach effectively models complex inter-dependencies among VNFs, facilitating coordinated and parallelized decision-making processes.
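A sequence-aware actor-critic of this kind can be sketched with a shared Transformer backbone; the heads and dimensions below are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class SeqActorCritic(nn.Module):
    """Hypothetical sequence-aware actor-critic: a Transformer encoder reads the
    ordered VNF chain, the actor scores a placement per VNF, the critic values the state."""

    def __init__(self, vnf_dim: int, num_actions: int, d_model: int = 64):
        super().__init__()
        self.embed = nn.Linear(vnf_dim, d_model)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
            num_layers=2)
        self.actor = nn.Linear(d_model, num_actions)   # per-VNF placement logits
        self.critic = nn.Linear(d_model, 1)            # state value from pooled sequence

    def forward(self, chain: torch.Tensor):
        # chain: (batch, num_vnfs, vnf_dim) features of the service function chain
        h = self.encoder(self.embed(chain))
        return self.actor(h), self.critic(h.mean(dim=1))
```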
arXiv Detail & Related papers (2025-04-26T12:18:57Z)
- Optimal Transport Adapter Tuning for Bridging Modality Gaps in Few-Shot Remote Sensing Scene Classification [80.83325513157637]
Few-Shot Remote Sensing Scene Classification (FS-RSSC) presents the challenge of classifying remote sensing images with limited labeled samples. We propose a novel Optimal Transport Adapter Tuning (OTAT) framework aimed at constructing an ideal Platonic representational space.
arXiv Detail & Related papers (2025-03-19T07:04:24Z)
- LADs: Leveraging LLMs for AI-Driven DevOps [3.240228178267042]
LADs is a principled approach to configuration optimization, built on in-depth analysis of which optimizations work under which conditions. By leveraging Retrieval-Augmented Generation, Few-Shot Learning, Chain-of-Thought, and Feedback-Based Prompt Chaining, LADs generates accurate configurations and learns from deployment failures to iteratively refine system settings. Our findings reveal key insights into the trade-offs between performance, cost, and scalability, helping practitioners determine the right strategies for different deployment scenarios.
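The Feedback-Based Prompt Chaining ingredient is a generic pattern that can be sketched in a few lines; the `llm` and `deploy` callables below are hypothetical placeholders, not LADs' actual interfaces.

```python
from typing import Callable, Tuple

def feedback_prompt_chain(llm: Callable[[str], str],
                          deploy: Callable[[str], Tuple[bool, str]],
                          task: str, max_rounds: int = 3) -> str:
    """Generic feedback-based prompt chaining: feed deployment failures back
    into the next prompt until the generated configuration deploys cleanly."""
    config = llm(f"Generate a deployment configuration for: {task}")
    for _ in range(max_rounds):
        ok, error_log = deploy(config)  # attempt deployment, capture failure output
        if ok:
            break
        config = llm(f"The previous configuration failed with:\n{error_log}\n"
                     f"Revise it for: {task}\nPrevious configuration:\n{config}")
    return config
```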
arXiv Detail & Related papers (2025-02-28T08:12:08Z)
- Unifying Dimensions: A Linear Adaptive Approach to Lightweight Image Super-Resolution [6.857919231112562]
Window-based transformers have demonstrated outstanding performance in super-resolution tasks.
However, they exhibit higher computational complexity and inference latency than convolutional neural networks.
We construct a convolution-based Transformer framework named the linear adaptive mixer network (LAMNet).
arXiv Detail & Related papers (2024-09-26T07:24:09Z)
- Online SLA Decomposition: Enabling Real-Time Adaptation to Evolving Network Systems [0.0]
This study investigates the dynamic nature of real-world systems and introduces an online learning-decomposition framework that continuously updates its risk models based on the most recent feedback. Our empirical study on an analytic model-based simulator demonstrates that the proposed framework outperforms the state-of-the-art static approach.
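A minimal sketch of the continuously updated risk-model idea, assuming a sliding window of per-domain SLA-violation feedback and a proportional re-split of the E2E budget (both are assumptions, not the paper's method):

```python
from collections import deque
import numpy as np

class OnlineRiskModel:
    """Hypothetical sliding-window risk estimator: the decomposition is refit
    whenever fresh domain feedback arrives."""

    def __init__(self, num_domains: int, window: int = 100):
        # One bounded buffer of recent SLA-violation observations per domain.
        self.feedback = [deque(maxlen=window) for _ in range(num_domains)]

    def observe(self, domain: int, violated: bool) -> None:
        self.feedback[domain].append(float(violated))

    def risks(self) -> np.ndarray:
        # Empirical violation rate per domain over its window.
        return np.array([np.mean(b) if b else 0.0 for b in self.feedback])

    def decompose(self, e2e_budget: float) -> np.ndarray:
        # Riskier domains receive a proportionally larger share of the E2E budget.
        r = self.risks() + 1e-6
        return e2e_budget * r / r.sum()
```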
arXiv Detail & Related papers (2024-08-16T18:34:11Z)
- Efficient Parallel Split Learning over Resource-constrained Wireless Edge Networks [44.37047471448793]
In this paper, we advocate the integration of the edge computing paradigm and parallel split learning (PSL).
We propose an innovative PSL framework, namely efficient parallel split learning (EPSL), to accelerate model training.
We show that the proposed EPSL framework significantly decreases the training latency needed to achieve a target accuracy.
arXiv Detail & Related papers (2023-03-26T16:09:48Z)
- Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method that optimizes the sparse structure of a randomly initialized network at each iteration and tweaks unimportant weights by a small amount proportional to their magnitude on-the-fly.
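One iteration of the soft-shrinkage idea can be sketched as follows; the threshold rule and shrink factor are illustrative assumptions, not the exact ISS-P schedule.

```python
import torch

def soft_shrink_step(weight: torch.Tensor, sparsity: float,
                     shrink: float = 0.1) -> torch.Tensor:
    """Hypothetical soft-shrinkage iteration: instead of hard-zeroing the
    smallest weights, shrink them by an amount proportional to their magnitude."""
    k = int(sparsity * weight.numel())
    if k == 0:
        return weight
    # Magnitude threshold below which weights count as unimportant.
    threshold = weight.abs().flatten().kthvalue(k).values
    out = weight.clone()
    mask = out.abs() <= threshold
    out[mask] = out[mask] * (1.0 - shrink)  # soft shrink, recoverable in later steps
    return out
```

Unlike hard pruning, shrunken weights can regrow in later iterations, which is the property the "soft" in soft shrinkage refers to.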
arXiv Detail & Related papers (2023-03-16T21:06:13Z)
- Robust Deep Compressive Sensing with Recurrent-Residual Structural Constraints [0.0]
Existing deep compressive sensing (CS) methods either ignore adaptive online optimization or depend on costly iterative reconstruction.
This work explores a novel image CS framework with a recurrent-residual structural constraint, termed R$^2$CS-NET.
As the first deep CS framework efficiently bridging adaptive online optimization and deep learning, the R$^2$CS-NET integrates the robustness of online optimization with the efficiency and nonlinear capacity of deep learning methods.
arXiv Detail & Related papers (2022-07-15T05:56:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences arising from its use.