Phaedrus: Predicting Dynamic Application Behavior with Lightweight Generative Models and LLMs
- URL: http://arxiv.org/abs/2412.06994v3
- Date: Mon, 13 Oct 2025 22:53:54 GMT
- Title: Phaedrus: Predicting Dynamic Application Behavior with Lightweight Generative Models and LLMs
- Authors: Bodhisatwa Chatterjee, Neeraj Jadhav, Santosh Pande
- Abstract summary: Phaedrus is a new \textit{compiler-assisted deep learning framework} designed to predict dynamic program behaviors across varied execution instances. Our experiments show that \textit{Phaedrus} can achieve up to a $10^7X$ reduction in WPP function profile sizes.
- Score: 1.696629478421498
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Application profiling is an indispensable technique for many software development tasks, such as code and memory layout optimizations, where optimization decisions are tailored to specific program profiles. Unfortunately, modern application codebases exhibit highly variant behavior across different inputs, creating challenges for conventional profiling approaches that rely on a single representative execution instance. In this paper, we propose \textbf{Phaedrus}, a new \textit{compiler-assisted deep learning framework} designed to predict dynamic program behaviors across varied execution instances, specifically focusing on dynamic function call prediction. Such predicted call sequences are then used for producing optimized code pertinent to a given input. Traditional profile-guided optimization methods struggle with the input-dependent variability of modern applications, where profiling on different inputs yields divergent application behaviors. To address this, Phaedrus proposes two new approaches: \textit{Application Behavior Synthesis}, a profile-less approach where Large Language Models (LLMs) directly infer dynamic functions based on source code \& static compiler analysis, bypassing the need for traditional profiling, and \textit{Application Profile Generalization}, which uses generative models trained on compressed and augmented \textit{Whole Program Path} (WPP) based function profiles to predict application behavior under unseen inputs. Our experiments show that \textit{Phaedrus} can achieve up to a $10^7X$ reduction in WPP function profile sizes, can predict the most frequently executed functions that cover up to 85-99\% of the execution time, along with an average of 13.19\% (up to 65\%) reduction in application binary size, and an average of 6.08\% (up to 20\%) performance improvement over traditional profile-guided optimization, without any execution.
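As a rough, hypothetical illustration of the profile side of this pipeline, the sketch below run-length encodes a WPP-style function-call trace and ranks hot functions by call coverage. The toy trace, function names, and coverage threshold are assumptions for illustration only; the paper's actual compression and generative models are far more sophisticated.

```python
# Hypothetical sketch of the WPP-profile side of the Phaedrus pipeline (not
# the paper's code): run-length encode a function-call trace, then pick the
# smallest set of functions covering a target share of all recorded calls.
from collections import Counter

def compress_wpp_trace(trace):
    """Run-length encode a function-call trace; the paper's reported 10^7X
    reduction comes from far more aggressive compression than this."""
    compressed = []
    for fn in trace:
        if compressed and compressed[-1][0] == fn:
            compressed[-1][1] += 1
        else:
            compressed.append([fn, 1])
    return compressed

def hot_functions(compressed, coverage=0.75):
    """Smallest set of functions covering `coverage` of all calls, mimicking
    the 'most frequently executed functions' target in the abstract."""
    counts = Counter()
    for fn, n in compressed:
        counts[fn] += n
    total = sum(counts.values())
    picked, covered = [], 0
    for fn, n in counts.most_common():
        picked.append(fn)
        covered += n
        if covered / total >= coverage:
            break
    return picked

trace = ["main", "parse", "parse", "eval", "eval", "eval", "emit", "eval"]
profile = compress_wpp_trace(trace)
print(profile)                 # [['main', 1], ['parse', 2], ['eval', 3], ...]
print(hot_functions(profile))  # ['eval', 'parse'] at 75% coverage
```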
Related papers
- Synthetic Interaction Data for Scalable Personalization in Large Language Models [67.31884245564086]
We introduce a high-fidelity synthetic data generation framework called PersonaGym.
Unlike prior work that treats personalization as static persona-preference pairs, PersonaGym models a dynamic preference process.
We release PersonaAtlas, a large-scale, high-quality, and diverse synthetic dataset of high-fidelity multi-turn personalized interaction trajectories.
arXiv Detail & Related papers (2026-02-12T20:41:22Z) - CURP: Codebook-based Continuous User Representation for Personalized Generation with LLMs [60.867541073274715]
We propose a novel framework, CURP, which employs a bidirectional user encoder and a discrete prototype codebook to extract multi-dimensional user traits.
This design enables plug-and-play personalization with a small number of trainable parameters.
We show that CURP achieves superior performance and generalization compared to strong baselines.
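As a loose illustration of what a "discrete prototype codebook" lookup can look like (standard vector quantization, not CURP's actual encoder or training procedure), consider:

```python
# Hypothetical vector-quantization sketch of a discrete prototype codebook
# lookup; the codebook size, embedding dimension, and distance metric are
# illustrative assumptions, not taken from the CURP paper.
def quantize(user_embedding, codebook):
    """Map a continuous user embedding to the index of its nearest prototype."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(user_embedding, codebook[i]))

codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # 3 learned prototypes (toy)
print(quantize([0.9, 0.2], codebook))             # -> 1 (nearest prototype)
```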
arXiv Detail & Related papers (2026-01-31T14:13:06Z) - Behavioral Embeddings of Programs: A Quasi-Dynamic Approach for Optimization Prediction [35.89884852302035]
This paper proposes a novel quasi-dynamic framework for program representation.
The core insight is to model a program's optimization sensitivity.
To effectively encode this high-dimensional, continuous spectrum, we pioneer a compositional learning approach.
arXiv Detail & Related papers (2025-10-15T05:18:41Z) - Boosting Private Domain Understanding of Efficient MLLMs: A Tuning-free, Adaptive, Universal Prompt Optimization Framework [60.26747209785186]
Efficient multimodal large language models (EMLLMs) reduce model size and computational costs and are often deployed on resource-constrained devices.
Existing open-source EMLLMs rarely have access to private domain-specific data during the pre-training process.
We propose a tuning-free, adaptive, universal prompt optimization framework.
arXiv Detail & Related papers (2024-12-27T15:21:17Z) - Tracing Optimization for Performance Modeling and Regression Detection [15.99435412859094]
A performance model analytically describes the relationship between the performance of a system and its runtime activities.
We propose statistical approaches to reduce tracing overhead by identifying and excluding performance-insensitive code regions.
Our approach is fully automated, making it ready to be used in production environments with minimal human effort.
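One plausible, simplified reading of "identifying and excluding performance-insensitive code regions" is a variance-based filter like the sketch below; the coefficient-of-variation statistic, threshold, and sample data are assumptions for illustration, not the paper's method.

```python
# Illustrative filter in the spirit of the tracing-optimization summary:
# flag code regions whose latency barely varies across runs as
# "performance-insensitive" and exclude them from tracing.
import statistics

def insensitive_regions(samples, cv_threshold=0.05):
    """samples: {region_name: [latencies across runs]} -> regions to skip."""
    skip = []
    for region, lats in samples.items():
        cv = statistics.pstdev(lats) / statistics.mean(lats)
        if cv < cv_threshold:  # stable latency => little profiling value
            skip.append(region)
    return skip

samples = {"init": [1.00, 1.01, 0.99], "query": [2.0, 3.5, 9.0]}
print(insensitive_regions(samples))  # ['init'] -- stable, safe to untrace
```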
arXiv Detail & Related papers (2024-11-26T16:11:55Z) - $f$-PO: Generalizing Preference Optimization with $f$-divergence Minimization [54.94545757220999]
$f$-PO is a novel framework that generalizes and extends existing approaches.
We conduct experiments on state-of-the-art language models using benchmark datasets.
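For context, the $f$-divergence family that $f$-PO optimizes over has the standard form $D_f(P\|Q)=\sum_x q(x)\,f(p(x)/q(x))$ for convex $f$ with $f(1)=0$; the toy distributions below are illustrative, not experimental data.

```python
# Standard f-divergence between two discrete distributions; choosing
# f(t) = t*log(t) recovers forward KL(P||Q), f(t) = -log(t) reverse KL,
# and f(t) = 0.5*|t - 1| total variation.
import math

def f_divergence(p, q, f):
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q) if qi > 0)

kl = lambda t: t * math.log(t)
reverse_kl = lambda t: -math.log(t)
total_variation = lambda t: 0.5 * abs(t - 1)

p = [0.7, 0.2, 0.1]
q = [0.5, 0.3, 0.2]
print(f_divergence(p, q, kl))               # ~0.085
print(f_divergence(p, q, total_variation))  # 0.2
```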
arXiv Detail & Related papers (2024-10-29T02:11:45Z) - PAD: Personalized Alignment of LLMs at Decoding-Time [10.347782385286582]
This paper presents a novel framework designed to align LLM outputs with diverse personalized preferences during the inference phase.
The Personalized Alignment at Decoding-time (PAD) framework decouples the text generation process from personalized preferences.
PAD not only outperforms existing training-based alignment methods in terms of aligning with diverse preferences but also shows significant generalizability to preferences unseen during training.
arXiv Detail & Related papers (2024-10-05T08:00:55Z) - Denoising Pre-Training and Customized Prompt Learning for Efficient Multi-Behavior Sequential Recommendation [69.60321475454843]
We propose DPCPL, the first pre-training and prompt-tuning paradigm tailored for Multi-Behavior Sequential Recommendation.
In the pre-training stage, we propose a novel Efficient Behavior Miner (EBM) to filter out the noise at multiple time scales.
Subsequently, we propose to tune the pre-trained model in a highly efficient manner with the proposed Customized Prompt Learning (CPL) module.
arXiv Detail & Related papers (2024-08-21T06:48:38Z) - Probabilistic Programming with Programmable Variational Inference [45.593974530502095]
We propose a more modular approach to supporting variational inference in PPLs, based on compositional program transformation.
Our design enables modular reasoning about many interacting concerns, including automatic differentiation, density, tracing, and the application of unbiased gradient estimation strategies.
We implement our approach in an extension to the Gen probabilistic programming system (genjax.vi), implemented in JAX, and evaluate on several deep generative modeling tasks.
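As a minimal, generic example of one "unbiased gradient estimation strategy" such a modular design can compose (a score-function estimator, not genjax.vi code), consider a Gaussian variational family against a standard-normal target:

```python
# Score-function (REINFORCE) estimator of the ELBO gradient w.r.t. the
# variational mean mu, for q = N(mu, 1) and target p = N(0, 1). Since
# ELBO(mu) = -KL(q||p) = -mu^2/2 here, the true gradient is -mu.
import math, random

def elbo_grad_mu(mu, n_samples=10_000):
    total = 0.0
    for _ in range(n_samples):
        z = random.gauss(mu, 1.0)
        log_p = -0.5 * z * z - 0.5 * math.log(2 * math.pi)
        log_q = -0.5 * (z - mu) ** 2 - 0.5 * math.log(2 * math.pi)
        score = z - mu                       # d/dmu log q(z) for unit variance
        total += score * (log_p - log_q)     # unbiased: E[score] term vanishes
    return total / n_samples

print(elbo_grad_mu(1.0))  # ~ -1.0: gradient pulls mu toward the target mean
```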
arXiv Detail & Related papers (2024-06-22T05:49:37Z) - One-Shot Safety Alignment for Large Language Models via Optimal Dualization [64.52223677468861]
This paper presents a perspective of dualization that reduces constrained alignment to an equivalent unconstrained alignment problem.
We do so by pre-optimizing a smooth and convex dual function that has a closed form.
Our strategy leads to two practical algorithms in model-based and preference-based settings.
arXiv Detail & Related papers (2024-05-29T22:12:52Z) - Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback [70.32795295142648]
Linear alignment is a novel algorithm that aligns language models with human preferences in one single inference step.
Experiments on both general and personalized preference datasets demonstrate that linear alignment significantly enhances the performance and efficiency of LLM alignment.
arXiv Detail & Related papers (2024-01-21T10:46:23Z) - Aligning Large Language Models with Counterfactual DPO [1.8130068086063336]
This paper explores the utilization of counterfactual prompting to align the model's style without relying on human intervention.
We demonstrate that this method effectively instils desirable behaviour, mitigates undesirable ones, and encourages the model to disregard inappropriate instructions.
arXiv Detail & Related papers (2024-01-17T19:43:43Z) - FuzzyFlow: Leveraging Dataflow To Find and Squash Program Optimization Bugs [92.47146416628965]
FuzzyFlow is a fault localization and test case extraction framework designed to test program optimizations.
We leverage dataflow program representations to capture a fully reproducible system state and area-of-effect for optimizations.
To reduce testing time, we design an algorithm for minimizing test inputs, trading off memory for recomputation.
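The paper's input-minimization algorithm (and its memory/recomputation trade-off) is not reproduced here; the classic delta-debugging (ddmin) loop below only illustrates the general shape of shrinking a failing input while an oracle keeps reproducing the bug:

```python
# Generic ddmin-style input minimization, shown only to illustrate the kind
# of test-input reduction the FuzzyFlow abstract mentions.
def ddmin(inputs, still_fails):
    """Shrink `inputs` while `still_fails(subset)` keeps reproducing the bug."""
    n = 2
    while len(inputs) >= 2:
        chunk = max(1, len(inputs) // n)
        chunks = [inputs[i:i + chunk] for i in range(0, len(inputs), chunk)]
        reduced = False
        for i in range(len(chunks)):
            complement = [x for j, c in enumerate(chunks) if j != i for x in c]
            if still_fails(complement):       # drop one chunk, keep the rest
                inputs, n, reduced = complement, max(n - 1, 2), True
                break
        if not reduced:
            if n >= len(inputs):
                break
            n = min(n * 2, len(inputs))       # refine granularity
    return inputs

# Toy oracle: the "optimization bug" triggers whenever both 3 and 7 are present.
fails = lambda xs: 3 in xs and 7 in xs
print(ddmin(list(range(10)), fails))  # minimal failing input: [3, 7]
```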
arXiv Detail & Related papers (2023-06-28T13:00:17Z) - ParaGraph: Weighted Graph Representation for Performance Optimization of HPC Kernels [1.304892050913381]
We introduce a new graph-based program representation for parallel applications that extends the Abstract Syntax Tree.
We evaluate our proposed representation by training a Graph Neural Network (GNN) to predict the runtime of an OpenMP code region.
Results show that our approach is indeed effective and has normalized RMSE as low as 0.004 to at most 0.01 in its runtime predictions.
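One common way to compute the normalized RMSE reported above divides RMSE by the range of observed runtimes; the paper may normalize differently, and the numbers below are toy values:

```python
# Range-normalized RMSE for runtime predictions (one common convention;
# the ParaGraph paper's exact normalization may differ). Toy data only.
import math

def normalized_rmse(y_true, y_pred):
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse) / (max(y_true) - min(y_true))

runtimes  = [1.00, 2.50, 4.00, 8.00]   # measured kernel runtimes (seconds)
predicted = [1.05, 2.40, 4.10, 7.90]
print(normalized_rmse(runtimes, predicted))  # small value => accurate model
```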
arXiv Detail & Related papers (2023-04-07T05:52:59Z) - Numerical Methods for Convex Multistage Stochastic Optimization [86.45244607927732]
We focus on Stochastic Programming (SP), Stochastic Optimal Control (SOC), and Markov Decision Processes (MDP).
Recent progress in solving convex multistage Markov problems is based on cutting planes approximations of the cost-to-go functions of dynamic programming equations.
Cutting plane type methods can handle multistage problems with a large number of stages, but a relatively smaller number of state (decision) variables.
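To make the cutting-plane idea concrete: for a convex cost-to-go function $V(x)$, each cut is an affine minorant and the approximation is the pointwise maximum of the collected cuts, as in this toy sketch (cuts chosen for $V(x)=x^2$, not taken from the paper):

```python
# Cutting-plane lower bound for a convex cost-to-go function V(x): each cut
# (a, b) encodes the affine minorant a*x + b, and the approximation is their
# pointwise maximum, as in SDDP-style methods. Toy cuts for V(x) = x**2.
def cost_to_go_lb(x, cuts):
    """Pointwise max of affine cuts: a valid lower bound when V is convex."""
    return max(a * x + b for a, b in cuts)

# Tangents to V(x) = x**2 at x0 in {-1, 0, 2}: a = 2*x0, b = -x0**2.
cuts = [(-2.0, -1.0), (0.0, 0.0), (4.0, -4.0)]
for x in (-1.5, 0.5, 2.0):
    print(x, cost_to_go_lb(x, cuts), x ** 2)  # lower bound <= true value
```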
arXiv Detail & Related papers (2023-03-28T01:30:40Z) - Slapo: A Schedule Language for Progressive Optimization of Large Deep Learning Model Training [17.556432199389615]
Slapo is a schedule language that decouples the execution of a tensor-level operator from its arithmetic definition.
We show that Slapo can improve training throughput by up to 2.92x on a single machine with 8 NVIDIA V100 GPUs.
arXiv Detail & Related papers (2023-02-16T00:34:53Z) - Learning to compile smartly for program size reduction [35.58682369218936]
We propose a novel approach that learns a policy to select passes for program size reduction.
Our approach uses a search mechanism that helps identify useful pass sequences and a GNN with customized attention that selects the optimal sequence to use.
arXiv Detail & Related papers (2023-01-09T16:42:35Z) - Optimistic Policy Optimization is Provably Efficient in Non-stationary MDPs [113.8752163061151]
We study episodic reinforcement learning (RL) in non-stationary linear kernel Markov decision processes (MDPs).
We propose the Periodically Restarted Optimistic Policy Optimization algorithm (PROPO).
PROPO features two mechanisms: sliding-window-based policy evaluation and periodic-restart-based policy improvement.
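A toy rendering of those two mechanisms (the window length and restart period below are arbitrary assumptions, not the paper's schedule):

```python
# Toy illustration of sliding-window evaluation plus periodic restart, the
# two mechanisms named in the PROPO summary; not the paper's algorithm.
from collections import deque

class SlidingWindowEstimator:
    def __init__(self, window=100, restart_every=1000):
        self.restart_every = restart_every
        self.returns = deque(maxlen=window)  # window auto-evicts stale data
        self.t = 0

    def update(self, episode_return):
        self.t += 1
        if self.t % self.restart_every == 0:
            self.returns.clear()             # periodic restart: drop history
        self.returns.append(episode_return)

    def value_estimate(self):
        return sum(self.returns) / len(self.returns) if self.returns else 0.0
```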
arXiv Detail & Related papers (2021-10-18T02:33:20Z) - Learning to Superoptimize Real-world Programs [79.4140991035247]
We propose a framework to learn to superoptimize real-world programs by using neural sequence-to-sequence models.
We introduce the Big Assembly benchmark, a dataset consisting of over 25K real-world functions mined from open-source projects in x86-64 assembly.
arXiv Detail & Related papers (2021-09-28T05:33:21Z) - MLComp: A Methodology for Machine Learning-based Performance Estimation and Adaptive Selection of Pareto-Optimal Compiler Optimization Sequences [10.200899224740871]
We propose a novel Reinforcement Learning-based policy methodology for embedded software optimization.
We show that different Machine Learning models are automatically tested to choose the best-fitting one.
We also show that our framework can be trained efficiently for any target platform and application domain.
arXiv Detail & Related papers (2020-12-09T19:13:39Z) - Transforming Probabilistic Programs for Model Checking [0.0]
We apply static analysis to probabilistic programs to automate large parts of two crucial model checking methods.
Our method transforms a probabilistic program specifying a density function into an efficient forward-sampling form.
We present an implementation targeting the popular Stan probabilistic programming language.
arXiv Detail & Related papers (2020-08-21T21:06:34Z) - Global Optimization of Gaussian processes [52.77024349608834]
We propose a reduced-space formulation with Gaussian processes trained on few data points.
The approach also leads to significantly smaller and computationally cheaper subproblems for lower bounding.
In total, the proposed method reduces time to convergence by orders of magnitude.
arXiv Detail & Related papers (2020-05-21T20:59:11Z)