XTC, A Research Platform for Optimizing AI Workload Operators
- URL: http://arxiv.org/abs/2512.16512v1
- Date: Thu, 18 Dec 2025 13:24:44 GMT
- Title: XTC, A Research Platform for Optimizing AI Workload Operators
- Authors: Hugo Pompougnac, Christophe Guillon, Sylvain Noiry, Alban Dutilleul, Guillaume Iooss, Fabrice Rastello
- Abstract summary: We introduce XTC, a platform that unifies scheduling and performance evaluation across compilers. With its common API and reproducible measurement framework, XTC enables portable experimentation and accelerates research on optimization strategies.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Achieving high efficiency on AI operators demands precise control over computation and data movement. However, existing scheduling languages are locked into specific compiler ecosystems, preventing fair comparison, reuse, and evaluation across frameworks. No unified interface currently decouples scheduling specification from code generation and measurement. We introduce XTC, a platform that unifies scheduling and performance evaluation across compilers. With its common API and reproducible measurement framework, XTC enables portable experimentation and accelerates research on optimization strategies.
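The abstract's central idea is decoupling a scheduling specification from any particular compiler's code generation. As a rough illustration only (the names and structure below are invented, not XTC's actual API), a backend-agnostic schedule can be recorded as plain data and lowered separately per compiler:

```python
# Hypothetical sketch: a scheduling specification decoupled from backends.
# All class, function, and backend names here are invented for illustration.
from dataclasses import dataclass, field


@dataclass
class Schedule:
    """Records backend-agnostic scheduling directives for one operator."""
    directives: list = field(default_factory=list)

    def tile(self, loop, factor):
        self.directives.append(("tile", loop, factor))
        return self

    def vectorize(self, loop):
        self.directives.append(("vectorize", loop))
        return self


def lower(schedule, backend):
    """Pretty-print directives for a given backend; a real platform would
    translate each directive into that compiler's own scheduling language."""
    return [f"{backend}: {d}" for d in schedule.directives]


# The same schedule object can be replayed against different compilers,
# which is what makes cross-compiler comparison possible.
sched = Schedule().tile("i", 32).vectorize("j")
for line in lower(sched, "backend_a") + lower(sched, "backend_b"):
    print(line)
```

The point of the sketch is only the separation of concerns: the `Schedule` object carries no backend knowledge, so the same experiment can be measured under several code generators.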
Related papers
- Easy Data Unlearning Bench [53.1304932656586]
We introduce a unified benchmarking suite that simplifies the evaluation of unlearning algorithms. By standardizing setup and metrics, it enables reproducible, scalable, and fair comparison across unlearning methods.
arXiv Detail & Related papers (2026-02-18T12:20:32Z) - An LLVM-Based Optimization Pipeline for SPDZ [0.0]
We implement a proof-of-concept LLVM-based optimization pipeline for the SPDZ protocol. Our front end accepts a subset of C with lightweight privacy annotations and lowers it to LLVM IR. Our back end performs data-flow and control-flow analysis on the optimized IR to drive a non-blocking runtime scheduler.
arXiv Detail & Related papers (2025-12-11T20:53:35Z) - Understanding Accelerator Compilers via Performance Profiling [1.1841612917872066]
Accelerator design languages (ADLs) are high-level languages that compile to hardware units. We introduce Petal, a cycle-level tool for understanding how the compiler's decisions affect performance. We show that Petal's cycle-level profiles can identify performance problems in existing designs.
arXiv Detail & Related papers (2025-11-24T22:40:11Z) - The Fast for the Curious: How to accelerate fault-tolerant quantum applications [101.46859364118622]
We evaluate strategies for reducing the run time of fault-tolerant quantum computations. We discuss how the co-design of hardware, fault tolerance, and algorithmic subroutines can reduce run times.
arXiv Detail & Related papers (2025-10-30T02:27:55Z) - Metrics and evaluations for computational and sustainable AI efficiency [26.52588349722099]
Current approaches fail to provide a holistic view, making it difficult to compare and optimise systems. We propose a unified and reproducible methodology for AI model inference that integrates computational and environmental metrics. Our framework provides pragmatic, carbon-aware evaluation by systematically measuring latency and throughput distributions, energy consumption, and location-adjusted carbon emissions.
arXiv Detail & Related papers (2025-10-18T03:30:15Z) - ParaCook: On Time-Efficient Planning for Multi-Agent Systems [62.471032881396496]
Large Language Models (LLMs) exhibit strong reasoning abilities for planning long-horizon, real-world tasks. We present ParaCook, a benchmark for time-efficient collaborative planning.
arXiv Detail & Related papers (2025-10-13T16:47:07Z) - Leveraging Neural Graph Compilers in Machine Learning Research for Edge-Cloud Systems [5.241450170761232]
This work presents a comprehensive evaluation of neural network graph compilers across heterogeneous hardware platforms. Our systematic analysis reveals that graph compilers exhibit performance patterns highly dependent on both neural architecture and batch sizes. We introduce novel metrics to quantify a compiler's ability to mitigate performance friction as batch size increases.
arXiv Detail & Related papers (2025-04-28T19:02:16Z) - CompilerDream: Learning a Compiler World Model for General Code Optimization [58.87557583347996]
We introduce CompilerDream, a model-based reinforcement learning approach to general code optimization. It comprises a compiler world model that accurately simulates the intrinsic properties of optimization passes and an agent trained on this model to produce effective optimization strategies. It excels across diverse datasets, surpassing LLVM's built-in optimizations and other state-of-the-art methods in both settings of value prediction and end-to-end code optimization.
arXiv Detail & Related papers (2024-04-24T09:20:33Z) - Harnessing Deep Learning and HPC Kernels via High-Level Loop and Tensor Abstractions on CPU Architectures [67.47328776279204]
This work introduces a framework to develop efficient, portable Deep Learning and High Performance Computing kernels.
We decompose kernel development into two steps: 1) Expressing the computational core using Tensor Processing Primitives (TPPs) and 2) Expressing the logical loops around TPPs in a high-level, declarative fashion.
We demonstrate the efficacy of our approach using standalone kernels and end-to-end workloads that outperform state-of-the-art implementations on diverse CPU platforms.
arXiv Detail & Related papers (2023-04-25T05:04:44Z) - Learning Performance-Improving Code Edits [107.21538852090208]
We introduce a framework for adapting large language models (LLMs) to high-level program optimization.
First, we curate a dataset of over 77,000 pairs of competitive C++ programming submissions, capturing performance-improving edits made by human programmers.
For prompting, we propose retrieval-based few-shot prompting and chain-of-thought; for finetuning, we use performance-conditioned generation and synthetic data augmentation based on self-play.
arXiv Detail & Related papers (2023-02-15T18:59:21Z) - Robust Scheduling with GFlowNets [6.6908747077585105]
We propose a new approach to scheduling by sampling proportionally to the proxy metric using a novel GFlowNet method.
We introduce a technique to control the trade-off between diversity and goodness of the proposed schedules at inference time.
arXiv Detail & Related papers (2023-01-17T18:59:15Z) - Towards Optimal VPU Compiler Cost Modeling by using Neural Networks to Infer Hardware Performances [58.720142291102135]
'VPUNN' is a neural network-based cost model trained on low-level task profiling.
It consistently outperforms state-of-the-art cost models for Intel's line of VPU processors.
arXiv Detail & Related papers (2022-05-09T22:48:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.