Faster Exact MPE and Constrained Optimization with Deterministic Finite State Automata
- URL: http://arxiv.org/abs/2108.03899v3
- Date: Tue, 9 May 2023 21:44:32 GMT
- Title: Faster Exact MPE and Constrained Optimization with Deterministic Finite State Automata
- Authors: Filippo Bistaffa
- Abstract summary: We exploit our concise representation within Bucket Elimination (BE).
Results on most probable explanation and weighted constraint satisfaction benchmarks show that FABE often outperforms the state of the art.
- Score: 2.1777837784979273
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a concise function representation based on deterministic finite
state automata for exact most probable explanation and constrained optimization
tasks in graphical models. We then exploit our concise representation within
Bucket Elimination (BE). We denote our version of BE as FABE. FABE
significantly improves the performance of BE in terms of runtime and memory
requirements by minimizing redundancy. Results on most probable explanation and
weighted constraint satisfaction benchmarks show that FABE often outperforms
the state of the art, leading to significant runtime improvements (up to 5
orders of magnitude in our tests).
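To make the pipeline concrete, the sketch below runs a minimal, table-based Bucket Elimination pass for MPE and adds a toy `redundancy` helper that only counts how many rows of a factor repeat the same values. This is not the FABE implementation: FABE replaces the tables with minimized deterministic finite state automata, whereas here the dict-based factor encoding, the `redundancy` helper, and the small example factors are all illustrative assumptions.

```python
# Minimal sketch: table-based Bucket Elimination (BE) for MPE.
# NOT the authors' FABE code; `redundancy` is only a toy proxy for the idea
# that repeated rows could be stored once in an automaton-based representation.
from itertools import product


def combine(f, g, domains):
    """Pointwise product of two factors {"scope": [...], "table": {assignment: value}}."""
    scope = sorted(set(f["scope"]) | set(g["scope"]))
    table = {}
    for vals in product(*(domains[v] for v in scope)):
        a = dict(zip(scope, vals))
        fv = f["table"][tuple(a[v] for v in f["scope"])]
        gv = g["table"][tuple(a[v] for v in g["scope"])]
        table[vals] = fv * gv
    return {"scope": scope, "table": table}


def maximize_out(f, var):
    """Eliminate `var` by maximisation (MPE); a WCSP variant would minimise costs instead."""
    idx = f["scope"].index(var)
    scope = [v for v in f["scope"] if v != var]
    table = {}
    for vals, val in f["table"].items():
        key = vals[:idx] + vals[idx + 1:]
        table[key] = max(table.get(key, float("-inf")), val)
    return {"scope": scope, "table": table}


def bucket_elimination_mpe(factors, order, domains):
    """Return the MPE value by eliminating variables along `order`."""
    buckets = {v: [] for v in order}
    for f in factors:
        # Each factor goes into the bucket of its earliest variable in the order.
        buckets[min(f["scope"], key=order.index)].append(f)
    constant = 1.0
    for var in order:
        if not buckets[var]:
            continue
        joint = buckets[var][0]
        for f in buckets[var][1:]:
            joint = combine(joint, f, domains)
        msg = maximize_out(joint, var)
        if msg["scope"]:
            buckets[min(msg["scope"], key=order.index)].append(msg)
        else:
            constant *= msg["table"][()]
    return constant


def redundancy(f):
    """Rows grouped by all-but-last variable vs. the distinct value rows actually needed."""
    rows = {}
    for vals, val in sorted(f["table"].items()):
        rows.setdefault(vals[:-1], []).append(val)
    return len(rows), len(set(map(tuple, rows.values())))


if __name__ == "__main__":
    domains = {"A": [0, 1], "B": [0, 1], "C": [0, 1]}
    fab = {"scope": ["A", "B"], "table": {(0, 0): 0.9, (0, 1): 0.1,
                                          (1, 0): 0.4, (1, 1): 0.6}}
    fbc = {"scope": ["B", "C"], "table": {(0, 0): 0.7, (0, 1): 0.3,
                                          (1, 0): 0.7, (1, 1): 0.3}}
    print("MPE value:", bucket_elimination_mpe([fab, fbc], ["A", "B", "C"], domains))
    print("rows vs distinct rows in f(B,C):", redundancy(fbc))
```

In the example, f(B,C) stores two rows but only one distinct value row; a DFA-style representation would keep that row once, which is the redundancy FABE minimizes at scale.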
Related papers
- MAO: Efficient Model-Agnostic Optimization of Prompt Tuning for Vision-Language Models [37.85176585188362]
We propose Model-Agnostic Optimization (MAO) for prompt tuning.
We introduce a Data-Driven Enhancement framework to optimize the distribution of the initial data.
We incorporate an Alterable Regularization module to boost the task-specific feature processing pipeline.
arXiv Detail & Related papers (2025-03-23T17:59:33Z)
- Performance-driven Constrained Optimal Auto-Tuner for MPC [36.143463447995536]
We propose COAT-MPC, Constrained Optimal Auto-Tuner for MPC.
COAT-MPC gathers performance data and learns by updating its posterior belief.
We theoretically analyze COAT-MPC, showing that it satisfies performance constraints with arbitrarily high probability.
arXiv Detail & Related papers (2025-03-10T09:56:08Z)
- DRPruning: Efficient Large Language Model Pruning through Distributionally Robust Optimization [61.492590008258986]
Large language models (LLMs) deliver impressive results but face challenges from increasing model sizes and computational costs.
We propose DRPruning, which incorporates distributionally robust optimization to restore balanced performance across domains.
arXiv Detail & Related papers (2024-11-21T12:02:39Z)
- Respecting the limit: Bayesian optimization with a bound on the optimal value [3.004066195320147]
We study the scenario in which we have either exact knowledge of the minimum value or, possibly, only a lower bound on its value.
We present SlogGP, a new surrogate model that incorporates bound information and adapts the Expected Improvement (EI) acquisition function accordingly (a sketch of the standard, unadapted EI appears after this list).
arXiv Detail & Related papers (2024-11-07T14:27:49Z)
- Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization [0.6445087473595953]
Large language models (LLMs) demonstrate outstanding performance in various tasks in machine learning.
However, deploying LLM inference poses challenges due to its high compute and memory requirements.
We present Tender, an algorithm-hardware co-design solution that enables efficient deployment of LLM inference at low precision.
arXiv Detail & Related papers (2024-06-16T09:51:55Z)
- Submodular Framework for Structured-Sparse Optimal Transport [7.030105924295838]
Unbalanced optimal transport (UOT) has recently gained much attention due to its flexible framework for handling unnormalized measures and its robustness.
In this work, we explore learning (structured) sparse transport plans in the UOT setting, i.e., transport plans whose columns have an upper bound on the number of non-sparse entries.
We propose novel sparsity-constrained UOT formulations building on the recently explored mean discrepancy based UOT.
arXiv Detail & Related papers (2024-06-07T13:11:04Z)
- Spurious Feature Eraser: Stabilizing Test-Time Adaptation for Vision-Language Foundation Model [86.9619638550683]
Vision-language foundation models have exhibited remarkable success across a multitude of downstream tasks due to their scalability on extensive image-text paired data.
However, these models display significant limitations when applied to downstream tasks, such as fine-grained image classification, as a result of "decision shortcuts".
arXiv Detail & Related papers (2024-03-01T09:01:53Z)
- FuzzyFlow: Leveraging Dataflow To Find and Squash Program Optimization Bugs [92.47146416628965]
FuzzyFlow is a fault localization and test case extraction framework designed to test program optimizations.
We leverage dataflow program representations to capture a fully reproducible system state and area-of-effect for optimizations.
To reduce testing time, we design an algorithm for minimizing test inputs, trading off memory for recomputation.
arXiv Detail & Related papers (2023-06-28T13:00:17Z)
- Performance Embeddings: A Similarity-based Approach to Automatic Performance Optimization [71.69092462147292]
Performance embeddings enable knowledge transfer of performance tuning between applications.
We demonstrate this transfer tuning approach on case studies in deep neural networks, dense and sparse linear algebra compositions, and numerical weather prediction stencils.
arXiv Detail & Related papers (2023-03-14T15:51:35Z)
- Self-Supervised Learning via Maximum Entropy Coding [57.56570417545023]
We propose Maximum Entropy Coding (MEC) as a principled objective that explicitly optimizes the structure of the representation.
MEC learns a more generalizable representation than previous methods based on specific pretext tasks.
It achieves state-of-the-art performance consistently on various downstream tasks, including not only ImageNet linear probe, but also semi-supervised classification, object detection, instance segmentation, and object tracking.
arXiv Detail & Related papers (2022-10-20T17:58:30Z)
- DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization [75.72231742114951]
Large-scale pre-trained sequence-to-sequence models like BART and T5 achieve state-of-the-art performance on many generative NLP tasks.
These models pose a great challenge in resource-constrained scenarios owing to their large memory requirements and high latency.
We propose to jointly distill and quantize the model, where knowledge is transferred from the full-precision teacher model to the quantized and distilled low-precision student model.
arXiv Detail & Related papers (2022-03-21T18:04:25Z)
- Risk Guarantees for End-to-End Prediction and Optimization Processes [0.0]
We study conditions that allow us to explicitly describe how the prediction performance governs the optimization performance.
We derive the exact theoretical relationship between prediction performance, measured with the squared loss as well as a class of symmetric loss functions, and the subsequent optimization performance.
arXiv Detail & Related papers (2020-12-30T05:20:26Z)
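For context on the "Respecting the limit" entry above, the sketch below computes the standard Expected Improvement acquisition for a Gaussian-process surrogate under a minimization convention. It is not SlogGP's bound-adapted variant; the function name and the toy numbers are illustrative assumptions.

```python
# Standard Expected Improvement (EI) for minimisation under a Gaussian posterior.
# Shown only for context; SlogGP's bound-aware adaptation is NOT reproduced here.
import math


def expected_improvement(mu, sigma, best, xi=0.0):
    """EI = E[max(best - f(x) - xi, 0)] where f(x) ~ N(mu, sigma^2)."""
    if sigma <= 0.0:
        return max(best - mu - xi, 0.0)
    z = (best - mu - xi) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best - mu - xi) * cdf + sigma * pdf


# Example: a candidate predicted slightly below the incumbent, with some uncertainty.
print(expected_improvement(mu=0.8, sigma=0.3, best=1.0))
```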
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.