Related papers: Algorithmic Language Models with Neurally Compiled Libraries

Algorithmic Language Models with Neurally Compiled Libraries

URL: http://arxiv.org/abs/2407.04899v1
Date: Sat, 6 Jul 2024 00:27:05 GMT
Title: Algorithmic Language Models with Neurally Compiled Libraries
Authors: Lucas Saldyt, Subbarao Kambhampati,
Abstract summary: Large Language Models lack true algorithmic ability. Our paper proposes augmenting LLMs with a library of fundamental operations and sophisticated differentiable programs. We explore the feasability of augmenting LLaMA3 with a differentiable computer.
Score: 16.284360949127723
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Important tasks such as reasoning and planning are fundamentally algorithmic, meaning that solving them robustly requires acquiring true reasoning or planning algorithms, rather than shortcuts. Large Language Models lack true algorithmic ability primarily because of the limitations of neural network optimization algorithms, their optimization data and optimization objective, but also due to architectural inexpressivity. To solve this, our paper proposes augmenting LLMs with a library of fundamental operations and sophisticated differentiable programs, so that common algorithms do not need to be learned from scratch. We add memory, registers, basic operations, and adaptive recurrence to a transformer architecture built on LLaMA3. Then, we define a method for directly compiling algorithms into a differentiable starting library, which is used natively and propagates gradients for optimization. In this preliminary study, we explore the feasability of augmenting LLaMA3 with a differentiable computer, for instance by fine-tuning small transformers on simple algorithmic tasks with variable computational depth.

Related papers

Discovering Algorithms with Computational Language Processing [0.7062238472483737]
We present a framework automating algorithm discovery by conceptualizing them as sequences of operations, represented as tokens.<n>These computational tokens are chained using a grammar, enabling the formation of increasingly sophisticated procedures.<n>Our ensemble Monte Carlo tree search (MCTS) guided by reinforcement learning (RL) explores token chaining and drives the creation of new tokens.
arXiv Detail & Related papers (2025-07-03T21:45:17Z)
LOP: Learning Optimal Pruning for Efficient On-Demand MLLMs Scaling [52.1366057696919]
LOP is an efficient neural pruning framework that learns optimal pruning strategies from the target pruning constraint.<n>LOP approach trains autoregressive neural networks (NNs) to directly predict layer-wise pruning strategies adaptive to the target pruning constraint.<n> Experimental results show that LOP outperforms state-of-the-art pruning methods in various metrics while achieving up to three orders of magnitude speedup.
arXiv Detail & Related papers (2025-06-15T12:14:16Z)
TreeLoRA: Efficient Continual Learning via Layer-Wise LoRAs Guided by a Hierarchical Gradient-Similarity Tree [52.44403214958304]
In this paper, we introduce TreeLoRA, a novel approach that constructs layer-wise adapters by leveraging hierarchical gradient similarity.<n>To reduce the computational burden of task similarity estimation, we employ bandit techniques to develop an algorithm based on lower confidence bounds.<n> experiments on both vision transformers (ViTs) and large language models (LLMs) demonstrate the effectiveness and efficiency of our approach.
arXiv Detail & Related papers (2025-06-12T05:25:35Z)
STRCMP: Integrating Graph Structural Priors with Language Models for Combinatorial Optimization [18.162186876640764]
Combinatorial optimization (CO) problems, central to operation research and theoretical computer science, present significant computational challenges due to their NP-hard nature.<n>We propose STRCMP, a novel structure-aware algorithm discovery framework that systematically integrates structure priors to enhance solution quality and solving efficiency.<n>Our framework combines a graph neural network (GNN) for extracting structural embeddings from CO instances with an LLM conditioned on these embeddings to identify high-performing algorithms in the form of solver-specific codes.
arXiv Detail & Related papers (2025-05-22T15:37:42Z)
Combinatorial Optimization for All: Using LLMs to Aid Non-Experts in Improving Optimization Algorithms [0.9668407688201361]
Large Language Models (LLMs) have shown notable potential in code generation for optimization algorithms. This paper examines how LLMs, rather than creating algorithms from scratch, can improve existing ones without the need for specialized expertise.
arXiv Detail & Related papers (2025-03-14T00:26:00Z)
On the Design and Analysis of LLM-Based Algorithms [74.7126776018275]
Large language models (LLMs) are used as sub-routines in algorithms. LLMs have achieved remarkable empirical success. Our proposed framework holds promise for advancing LLM-based algorithms.
arXiv Detail & Related papers (2024-07-20T07:39:07Z)
Large Language Models As Evolution Strategies [6.873777465945062]
In this work, we investigate whether large language models (LLMs) are in principle capable of implementing evolutionary optimization algorithms. We introduce a novel prompting strategy, consisting of least-to-most sorting of discretized population members. We find that our setup allows the user to obtain an LLM-based evolution strategy, which we call EvoLLM', that robustly outperforms baseline algorithms.
arXiv Detail & Related papers (2024-02-28T15:02:17Z)
Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark [166.40879020706151]
This paper proposes a shift towards BP-free, zeroth-order (ZO) optimization as a solution for reducing memory costs during fine-tuning. Unlike traditional ZO-SGD methods, our work expands the exploration to a wider array of ZO optimization techniques. Our study unveils previously overlooked optimization principles, highlighting the importance of task alignment, the role of the forward gradient method, and the balance between algorithm complexity and fine-tuning performance.
arXiv Detail & Related papers (2024-02-18T14:08:48Z)
Algorithm Evolution Using Large Language Model [18.03090066194074]
We propose a novel approach called Evolution Algorithm using Large Language Model (AEL) AEL does algorithm-level evolution without model training. Human effort and requirements for domain knowledge can be significantly reduced.
arXiv Detail & Related papers (2023-11-26T09:38:44Z)
Large Language Model-Enhanced Algorithm Selection: Towards Comprehensive Algorithm Representation [27.378185644892984]
This paper introduces Large Language Models (LLMs) into algorithm selection for the first time. LLMs not only captures the structural and semantic aspects of the algorithm, but also demonstrates contextual awareness and library function understanding. The selected algorithm is determined by the matching degree between a given problem and different algorithms.
arXiv Detail & Related papers (2023-11-22T06:23:18Z)
Use Your INSTINCT: INSTruction optimization for LLMs usIng Neural bandits Coupled with Transformers [66.823588073584]
Large language models (LLMs) have shown remarkable instruction-following capabilities and achieved impressive performances in various applications. Recent work has used the query-efficient Bayesian optimization (BO) algorithm to automatically optimize the instructions given to black-box LLMs. We propose a neural bandit algorithm which replaces the GP in BO by an NN surrogate to optimize instructions for black-box LLMs.
arXiv Detail & Related papers (2023-10-02T02:01:16Z)
Evolving Reinforcement Learning Algorithms [186.62294652057062]
We propose a method for meta-learning reinforcement learning algorithms. The learned algorithms are domain-agnostic and can generalize to new environments not seen during training. We highlight two learned algorithms which obtain good generalization performance over other classical control tasks, gridworld type tasks, and Atari games.
arXiv Detail & Related papers (2021-01-08T18:55:07Z)
Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem which estimates signals integer from linear models. The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning. We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal estimation for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z)
Learning the Step-size Policy for the Limited-Memory Broyden-Fletcher-Goldfarb-Shanno Algorithm [3.7470451129384825]
We consider the problem of how to learn a step-size policy for the Limited-Memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm. We propose a neural network architecture with local information of the current gradient as the input. The step-length policy is learned from data of similar optimization problems, avoids additional evaluations of the objective function, and guarantees that the output step remains inside a pre-defined interval.
arXiv Detail & Related papers (2020-10-03T09:34:03Z)
PolyDL: Polyhedral Optimizations for Creation of High Performance DL primitives [55.79741270235602]
We present compiler algorithms to automatically generate high performance implementations of Deep Learning primitives. We develop novel data reuse analysis algorithms using the polyhedral model. We also show that such a hybrid compiler plus a minimal library-use approach results in state-of-the-art performance.
arXiv Detail & Related papers (2020-06-02T06:44:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.