Related papers: LLM Flow Processes for Text-Conditioned Regression

URL: http://arxiv.org/abs/2601.06147v1
Date: Mon, 05 Jan 2026 21:20:38 GMT
Title: LLM Flow Processes for Text-Conditioned Regression
Authors: Felix Biggs, Samuel Willis
Abstract summary: Large Language Models (LLMs) are trained on giant corpora including varied real-world regression datasets alongside descriptions and metadata. Recent work has extended this to regression tasks and is able to leverage such prior knowledge and metadata, achieving surprisingly good performance. Here we introduce a general method for sampling from a product-of-experts of a diffusion or flow matching model and an 'expert' with binned probability density.
Score: 4.196805115026664
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Meta-learning methods for regression like Neural (Diffusion) Processes achieve impressive results, but with these models it can be difficult to incorporate expert prior knowledge and information contained in metadata. Large Language Models (LLMs) are trained on giant corpora including varied real-world regression datasets alongside their descriptions and metadata, leading to impressive performance on a range of downstream tasks. Recent work has extended this to regression tasks and is able to leverage such prior knowledge and metadata, achieving surprisingly good performance, but this still rarely matches dedicated meta-learning methods. Here we introduce a general method for sampling from a product-of-experts of a diffusion or flow matching model and an 'expert' with binned probability density; we apply this to combine neural diffusion processes with LLM token probabilities for regression (which may incorporate textual knowledge), exceeding the empirical performance of either alone.
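Illustration: the product-of-experts idea in the abstract can be pictured in one dimension as multiplying two densities and renormalizing, p(y) ∝ p_model(y) · p_bin(y). The sketch below is only a toy under stated assumptions, not the paper's sampling procedure: a fixed Gaussian stands in for the diffusion or flow matching model, the bin edges and probabilities are hypothetical stand-ins for LLM token probabilities, and the product is evaluated on a grid rather than sampled. All function names are made up for this example.

```python
import math

def gaussian_pdf(y, mean, std):
    """Density of a normal distribution; a stand-in for the continuous expert."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def binned_pdf(y, edges, probs):
    """Piecewise-constant density; probs[i] is the mass of bin [edges[i], edges[i+1])."""
    for i in range(len(probs)):
        if edges[i] <= y < edges[i + 1]:
            return probs[i] / (edges[i + 1] - edges[i])
    return 0.0  # no mass outside the bins

def product_of_experts(grid, mean, std, edges, probs):
    """Normalized product of the two densities on a uniform grid."""
    step = grid[1] - grid[0]
    unnorm = [gaussian_pdf(y, mean, std) * binned_pdf(y, edges, probs) for y in grid]
    total = sum(unnorm) * step  # Riemann-sum normalizing constant
    return [u / total for u in unnorm]

# Hypothetical numbers chosen only for illustration.
edges = [0.0, 1.0, 2.0, 3.0]
probs = [0.1, 0.6, 0.3]
grid = [i * 0.01 + 0.005 for i in range(300)]  # bin midpoints covering [0, 3)
density = product_of_experts(grid, mean=2.0, std=0.5, edges=edges, probs=probs)

step = grid[1] - grid[0]
poe_mean = sum(y * d for y, d in zip(grid, density)) * step
print(f"product-of-experts mean: {poe_mean:.3f}")
```

The combined density is concentrated where both experts place mass, which is why a grid evaluation is enough for a one-dimensional picture; sampling from such a product in the setting the abstract describes is the harder problem the paper addresses.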

Related papers

What Language Models Know But Don't Say: Non-Generative Prior Extraction for Generalization [5.663538370244175]
We propose LoID, a deterministic method for extracting informative prior distributions for Bayesian logistic regression. Rather than relying on generated text, we probe the model's confidence in opposing semantic directions through carefully constructed sentences. We evaluate LoID on ten real-world datasets under synthetic out-of-distribution (OOD) settings.
arXiv Detail & Related papers (2026-01-24T22:05:01Z)
Large Language Models as Universal Predictors? An Empirical Study on Small Tabular Datasets [0.0]
Large Language Models (LLMs) can perform predictive tasks over structured inputs without explicit fine-tuning on downstream tasks. We investigate the empirical function approximation capability of LLMs on small-scale structured datasets for classification, regression and clustering tasks. Our findings suggest that LLMs can serve as general-purpose predictive engines for structured data, with clear strengths in classification and significant limitations in regression and clustering.
arXiv Detail & Related papers (2025-08-24T15:00:51Z)
Method-Based Reasoning for Large Language Models: Extraction, Reuse, and Continuous Improvement [0.3807314298073301]
We propose a method-based model that enhances large language models (LLMs) with explicit, reusable procedures extracted from training content, generated responses, and user interactions. Our model enables continual learning, method reuse, and logical consistency beyond next-token prediction.
arXiv Detail & Related papers (2025-08-06T10:26:52Z)
DUET: Optimizing Training Data Mixtures via Feedback from Unseen Evaluation Tasks [40.91931801667421]
Our paper presents a novel global-to-local algorithm that interleaves influence function as a data selection method with Bayesian optimization to optimize data mixture via feedback from a specific unseen evaluation task. By analyzing DUET's cumulative regret, we theoretically show that DUET converges to the optimal training data mixture for an unseen task even without any data knowledge of the task.
arXiv Detail & Related papers (2025-02-01T01:52:32Z)
Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data. We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
arXiv Detail & Related papers (2024-10-22T06:43:28Z)
FREE: Faster and Better Data-Free Meta-Learning [77.90126669914324]
Data-Free Meta-Learning (DFML) aims to extract knowledge from a collection of pre-trained models without requiring the original data. We introduce the Faster and Better Data-Free Meta-Learning framework, which contains: (i) a meta-generator for rapidly recovering training tasks from pre-trained models; and (ii) a meta-learner for generalizing to new unseen tasks.
arXiv Detail & Related papers (2024-05-02T03:43:19Z)
In-Context Symbolic Regression: Leveraging Large Language Models for Function Discovery [5.2387832710686695]
In this work, we introduce the first comprehensive framework that utilizes Large Language Models (LLMs) for the task of Symbolic Regression. We propose In-Context Symbolic Regression (ICSR), an SR method which iteratively refines a functional form with an LLM and determines its coefficients with an external optimizer. Our findings reveal that LLMs are able to successfully find symbolic equations that fit the given data, matching or outperforming the overall performance of the best SR baselines on four popular benchmarks.
arXiv Detail & Related papers (2024-04-29T20:19:25Z)
Take the Bull by the Horns: Hard Sample-Reweighted Continual Training Improves LLM Generalization [165.98557106089777]
A key challenge is to enhance the capabilities of large language models (LLMs) amid a looming shortage of high-quality training data. Our study starts from an empirical strategy for the light continual training of LLMs using their original pre-training data sets. We then formalize this strategy into a principled framework of Instance-Reweighted Distributionally Robust Optimization.
arXiv Detail & Related papers (2024-02-22T04:10:57Z)
Noisy Self-Training with Synthetic Queries for Dense Retrieval [49.49928764695172]
We introduce a novel noisy self-training framework combined with synthetic queries. Experimental results show that our method improves consistently over existing methods. Our method is data efficient and outperforms competitive baselines.
arXiv Detail & Related papers (2023-11-27T06:19:50Z)
Amortizing intractable inference in large language models [56.92471123778389]
We use amortized Bayesian inference to sample from intractable posterior distributions. We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training. As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem.
arXiv Detail & Related papers (2023-10-06T16:36:08Z)
Generative Meta-Learning for Zero-Shot Relation Triplet Extraction [20.556880137419064]
Zero-shot Relation Triplet Extraction (ZeroRTE) aims to extract relation triplets from texts containing unseen relation types. Existing approaches typically leverage the knowledge embedded in pre-trained language models to accomplish the generalization process. We propose a generative meta-learning framework which exploits the 'learning-to-learn' ability of meta-learning to boost the generalization capability of generative models.
arXiv Detail & Related papers (2023-05-03T06:34:39Z)
Improving Meta-learning for Low-resource Text Classification and Generation via Memory Imitation [87.98063273826702]
We propose a memory imitation meta-learning (MemIML) method that enhances the model's reliance on support sets for task adaptation. A theoretical analysis is provided to prove the effectiveness of our method.
arXiv Detail & Related papers (2022-03-22T12:41:55Z)

This list is automatically generated from the titles and abstracts of the papers on this site.