Related papers: Reducing Errors in Excel Models with Component-Based Software Engineering

Benchmark Dataset Generation and Evaluation for Excel Formula Repair with LLMs [3.4697197968922566]
Large language models (LLMs) offer promising assistance by explaining formula errors. This paper introduces a novel approach for constructing a benchmark dataset specifically designed for Excel formula repair. Our pipeline integrates few-shot prompting with LLMs and employs a robust LLM-as-a-Judge validation framework.
arXiv Detail & Related papers (2025-08-14T16:43:35Z)
Collaborative LLM Inference via Planning for Efficient Reasoning [50.04696654679751]
We propose a test-time collaboration framework in which a planner model first generates a plan, defined as a distilled and high-level abstraction of the problem. Small and large models take turns acting as planner and reasoner, exchanging plans in a multi-round cascade to collaboratively solve complex tasks. Our method achieves accuracy comparable to strong proprietary models alone, while significantly reducing reliance on paid inference.
arXiv Detail & Related papers (2025-06-13T08:35:50Z)
Synthetic Function Demonstrations Improve Generation in Low-Resource Programming Languages [32.08109313615468]
We present novel approaches to the creation of such data for low resource programming languages. We generate fully-synthetic, textbook-quality demonstrations of common library functions in an example domain of Excel formulas. We show advantages of finetuning over standard, off-the-shelf RAG approaches, which can offer only modest improvement due to the unfamiliar target domain.
arXiv Detail & Related papers (2025-03-24T15:09:03Z)
Enabling Small Models for Zero-Shot Classification through Model Label Learning [50.68074833512999]
We introduce a novel paradigm, Model Label Learning (MLL), which bridges the gap between models and their functionalities. Experiments on seven real-world datasets validate the effectiveness and efficiency of MLL.
arXiv Detail & Related papers (2024-08-21T09:08:26Z)
NL2Formula: Generating Spreadsheet Formulas from Natural Language Queries [29.33149993368329]
This paper introduces a novel benchmark task called NL2Formula. The aim is to generate executable formulas that are grounded on a spreadsheet table, given a Natural Language (NL) query as input. We construct a comprehensive dataset consisting of 70,799 paired NL queries and corresponding spreadsheet formulas, covering 21,670 tables and 37 types of formula functions.
arXiv Detail & Related papers (2024-02-20T05:58:05Z)
InstructExcel: A Benchmark for Natural Language Instruction in Excel [72.018640505825]
This work investigates whether Large Language Models can generate code that solves Excel specific tasks provided via natural language user instructions. Our benchmark includes over 10k samples covering 170+ Excel operations across 2,000 publicly available Excel spreadsheets. We observe that (1) using GPT-4 over GPT-3.5, (2) providing more in-context examples, and (3) dynamic prompting can help improve performance on this benchmark.
arXiv Detail & Related papers (2023-10-23T02:00:55Z)
Excel as a Turing-complete Functional Programming Environment [0.0]
The Excel calculation engine was the subject of a major upgrade to accommodate Dynamic Arrays in 2018. This paper will show the ad-hoc end user practices of traditional spreadsheets can be replaced by radically different approaches. It is too early to guess the extent to which the new functionality will be adopted by the business and engineering communities.
arXiv Detail & Related papers (2023-08-31T20:11:36Z)
FLAME: A small language model for spreadsheet formulas [25.667479554632735]
We present FLAME, a transformer-based model trained exclusively on Excel formulas. We use sketch deduplication, introduce an Excel-specific formula tokenizer, and use domain-specific versions of masked span prediction. We evaluate FLAME on formula repair, formula completion, and similarity-based formula retrieval.
arXiv Detail & Related papers (2023-01-31T17:29:43Z)
Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization [99.6826401545377]
Foundation models are redefining how AI systems are built. Practitioners now follow a standard procedure to build their machine learning solutions. We propose model ratatouille, a new strategy to recycle the multiple fine-tunings of the same foundation model on diverse auxiliary tasks.
arXiv Detail & Related papers (2022-12-20T17:21:46Z)
Spreadsheet computing with Finite Domain Constraint Enhancements [0.0]
We present a framework seamlessly incorporating a finite constraint solver with the spreadsheet computing paradigm. The framework provides an interface for constraint solving and further enhances the spreadsheet computing paradigm.
arXiv Detail & Related papers (2022-02-22T17:50:48Z)
SpreadsheetCoder: Formula Prediction from Semi-structured Context [70.41579328458116]
We propose a BERT-based model architecture to represent the tabular context in both row-based and column-based formats. We train our model on a large dataset of spreadsheets, and demonstrate that SpreadsheetCoder achieves top-1 prediction accuracy of 42.51%. Compared to the rule-based system, SpreadsheetCoder assists 82% more users in composing formulas on Google Sheets.
arXiv Detail & Related papers (2021-06-26T11:26:27Z)
What do we expect from Multiple-choice QA Systems? [70.86513724662302]
We consider a top performing model on several Multiple Choice Question Answering (MCQA) datasets. We evaluate it against a set of expectations one might have from such a model, using a series of zero-information perturbations of the model's inputs.
arXiv Detail & Related papers (2020-11-20T21:27:10Z)
Ensemble Distillation for Robust Model Fusion in Federated Learning [72.61259487233214]
Federated Learning (FL) is a machine learning setting where many devices collaboratively train a machine learning model. In most of the current training schemes the central model is refined by averaging the parameters of the server model and the updated parameters from the client side. We propose ensemble distillation for model fusion, i.e. training the central classifier through unlabeled data on the outputs of the models from the clients.
arXiv Detail & Related papers (2020-06-12T14:49:47Z)
A Structured Approach to the development of Solutions in Excel [0.0]
This paper considers the use of controversial or lesser-used techniques to create a coherent solution strategy. The problem is solved by a sequence of formulas resembling the steps of a programmed language.
arXiv Detail & Related papers (2017-04-04T18:22:26Z)

This list is automatically generated from the titles and abstracts of the papers on this site.